GPU621/Apache Spark Fall 2022

618 bytes added, 20:52, 3 December 2022
Apache Spark Core API
// map from one RDD to another RDD
JavaRDD<Double> mapRDD = javaRDD.map(value -> Math.sqrt(value));
// 3.3166247903554, 4.69041575982343, 5.744562646538029, 6.6332495807108
mapRDD.foreach(value->System.out.println(value));
 
1.2 filter(func)
 
filter() returns a new RDD that contains only the elements satisfying a condition. It takes a predicate function, which is applied to each element and returns true for the elements to keep.
 
//create input data list
List<Integer> inputData = new ArrayList<>();
inputData.add(11);
inputData.add(22);
inputData.add(33);
inputData.add(44);
 
// create RDD
JavaRDD<Integer> javaRDD = sc.parallelize(inputData);
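// use filter; the result keeps the element type of the source RDD (Integer)
JavaRDD<Integer> filterRDD = javaRDD.filter(value -> value >= 30);
// prints 33 and 44 (order may vary across partitions)
filterRDD.foreach(value -> System.out.println(value));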
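
To check a predicate without starting Spark, the same condition can be tried on a plain Java list. The sketch below is only a local analogue built on java.util.stream, not Spark code; the class name FilterSketch and the helper keepMatching are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class FilterSketch {
    // Hypothetical helper: keeps only the elements for which the predicate
    // returns true, mirroring the behaviour of filter() on a local list.
    static List<Integer> keepMatching(List<Integer> input, Predicate<Integer> predicate) {
        return input.stream()
                .filter(predicate)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same input data as the RDD example above
        List<Integer> inputData = Arrays.asList(11, 22, 33, 44);

        // Same predicate as the one passed to filter()
        List<Integer> kept = keepMatching(inputData, value -> value >= 30);

        System.out.println(kept); // prints [33, 44]
    }
}
```

Because the predicate is an ordinary lambda, it can be written and tested this way first and then passed unchanged to javaRDD.filter().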
==Deploy Apache Spark Application On AWS==