CDOT Wiki β

GPU621/Apache Spark Fall 2022

27 bytes added, 22:16, 3 December 2022
Useful Case
//split each line into individual words
JavaRDD<String> wordsRDD = removeBlankLineRDD.flatMap(sentence -> Arrays.asList(sentence.split(" ")).iterator());
//create pairRDD
JavaPairRDD<String, Long> pairRDD = wordsRDD.mapToPair(word -> new Tuple2<>(word, 1L));
//get frequency
//swap key and value so the frequency becomes the key
JavaPairRDD<Long, String> reversedMap = totals.mapToPair(t -> new Tuple2<>(t._2, t._1));
//sort the RDD by frequency in descending order
List<Tuple2<Long, String>> results = reversedMap.sortByKey(false).collect();
//print the results
results.forEach(t -> System.out.println(t));
//close
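
The steps in this listing (drop blank lines, split into words, count each word, swap so the frequency is the key, sort descending) can be checked locally without a Spark context. The following is a minimal sketch using plain java.util.stream, not the Spark API. The class name WordFrequencySketch and the method countAndSort are hypothetical names chosen for illustration, and the alphabetical tie-break is an addition that sortByKey(false) does not guarantee.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical local analogue of the RDD word-count pipeline (no Spark involved).
public class WordFrequencySketch {

    static List<Map.Entry<Long, String>> countAndSort(List<String> lines) {
        // Analogue of filter + flatMap + mapToPair + per-word summing:
        // drop blank lines, split on spaces, count occurrences per word.
        Map<String, Long> totals = lines.stream()
                .filter(line -> !line.trim().isEmpty())
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Analogue of the reversed pair RDD: frequency becomes the key.
        List<Map.Entry<Long, String>> reversed = new ArrayList<>();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            reversed.add(new SimpleEntry<>(e.getValue(), e.getKey()));
        }

        // Analogue of sortByKey(false): highest frequency first.
        // Assumption: ties are broken alphabetically to keep the output deterministic.
        Comparator<Map.Entry<Long, String>> byFrequencyDesc = (a, b) -> {
            int c = b.getKey().compareTo(a.getKey());
            return c != 0 ? c : a.getValue().compareTo(b.getValue());
        };
        reversed.sort(byFrequencyDesc);
        return reversed;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("spark is fast", "", "spark is easy");
        countAndSort(lines).forEach(System.out::println);
    }
}
```

On a real cluster the counting and sorting are distributed across partitions; this sketch only mirrors what each transformation computes on a small in-memory list.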