GPU621/Apache Spark Fall 2022

379 bytes added, 01:30, 7 December 2022
===5. GraphX===
GraphX is a distributed graph processing framework built on Spark. It provides a set of APIs for expressing graph computations, can emulate the Pregel abstraction, and provides an optimized runtime for that abstraction.
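
To make the Pregel abstraction more concrete, here is a minimal plain-Python sketch of its vertex-centric, superstep-based model, applied to single-source shortest paths. It does not use the GraphX API and is not distributed. The function name `pregel_sssp`, the edge-tuple format, and the message-passing details are assumptions made for illustration, and the loop assumes non-negative edge weights so that it terminates.

```python
import math


def pregel_sssp(edges, source):
    """Single-source shortest paths using a Pregel-style superstep loop.

    edges  -- iterable of (src, dst, weight) tuples (hypothetical format)
    source -- the vertex from which distances are measured
    Assumes non-negative weights; a negative cycle would never converge.
    """
    # Collect the vertex set and group outgoing edges by source vertex.
    vertices = {source}
    out_edges = {}
    for src, dst, weight in edges:
        vertices.update((src, dst))
        out_edges.setdefault(src, []).append((dst, weight))

    # Vertex state: best known distance, initially "unreachable".
    dist = {v: math.inf for v in vertices}

    # Messages to be delivered in the next superstep.
    inbox = {source: [0.0]}

    # Each loop iteration is one superstep; stop when no messages remain.
    while inbox:
        outbox = {}
        for vertex, messages in inbox.items():
            best = min(messages)  # merge incoming messages
            if best < dist[vertex]:
                dist[vertex] = best  # vertex program: keep the shorter path
                # Notify neighbours of the improved distance.
                for dst, weight in out_edges.get(vertex, []):
                    outbox.setdefault(dst, []).append(best + weight)
        inbox = outbox
    return dist
```

In this sketch a vertex only does work when it receives a message, and the computation halts once a superstep produces no new messages, which is the core idea that the Pregel model expresses.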
 
==Spark Application==
 
===1: Complex batch processing (Batch Data Processing)===
This type of processing prioritizes the ability to handle massive volumes of data over processing speed, so a typical job takes anywhere from several minutes to several hours. Hadoop's MapReduce computing model targets the same kind of workload.
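
As a rough illustration of the MapReduce computing method mentioned above, the following plain-Python sketch runs a word count through explicit map, shuffle, and reduce phases on a single machine. It uses neither Hadoop nor Spark. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence over an in-memory batch of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

In a real batch system the input would be split across many machines and each phase would run in parallel, which is why throughput matters more than latency for this kind of job.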
==Apache Spark Core API==