76
edits
Changes
m
no edit summary
=== Compatibility ===
Spark can run as a standalone application or on top of Hadoop YARN or Apache Mesos. Spark supports data sources that implement Hadoop input format, so it can integrate with all the same data sources and file formats that Hadoop supports.
[[File:Hadoop-vs-spark.png|upright=2|right||300px]]
=== Data Processing ===
In addition to plain data processing, Spark can also process graphs, and it also has the MLlib machine learning library. Due to its high performance, Spark can do both real-time and batch processing. However, Hadoop MapReduce is great only for batch processing.