==Apache Spark Introduction==
[[file: Spark_2022.png|600px]]
Apache Spark is an open-source cluster computing framework started by Matei Zaharia at the University of California, Berkeley's AMPLab in 2009 and open-sourced in 2010 under the BSD license. Spark uses in-memory computing: it can analyze data in memory before that data has been written to disk. Users can load data into cluster memory and query it repeatedly, which makes Spark well suited to iterative workloads such as machine learning algorithms.
==Spark features==