Changes

GPU621/Spark

No change in size, 15:44, 24 November 2016

Spark is a Big Data framework for large-scale data processing. Its API is centred on a data structure called the Resilient Distributed Dataset (RDD): a read-only, fault-tolerant multiset of data items distributed over a cluster of machines. High-level APIs are available for Scala, Java, Python, and R. This tutorial focuses on Python code because of its simplicity and popularity.
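To make the RDD properties above concrete, here is a minimal single-process sketch in plain Python. It is not Spark's API: the class <code>ToyRDD</code> and its methods <code>partition</code> and <code>drop_partition</code> are hypothetical names invented for illustration, and the "cluster" is just a list of partitions in one process. It only aims to show the three ideas from the paragraph: partitions are read-only, duplicates are allowed (a multiset), and a lost partition can be rebuilt from the recorded transformation rather than from a stored copy.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, source=None, parent=None, fn=None):
        # Source data is frozen into tuples so partitions are read-only.
        self._source = None if source is None else tuple(tuple(p) for p in source)
        self._parent = parent   # lineage: the dataset this one derives from
        self._fn = fn           # lineage: the function applied to each item
        self._cache = {}        # partitions computed so far

    @property
    def num_partitions(self):
        if self._source is not None:
            return len(self._source)
        return self._parent.num_partitions

    def map(self, fn):
        # A transformation returns a new dataset; nothing is modified in place.
        return ToyRDD(parent=self, fn=fn)

    def partition(self, i):
        # Compute partition i on demand, from source data or from lineage.
        if i not in self._cache:
            if self._source is not None:
                self._cache[i] = self._source[i]
            else:
                self._cache[i] = tuple(self._fn(x) for x in self._parent.partition(i))
        return self._cache[i]

    def drop_partition(self, i):
        # Simulate losing the machine that held partition i.
        self._cache.pop(i, None)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


numbers = ToyRDD(source=[[1, 2], [3, 4], [2, 2]])  # duplicates are kept
squares = numbers.map(lambda x: x * x)

first = squares.collect()
squares.drop_partition(1)      # "lose" one partition
second = squares.collect()     # rebuilt from lineage, same result
```

In PySpark itself, the comparable calls would be <code>SparkContext.parallelize</code>, <code>RDD.map</code>, and <code>RDD.collect</code>; the sketch above only mimics their behaviour at a very small scale.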

=== History ===

Spark was developed in 2009 at UC Berkeley's AMPLab. It was open sourced in 2010 under the BSD license. As of this writing (November 2016), the latest version is 2.0.2.

Nascherman

CDOT Wiki β