Changes

GPU621/Apache Spark

765 bytes added, 17:03, 9 November 2020


'''Description'''

~~Comparing~~ The MapReduce parallel programming model we have covered in this course was arguably made famous by Google, which used it to process massive data sets in parallel and index the web for accurate and efficient search results. Apache Hadoop, the open source platform inspired by Google’s early proprietary technology, has been one of the most popular big data processing frameworks. However, in recent years its usage has been declining in favor of other increasingly popular technologies, namely Apache Spark. We will introduce the history and advantages (scalability, flexibility, resilience) that led to the popularization of Apache Hadoop for certain big data applications. Our project will focus on demonstrating how a particular use case performs in Apache ~~Spark with MapReduce~~ Hadoop versus Apache Spark.

== Group Members ==

DanielPark

76

edits
