Changes

GPU621/Apache Spark

36 bytes added, 09:52, 30 November 2020

→‎|300px vs |200px

# Daniel Park

= Apache Spark vs Apache Hadoop =

[[File:Apache Spark logo.svg.png|200px]] '''vs''' [[File:Hadooplogo.png|300px]]

'''MapReduce''' was famously used by Google to process massive data sets in parallel on a distributed cluster in order to index the web for accurate and efficient search results. '''Apache Hadoop''', the open-source platform inspired by Google’s early proprietary technology, has been one of the most popular big data processing frameworks. However, in recent years its usage has been declining in favor of other increasingly popular technologies, namely '''Apache Spark'''.

This project will demonstrate how a particular use case performs in Apache Hadoop versus Apache Spark, and how this relates to the rising adoption of Spark and the waning adoption of Hadoop. It will also compare the advantages of each framework for certain big data applications.

= Apache Hadoop =

Abalachandran7

