GPU621/Apache Spark Fall 2022

3 bytes added, 21:57, 3 December 2022
==Deploy Apache Spark Application On AWS==
 
Amazon EMR is the AWS managed big data platform for petabyte-scale data processing, interactive analytics, and machine learning using open-source frameworks such as Apache Spark, Apache Hive, and Presto. EMR is easy to use and inexpensive, so it is a good starting point for Spark beginners.
===Prerequisite===
From here on, I will assume you have an AWS account and basic knowledge of AWS services, such as how to use an S3 bucket and how to attach a role or policy to a service.
 
You will also need basic knowledge of SSH and Linux commands.
===Create an EMR cluster===
 
Search for EMR in the AWS service panel and select it.
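
If you prefer scripting to clicking through the console, a cluster can also be requested with boto3's EMR client and its <code>run_job_flow</code> call. The following is only a minimal sketch, not the setup used in this walkthrough: the cluster name, bucket, key pair, region, release label, and instance type are all placeholder assumptions, and it assumes the default EMR roles (<code>EMR_DefaultRole</code> and <code>EMR_EC2_DefaultRole</code>) already exist in your account.

```python
def build_cluster_request(name, log_uri, key_name,
                          release_label="emr-6.9.0",   # assumed release; pick one available to you
                          instance_type="m5.xlarge",   # assumed instance type
                          instance_count=3):
    """Build the keyword arguments for boto3's EMR run_job_flow call."""
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # Install Spark on the cluster.
        "Applications": [{"Name": "Spark"}],
        # S3 location where EMR writes cluster logs.
        "LogUri": log_uri,
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # EC2 key pair used later to SSH into the master node.
            "Ec2KeyName": key_name,
            # Keep the cluster running after its steps finish.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default roles; assumed to exist already in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def create_cluster(request, region="us-east-1"):
    """Send the request to EMR and return the new cluster (job flow) ID."""
    import boto3  # imported here so the request builder works without boto3

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**request)
    return response["JobFlowId"]


# Hypothetical values for illustration only.
request = build_cluster_request(
    name="spark-demo",
    log_uri="s3://my-example-bucket/emr-logs/",
    key_name="my-key-pair",
)
# cluster_id = create_cluster(request)  # uncomment to actually create (and pay for) a cluster
```

Building the request separately from sending it lets you inspect the settings before anything is launched. Remember to terminate the cluster when you are done, because it keeps running and billing while idle.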