GPU621/ApacheSpark

974 bytes added, 13:36, 26 November 2018

Spark is an advanced data processing and analysis engine that is replacing MapReduce for many workloads.

Spark does not have its own file system, so it typically runs on top of HDFS or another storage system.

[[File:10a.PNG]]

=== Spark vs MapReduce ===

[[File:3.PNG]]

== Features ==

In-memory computations

Faster than MapReduce for complex applications, even when the data is on disk

[[File:2abc.png ]]

== Resilient Distributed Datasets (RDDs) ==

Transformations

Create a new data set from an existing one

[[File:5bc.PNG ]]

Actions

Return a value to the driver program after running a computation on the data set
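The difference between transformations and actions can be sketched in plain Python with a small word count. This is only an illustration under stated assumptions: <code>MiniRDD</code> is a hypothetical, single-machine stand-in written for this page, not the Spark API. Its methods mirror the real RDD operations <code>flatMap</code>, <code>map</code>, <code>reduceByKey</code>, <code>collect</code> and <code>count</code>. The sketch shows that transformations only describe a new data set, and that nothing is computed until an action asks for a value.

```python
from itertools import chain


class MiniRDD:
    """Hypothetical single-machine stand-in for a Spark RDD (not the Spark API)."""

    def __init__(self, compute):
        # compute is a zero-argument function that produces the elements on demand.
        self._compute = compute

    @classmethod
    def from_lines(cls, lines):
        return cls(lambda: iter(lines))

    # Transformations: each returns a new MiniRDD and runs nothing yet.
    def flat_map(self, fn):
        return MiniRDD(lambda: chain.from_iterable(fn(x) for x in self._compute()))

    def map(self, fn):
        return MiniRDD(lambda: (fn(x) for x in self._compute()))

    def reduce_by_key(self, fn):
        def compute():
            totals = {}
            for key, value in self._compute():
                totals[key] = fn(totals[key], value) if key in totals else value
            return iter(totals.items())
        return MiniRDD(compute)

    # Actions: run the whole pipeline and return a value to the caller.
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())


calls = []  # records every time split_words actually runs


def split_words(line):
    calls.append(line)
    return line.split()


lines = MiniRDD.from_lines(["to be or not", "to be"])
counts = (lines.flat_map(split_words)             # line -> words
               .map(lambda word: (word, 1))       # word -> (word, 1)
               .reduce_by_key(lambda a, b: a + b))  # sum the 1s per word

ran_before_action = len(calls)    # still 0: only a plan has been built
result = dict(counts.collect())   # the action triggers the computation
ran_after_action = len(calls)     # 2: one call per input line

print(ran_before_action, ran_after_action, result)
```

In real Spark the same chain would be written against an RDD, and the work would be distributed across the cluster rather than run in one process. An action such as <code>saveAsTextFile</code> would write the pairs to a file instead of returning them to the driver.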

[[File:6.PNG]]

These examples and more are found at https://spark.apache.org/docs/latest/rdd-programming-guide.html

== Examples ==

=== Word Count ===

[[File:4.PNG]]

The word count uses transformations (flatMap, map, reduceByKey) to build a data set of (string, int) pairs, which is then saved to a file.

=== Finance and Stock trading Use Case ===

Imagine that you are working for a financial company and your job is to buy and sell stocks to make money. The decisions you make depend heavily on the predictions calculated by your financial model. In this kind of situation, how long your model takes to make a prediction is critical. Stock prices change very fast: within a couple of seconds, a price can change drastically. If your model cannot provide a near real-time response, you might lose the opportunity to trade your stocks at the best price. Apache Spark can be used to build financial models that make predictions in near real time.

[[File:7ab.png]]

Sathia
