Changes

GPU621/Apache Spark

272 bytes added, 11:46, 30 November 2020

→‎Architecture

== Architecture ==

Hadoop has a master-slave architecture, as shown in figure 1.1. A small Hadoop cluster consists of a single master node and multiple worker nodes. The master node runs a JobTracker, a TaskTracker, a NameNode, and a DataNode. Each worker node acts as both a TaskTracker and a DataNode. A file on HDFS is split into multiple blocks, and each block is replicated within the Hadoop cluster. The NameNode is the master server, while the DataNodes store and maintain the blocks. The DataNodes are responsible for retrieving blocks when requested by the NameNode. They also create, delete, and replicate blocks on instruction from the NameNode.
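The block splitting and replication described above can be sketched as a toy model. This is a minimal illustration, not Hadoop's actual implementation: the class names `NameNode` and `DataNode` mirror the roles in the text, but the methods `add_file`, `store`, and `delete` are hypothetical, and the round-robin placement is a simplifying assumption (real HDFS uses a rack-aware placement policy). The 128 MiB block size and replication factor of 3 are assumed here because they are common HDFS defaults.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (128 MiB)
REPLICATION = 3                 # assumed replication factor


@dataclass
class DataNode:
    """Stores blocks; here only the block length is tracked, not the data."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> length in bytes

    def store(self, block_id, length):
        self.blocks[block_id] = length

    def delete(self, block_id):
        self.blocks.pop(block_id, None)


@dataclass
class NameNode:
    """Keeps the metadata: which DataNodes hold each block."""
    datanodes: list
    block_map: dict = field(default_factory=dict)  # block_id -> DataNode names

    def add_file(self, path, size, block_size=BLOCK_SIZE, replication=REPLICATION):
        if replication > len(self.datanodes):
            raise ValueError("not enough DataNodes for the requested replication")
        block_ids = []
        num_blocks = -(-size // block_size)  # ceiling division
        for i in range(num_blocks):
            block_id = f"{path}#blk_{i}"
            # The last block may be shorter than block_size.
            length = min(block_size, size - i * block_size)
            # Toy round-robin placement (assumption, not the HDFS policy).
            targets = [
                self.datanodes[(i + r) % len(self.datanodes)]
                for r in range(replication)
            ]
            for dn in targets:
                dn.store(block_id, length)
            self.block_map[block_id] = [dn.name for dn in targets]
            block_ids.append(block_id)
        return block_ids
```

For example, with four DataNodes, calling `add_file` on a 300 MiB file yields three blocks (two full blocks and one 44 MiB block), each recorded by the NameNode as living on three different DataNodes.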

Abalachandran7

