Changes

GPU621/Apache Spark

2 bytes added, 15:52, 30 November 2020


== Architecture ==

Hadoop has a master-slave architecture, as shown in figure 3.1. A small Hadoop cluster consists of a single master node and multiple worker nodes. The master node runs a Job Tracker, a Task Tracker, a NameNode, and a DataNode. Each worker node acts as both a Task Tracker and a DataNode. A file on HDFS is split into multiple blocks, and each block is replicated across DataNodes in the cluster. The NameNode is the master server: it manages the file system namespace and keeps track of which DataNodes hold each block, while the DataNodes store and maintain the blocks themselves. The DataNodes serve read and write requests for those blocks, and they also perform block creation, deletion, and replication upon instruction from the NameNode.
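The block splitting and replication described above can be illustrated with a small toy model. This is only a sketch and not Hadoop code: the function names `split_into_blocks` and `place_replicas`, the DataNode names, and the round-robin placement are assumptions made for illustration (real HDFS uses a rack-aware placement policy), and the block size and replication factor are example values.

```python
def split_into_blocks(file_size, block_size):
    """Return the size of each block for a file of file_size bytes.

    Every block is block_size bytes except possibly the last one.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    sizes = []
    remaining = file_size
    while remaining > 0:
        size = min(block_size, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


def place_replicas(num_blocks, datanodes, replication):
    """Map each block id to the DataNodes holding a copy of it.

    Hypothetical round-robin placement: block i goes to `replication`
    consecutive DataNodes starting at index i. The resulting dictionary
    plays the role of the metadata the NameNode keeps.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds the number of DataNodes")
    block_map = {}
    for block_id in range(num_blocks):
        block_map[block_id] = [
            datanodes[(block_id + r) % len(datanodes)]
            for r in range(replication)
        ]
    return block_map


if __name__ == "__main__":
    # Example values only: a 300-byte "file", 128-byte blocks, 3 replicas.
    blocks = split_into_blocks(300, 128)
    nodes = ["dn1", "dn2", "dn3", "dn4"]
    for block_id, holders in place_replicas(len(blocks), nodes, 3).items():
        print(block_id, blocks[block_id], holders)
```

In this model a client would ask the NameNode's block map where a block lives and then read it from one of the listed DataNodes. Because every block is held by several DataNodes, losing one node leaves other copies available.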

== Components ==

Abalachandran7

