Hadoop has three core components:
Hadoop HDFS -
Hadoop Distributed File System (HDFS) is the storage unit of Hadoop.
HDFS splits files into blocks and distributes them across the nodes of a large cluster.
It is highly fault-tolerant and is designed to be deployed on low-cost hardware.
It provides high-throughput access to application data and is suitable for applications that work with large datasets.
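The block splitting described above can be sketched in a few lines of Python. This is only a toy model, not HDFS code: the 128 MB block size and replication factor of 3 match the usual defaults in Hadoop 2 and later, but the function names and the round-robin placement are assumptions made for illustration (real HDFS uses a rack-aware placement policy).

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the usual default block size in Hadoop 2+
REPLICATION = 3                 # the usual default replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block a file would be cut into."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes.
        sizes.append(remainder)
    return sizes


def place_blocks(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes.

    Hypothetical round-robin placement, used here only to show that
    every block lives on several nodes at once.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + replica) % len(nodes)]
            for replica in range(replication)
        ]
    return placement
```

For example, a 300 MB file becomes two full 128 MB blocks plus one 44 MB block, and because each block is stored on three different nodes, losing a single node does not lose any data.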
Hadoop MapReduce -
Hadoop MapReduce is the processing unit of Hadoop.
MapReduce is a computational model and software framework for writing applications that are run on Hadoop.
MapReduce programs can process enormous amounts of data in parallel on large clusters of compute nodes.
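To make the computational model concrete, here is a minimal word-count sketch in plain Python. It does not use the Hadoop API; it only imitates the three stages (map, shuffle, reduce) in a single process, and all function names are made up for this example. On a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages one after another on a list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["Hadoop stores data", "Hadoop processes data"])` counts `hadoop` and `data` twice and the other words once. Since each line is mapped independently and each key is reduced independently, both stages can be spread over many machines.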
Hadoop YARN -
Hadoop YARN is the resource management unit of Hadoop.
This is a framework for job scheduling and cluster resource management.
YARN opens up Hadoop by allowing batch processing, stream processing, interactive processing and graph processing to run on data stored in HDFS.
It lets distributed applications other than MapReduce run on the same cluster.
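The idea of one resource manager serving several kinds of applications can be illustrated with a small Python sketch. This is not the YARN API: the `Node` and `ContainerRequest` classes and the `schedule_first_fit` function are hypothetical, and the first-fit rule is a simplification (the schedulers that ship with YARN, such as the Capacity Scheduler and Fair Scheduler, are far more involved). What it does borrow from YARN is the notion of a container, a slice of memory and virtual cores on one node.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_memory_mb: int
    free_vcores: int


@dataclass
class ContainerRequest:
    app: str
    memory_mb: int
    vcores: int


def schedule_first_fit(nodes, requests):
    """Grant each request, in arrival order, on the first node with room.

    Returns (granted, pending): granted is a list of (app, node name)
    pairs, pending lists the apps whose request did not fit anywhere.
    """
    granted = []
    pending = []
    for request in requests:
        for node in nodes:
            fits = (node.free_memory_mb >= request.memory_mb
                    and node.free_vcores >= request.vcores)
            if fits:
                # Reserve the resources on this node for the container.
                node.free_memory_mb -= request.memory_mb
                node.free_vcores -= request.vcores
                granted.append((request.app, node.name))
                break
        else:
            # No node had enough free memory and vcores.
            pending.append(request.app)
    return granted, pending
```

The scheduler never looks at what kind of application is asking. A MapReduce job, a streaming job and a graph job are all just container requests, which is why different processing engines can share one cluster.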