When do YARN and NameNode interact


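NameNode: The master of HDFS. It stores the metadata for all the data held on the DataNodes (the file namespace and the mapping of blocks to DataNodes) and monitors the health of the DataNodes through their heartbeats. HDFS is a master-slave architecture, with the NameNode as the master and the DataNodes as the slaves.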
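To make the two NameNode duties concrete, here is a minimal toy model in Python. It is only a sketch: `ToyNameNode`, its method names and the timeout value are hypothetical and are not part of any Hadoop API. It only illustrates that the NameNode keeps metadata (which blocks make up a file and where they live) and uses heartbeats to decide which DataNodes are alive.

```python
class ToyNameNode:
    """Hypothetical stand-in for the NameNode: it holds metadata only, never file contents."""

    def __init__(self, dead_after_s):
        self.dead_after_s = dead_after_s  # assumed timeout, chosen for illustration
        self.block_map = {}               # path -> list of (block_id, [datanode, ...])
        self.last_heartbeat = {}          # datanode -> time of last heartbeat

    def add_file(self, path, blocks):
        self.block_map[path] = blocks

    def heartbeat(self, datanode, now):
        self.last_heartbeat[datanode] = now

    def live_datanodes(self, now):
        # A DataNode counts as alive if it reported within the timeout window.
        return sorted(dn for dn, t in self.last_heartbeat.items()
                      if now - t <= self.dead_after_s)

    def block_locations(self, path, now):
        # Return each block with only the replicas on DataNodes that are still alive.
        live = set(self.live_datanodes(now))
        return [(block_id, [dn for dn in dns if dn in live])
                for block_id, dns in self.block_map[path]]


nn = ToyNameNode(dead_after_s=30.0)
nn.add_file("/data/logs.txt", [("blk_1", ["dn1", "dn2"]), ("blk_2", ["dn2", "dn3"])])
for dn in ("dn1", "dn2", "dn3"):
    nn.heartbeat(dn, now=0.0)
nn.heartbeat("dn1", now=40.0)
nn.heartbeat("dn2", now=40.0)          # dn3 has gone silent
print(nn.block_locations("/data/logs.txt", now=50.0))
```

In this toy run, dn3 stops sending heartbeats, so it drops out of the replica list for blk_2 while the file's metadata itself is unchanged.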

YARN: It stands for Yet Another Resource Negotiator. Its Resource Manager has two main components:

1.> Scheduler

2.> Applications Manager

YARN is also master-slave: the master is the Resource Manager, and the slaves are the Node Managers, one on each worker node.

For scheduling, there are 3 schedulers to choose from:

1.> FIFO 2.> Capacity 3.> Fair
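The difference between the three policies is easiest to see on a tiny example. The functions below are a simplified sketch, not the real YARN schedulers: they count whole containers only, ignore memory and vcores, queue hierarchies, elasticity and preemption, and all names (`fifo_allocate`, `capacity_allocate`, `fair_allocate`) are made up for illustration.

```python
def fifo_allocate(capacity, demands):
    """FIFO: serve applications strictly in submission order."""
    allocation, remaining = {}, capacity
    for app, want in demands:              # demands is in arrival order
        granted = min(want, remaining)
        allocation[app] = granted
        remaining -= granted
    return allocation


def capacity_allocate(capacity, queues):
    """Capacity: each queue is capped at its configured percentage of the cluster."""
    return {name: min(want, capacity * percent // 100)
            for name, (percent, want) in queues.items()}


def fair_allocate(capacity, demands):
    """Fair: hand out one container at a time, round-robin, until nothing is left."""
    wants = dict(demands)
    allocation = {app: 0 for app in wants}
    remaining = capacity
    while remaining > 0:
        progressed = False
        for app in allocation:
            if remaining > 0 and allocation[app] < wants[app]:
                allocation[app] += 1
                remaining -= 1
                progressed = True
        if not progressed:                 # every demand is already satisfied
            break
    return allocation


demands = [("etl", 8), ("adhoc", 3), ("report", 5)]
print(fifo_allocate(10, demands))          # {'etl': 8, 'adhoc': 2, 'report': 0}
print(fair_allocate(10, demands))          # {'etl': 4, 'adhoc': 3, 'report': 3}
print(capacity_allocate(10, {"prod": (70, 9), "dev": (30, 5)}))  # {'prod': 7, 'dev': 3}
```

With 10 containers, FIFO lets the first application starve the last one, the fair policy spreads containers evenly, and the capacity policy keeps each queue inside its configured share.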

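There is a component called the Application Master. The Resource Manager allocates a container on one of the Node Managers, and the Application Master is launched inside that container.

One Application Master is assigned to each application.

The job is submitted directly by the client to the Resource Manager, which hands the job over to its Application Master. The Node Manager monitors the container in which the Application Master runs, and the Application Master sends heartbeats to the Resource Manager so that a failed Application Master can be detected.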

Now, whenever a job comes in, the Resource Manager creates an application ID and starts an Application Master for that job. The Resource Manager itself does not retrieve the metadata of the input data from the NameNode. For a MapReduce job, the client contacts the NameNode at submission time to compute the input splits, and the Application Master then uses that split information (including the hosts that hold each block) to request containers close to the data on which the tasks have to run.

This is a basic overview of how YARN works with the NameNode. You can also read about it in detail in the YARN documentation.

Also, the NameNode interaction comes only from the Hadoop applications running within YARN that talk to the NameNode. Not all YARN applications need to communicate with HDFS.
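A small toy model may help show who actually calls the NameNode. Everything below is hypothetical (`ToyNameNode`, `ToyResourceManager`, `ToyApplicationMaster` and their methods are not Hadoop classes), and it is simplified: in real MapReduce the client computes the input splits and the Application Master reads them back, whereas here the Application Master asks for block locations directly. The point it sketches is that the application looks up the data, and the Resource Manager only hands out containers.

```python
class ToyNameNode:
    def __init__(self, block_map):
        self.block_map = block_map   # path -> list of (block_id, [hosts])
        self.callers = []            # records who asked for metadata

    def get_block_locations(self, caller, path):
        self.callers.append(caller)
        return self.block_map[path]


class ToyResourceManager:
    def __init__(self, free_containers):
        self.free = dict(free_containers)   # node -> free container count
        self.next_id = 1

    def submit_application(self, name):
        app_id = f"application_{self.next_id:04d}"
        self.next_id += 1
        return app_id

    def allocate(self, preferred_nodes):
        """Grant one container, preferring the nodes that hold the data."""
        candidates = list(preferred_nodes) + list(self.free)
        for node in candidates:
            if self.free.get(node, 0) > 0:
                self.free[node] -= 1
                return node
        return None                          # cluster is full


class ToyApplicationMaster:
    def __init__(self, app_id, rm, namenode):
        self.app_id, self.rm, self.namenode = app_id, rm, namenode

    def run(self, input_path):
        # The application, not the Resource Manager, talks to the NameNode.
        blocks = self.namenode.get_block_locations("application-master", input_path)
        return {block_id: self.rm.allocate(hosts) for block_id, hosts in blocks}


nn = ToyNameNode({"/data/in.txt": [("blk_1", ["node1", "node2"]), ("blk_2", ["node3"])]})
rm = ToyResourceManager({"node1": 1, "node2": 1, "node3": 0})
am = ToyApplicationMaster(rm.submit_application("wordcount"), rm, nn)
print(am.run("/data/in.txt"))   # {'blk_1': 'node1', 'blk_2': 'node2'}
print(nn.callers)               # ['application-master']
```

In the toy run, blk_2 cannot be placed on node3 because that node has no free container, so the request falls back to another node. A YARN application that never reads from HDFS would simply skip the `get_block_locations` call.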


Basically, there is no direct interaction between YARN and HDFS; see https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html

However, YARN jobs require some files (libraries, configuration, etc.), which usually reside on HDFS.