Difference between `yarn.scheduler.maximum-allocation-mb` and `yarn.nodemanager.resource.memory-mb`?


Consider a scenario where you are setting up a cluster in which each machine has 48 GB of RAM. Some of this RAM should be reserved for the operating system and other installed applications.


yarn.nodemanager.resource.memory-mb:

Amount of physical memory, in MB, that can be allocated for containers. This is the amount of memory YARN can use on this node, so this property should be lower than the total memory of the machine.

<name>yarn.nodemanager.resource.memory-mb</name><value>40960</value> <!-- 40 GB -->
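The 40960 figure follows from the 48 GB machine once some RAM is set aside. As a minimal sketch of that arithmetic, assuming an 8 GB reservation (inferred here from 48 GB minus the 40 GB given to YARN; the helper name is hypothetical and not part of any Hadoop API):

```python
def nodemanager_memory_mb(total_ram_gb: int, reserved_gb: int) -> int:
    """Return the memory (in MB) left for YARN containers on one node.

    reserved_gb is the RAM kept back for the operating system and other
    installed applications; it is a sizing choice, not a YARN setting.
    """
    if reserved_gb < 0 or reserved_gb >= total_ram_gb:
        raise ValueError("reservation must be between 0 and the total RAM")
    return (total_ram_gb - reserved_gb) * 1024


# 48 GB machine with an assumed 8 GB reservation leaves 40 GB for YARN.
print(nodemanager_memory_mb(48, 8))  # 40960
```

The right reservation depends on what else runs on the node, so treat 8 GB as an example rather than a recommendation.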

The next step is to provide YARN guidance on how to break up the total resources available into Containers. You do this by specifying the minimum unit of RAM to allocate for a Container.

In yarn-site.xml

<name>yarn.scheduler.minimum-allocation-mb</name><value>2048</value> <!-- RAM per container -->

yarn.scheduler.maximum-allocation-mb:

It defines the maximum memory allocation available for a container, in MB.

This means the RM can only allocate memory to containers in increments of "yarn.scheduler.minimum-allocation-mb" and cannot exceed "yarn.scheduler.maximum-allocation-mb". The maximum should not be more than the total memory allocated to the node (yarn.nodemanager.resource.memory-mb).

In yarn-site.xml

<name>yarn.scheduler.maximum-allocation-mb</name><value>8192</value> <!-- Max RAM per container -->
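The relationship between the three settings can be written down as a quick sanity check. This is only an illustrative sketch: `check_memory_settings` is a hypothetical helper, not something YARN provides, and it encodes just the constraints stated in this answer (minimum no larger than maximum, maximum no larger than the node's container memory).

```python
def check_memory_settings(node_mb: int, min_mb: int, max_mb: int) -> list:
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    if min_mb <= 0:
        problems.append("minimum-allocation-mb must be positive")
    if min_mb > max_mb:
        problems.append("minimum-allocation-mb exceeds maximum-allocation-mb")
    if max_mb > node_mb:
        problems.append(
            "maximum-allocation-mb exceeds yarn.nodemanager.resource.memory-mb"
        )
    return problems


# Values used in this answer: 40 GB per node, 2 GB minimum, 8 GB maximum.
print(check_memory_settings(node_mb=40960, min_mb=2048, max_mb=8192))  # []
```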

For MapReduce applications, YARN runs each map or reduce task in a container, and a single machine can host a number of containers. Suppose we want to allow a maximum of 20 containers on each node. We then need (40 GB total RAM) / (20 containers) = 2 GB minimum per container, which is controlled by the property yarn.scheduler.minimum-allocation-mb.
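The division above, and its counterpart for the largest containers, can be checked with a few lines. This is a rough sketch using the numbers from this answer; the function names are made up for illustration.

```python
def min_allocation_for(node_memory_mb: int, target_containers: int) -> int:
    """Minimum container size (MB) that yields the target container count."""
    return node_memory_mb // target_containers


def containers_per_node(node_memory_mb: int, container_mb: int) -> int:
    """How many containers of the given size fit in the node's YARN memory."""
    return node_memory_mb // container_mb


# 40 GB of container memory split across at most 20 containers.
print(min_allocation_for(40960, 20))     # 2048
# The same node holds 20 minimum-size containers ...
print(containers_per_node(40960, 2048))  # 20
# ... but only 5 containers at the 8192 MB maximum.
print(containers_per_node(40960, 8192))  # 5
```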

We also want to cap the memory a single container can use, which is controlled by the property "yarn.scheduler.maximum-allocation-mb".

For example, if a job asks for 2049 MB of memory per map container (mapreduce.map.memory.mb=2049 set in mapred-site.xml), the RM will give it one 4096 MB container (2 * yarn.scheduler.minimum-allocation-mb), because 2049 MB is rounded up to the next multiple of the minimum allocation.

If you have a huge MR job which asks for a 9999 MB map container, the request exceeds yarn.scheduler.maximum-allocation-mb (8192 MB), and the job will be killed with an error message.
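Both cases can be modeled in a few lines. The sketch below mirrors the behavior described in this answer (round up to a multiple of the minimum, reject anything above the maximum); it is not YARN's actual code, `normalize_request` is a hypothetical name, and the exact rounding rule may differ depending on the scheduler and Hadoop version in use.

```python
def normalize_request(requested_mb: int, min_mb: int = 2048, max_mb: int = 8192) -> int:
    """Return the container size (MB) granted for a memory request.

    Requests above max_mb are rejected; everything else is rounded up
    to the next multiple of min_mb (at least one unit).
    """
    if requested_mb > max_mb:
        raise ValueError(
            f"requested {requested_mb} MB exceeds the maximum allocation of {max_mb} MB"
        )
    units = max(1, (requested_mb + min_mb - 1) // min_mb)  # integer ceiling
    return units * min_mb


print(normalize_request(2048))  # 2048
print(normalize_request(2049))  # 4096

try:
    normalize_request(9999)
except ValueError as err:
    print("rejected:", err)
```

The `ValueError` here only stands in for the failure; the real error raised by the ResourceManager is not reproduced.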