What is a container in YARN?

It represents a resource (memory) on a single node at a given cluster.
A container is

supervised by the node manager
scheduled by the resource manager

One MR task runs in such container(s).

There can be multiple containers on a single Node (or a single very big one).

Every node in the system is considered to be composed of multiple containers of minimum size of memory (say 512MB or 1 GB). The ApplicationMaster can request any container as a multiple of the minimum memory size.

Source, see section ResourceManager/Resource Model.

hadoop mapreduce hadoop-yarn

In Hadoop 2.x, Container is a place where a unit of work occurs. For instance each MapReduce task(not the entire job) runs in one container.

An application/job will run on one or more containers.

Set of system resources are allocated for each container, currently CPU core and RAM are supported. Each node in a Hadoop cluster can run several containers.

In Hadoop 1.x a slot is allocated by the JobTracker to run each MapReduce task. Then the TaskTracker spawns a separate JVM for each task(unless JVM reuse is not enabled).

CodeHunter

What is a container in YARN?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last