What is the maximum value for mapreduce.task.io.sort.mb?

What is the maximum value for mapreduce.task.io.sort.mb?


I realize this question is old, but for those asking the same question, you can check out some of the bugs around this value being capped:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_releasenotes_hdp_2.1/content/ch_relnotes-hdpch_relnotes-hdp-2.1.1-knownissues-mapreduce.html

BUG-12005: Mapreduce.task.io.sort.mb is capped at 2047.

Problem: mapreduce.task.io.sort.mb is hardcoded to not allow values larger than 2047. If you enter a value larger than this, the map tasks will always crash at this line:

https://github.com/apache/hadoop-mapreduce/blob/HDFS-641/src/java/org/apache/hadoop/mapred/MapTask.java?source=cc#L746
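The crash at that line comes from a bit-mask validation: the configured value must fit in 11 bits (at most 2047), so that the buffer size in bytes (sortmb shifted left by 20) still fits in a signed 32-bit int. A minimal sketch of that check, with illustrative class and method names rather than Hadoop's actual ones:

```java
// Sketch of the sort.mb cap check (names are illustrative, not Hadoop's).
// A value passes only if it fits in 11 bits, i.e. is between 0 and 2047,
// so that (sortmb << 20) bytes does not overflow a signed int.
public class SortMbCheck {

    static boolean isValidSortMb(int sortmb) {
        // (sortmb & 0x7FF) strips everything above bit 10; if that changes
        // the value, it was larger than 2047 and must be rejected.
        return (sortmb & 0x7FF) == sortmb;
    }

    public static void main(String[] args) {
        System.out.println(isValidSortMb(2047)); // maximum accepted value
        System.out.println(isValidSortMb(2048)); // rejected: needs a 12th bit
    }
}
```

This is why the cap is exactly 2047 rather than a round number: it is the largest value representable in the 11 bits the check allows.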


"mapreduce.task.io.sort.mb" is the total amount of buffer memory to use while sorting files, in megabytes. By default, gives each merge stream 1MB, which should minimize seeks. So you need to ensure you have 100000 MB memory available on the Cluster nodes.