The MapReduce job or Hive query fails with the following error message:
2015-04-24 11:41:41,861 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Unable to initialize any output collector
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:439)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
The error message "Unable to initialize any output collector" indicates that the map task could not set up the collector that buffers and sorts its output. The message itself does not state the cause, so review the container logs on HDFS (Hadoop Distributed File System) to identify it.
In this specific instance, mapreduce.task.io.sort.mb is set to 2048 MB, but the maximum value allowed is 2047 MB. Any value above the maximum is rejected as Invalid, and the job fails.
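The 2047 MB ceiling can be illustrated with a short sketch. It assumes the sort buffer is held in a single array whose length in bytes must fit in a signed 32-bit integer, which is the usual explanation for this limit. The function names is_valid_sort_mb and buffer_bytes are hypothetical, and the bit-mask test only imitates the kind of range check that produces the "Invalid" message; it is not copied from the Hadoop source.

```python
# Sketch only: the helper names are hypothetical and the check is an
# illustration, not Hadoop's own code.

MAX_SORT_MB = 0x7FF        # 2047, the largest value accepted
INT32_MAX = 2**31 - 1      # largest signed 32-bit integer


def is_valid_sort_mb(sort_mb: int) -> bool:
    """Return True when sort_mb lies in the range 0..2047."""
    # Masking with 0x7FF leaves the value unchanged only inside that range.
    return (sort_mb & MAX_SORT_MB) == sort_mb


def buffer_bytes(sort_mb: int) -> int:
    """Convert a size in MB to bytes (1 MB = 2**20 bytes)."""
    return sort_mb << 20


if __name__ == "__main__":
    for value in (1024, 2047, 2048):
        fits = buffer_bytes(value) <= INT32_MAX
        print(value, is_valid_sort_mb(value), buffer_bytes(value), fits)
```

Under these assumptions, 2047 MB is the largest whole number of megabytes whose byte count still fits in 32 bits, while 2048 MB is exactly one byte count too large.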
The container logs reveal the error below:
2015-04-24 11:41:41,858 WARN [main] org.apache.hadoop.mapred.MapTask: Unable to initialize MapOutputCollector org.apache.hadoop.mapred.MapTask$MapOutputBuffer java.io.IOException: Invalid "mapreduce.task.io.sort.mb": 2048
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:975)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:401)
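When container logs are large, it may help to search them for this message automatically. The following is a minimal sketch that assumes the log text has already been saved or read into a string. The function name find_invalid_sort_mb is hypothetical, and the pattern matches only the message format shown in the log excerpt above.

```python
import re

# Matches the message format seen in the container log excerpt.
INVALID_SORT_MB = re.compile(r'Invalid "mapreduce\.task\.io\.sort\.mb": (-?\d+)')


def find_invalid_sort_mb(log_text: str):
    """Return the rejected sort.mb value found in log_text, or None."""
    match = INVALID_SORT_MB.search(log_text)
    return int(match.group(1)) if match else None


# Sample line taken from the log excerpt in this article.
sample = (
    "2015-04-24 11:41:41,858 WARN [main] org.apache.hadoop.mapred.MapTask: "
    "Unable to initialize MapOutputCollector "
    "org.apache.hadoop.mapred.MapTask$MapOutputBuffer "
    'java.io.IOException: Invalid "mapreduce.task.io.sort.mb": 2048'
)

if __name__ == "__main__":
    print(find_invalid_sort_mb(sample))
```

If the function returns a number, that number is the configured value that was rejected, and lowering it to 2047 or less should address this particular cause.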