MapReduce job fails with "Unable to initialize any output collector"


Article ID: 294885


Updated On:

Products

Services Suite

Issue/Introduction

Symptoms:

The MapReduce job or Hive query fails with the following error message: 

2015-04-24 11:41:41,861 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Unable to initialize any output collector

	at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)

	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:439)

	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)

Environment


Cause

The error message "Unable to initialize any output collector" is generic: it indicates that a map task could not initialize its map output collector, but it does not say why. To identify the underlying cause, review the container logs stored in HDFS (Hadoop Distributed File System).


In this specific instance, mapreduce.task.io.sort.mb is set to 2048 MB, but the maximum allowed value is 2047 MB. Any value above the maximum is rejected as invalid, and the job fails.
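
The 2047 MB ceiling can be illustrated with a small standalone sketch. It assumes the map task validates the setting with an 11-bit mask (0x7FF = 2047) and then converts megabytes to bytes in 32-bit integer arithmetic, which is how the limit appears to work in Hadoop 2.x; the class and method names below are hypothetical and are not part of Hadoop.

```java
public class SortMbCheck {
    // Assumption: the value is accepted only if it fits in 11 bits (0..2047).
    public static boolean isValid(int sortMb) {
        return (sortMb & 0x7FF) == sortMb;
    }

    // Megabytes converted to bytes using a 32-bit int, as a Java int field would hold it.
    public static int bufferBytes(int sortMb) {
        return sortMb << 20;
    }

    public static void main(String[] args) {
        for (int mb : new int[] {100, 2047, 2048}) {
            System.out.println(mb + " MB -> valid=" + isValid(mb)
                    + ", bytes as int=" + bufferBytes(mb));
        }
    }
}
```

Under these assumptions, 2047 MB still fits in a signed 32-bit byte count, while 2048 MB is exactly 2^31 bytes and overflows to a negative number, which is why 2048 is reported as invalid.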


The container logs reveal the error below:

2015-04-24 11:41:41,858 WARN [main] org.apache.hadoop.mapred.MapTask: Unable to initialize MapOutputCollector org.apache.hadoop.mapred.MapTask$MapOutputBuffer

java.io.IOException: Invalid "mapreduce.task.io.sort.mb": 2048

	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:975)

	at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:401)
 

Resolution

Set the value of the parameter mapreduce.task.io.sort.mb to 2047 MB or less.
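
As a rough sketch of how a corrected value might be applied, the helper below clamps a requested size to the allowed range and prints two common override forms: a -D generic option (honored only when the job driver parses generic options, for example through ToolRunner) and a Hive SET statement. The class name, method names, and the lower bound of 1 MB are illustrative assumptions, not part of Hadoop or Hive.

```java
public class SortMbOverride {
    static final String KEY = "mapreduce.task.io.sort.mb";
    static final int MAX_MB = 2047; // maximum stated in this article

    // Clamp a requested value into 1..2047 (the lower bound of 1 is an assumption).
    public static int clamp(int requestedMb) {
        return Math.max(1, Math.min(requestedMb, MAX_MB));
    }

    // Generic-option form for a MapReduce job submitted with "hadoop jar".
    public static String jobOption(int requestedMb) {
        return "-D" + KEY + "=" + clamp(requestedMb);
    }

    // Session-level form for a Hive query.
    public static String hiveStatement(int requestedMb) {
        return "SET " + KEY + "=" + clamp(requestedMb) + ";";
    }

    public static void main(String[] args) {
        System.out.println(jobOption(2048));     // clamped down to 2047
        System.out.println(hiveStatement(1024)); // already within range
    }
}
```

The same property can also be set cluster-wide in mapred-site.xml; whichever method is used, the value must stay at or below 2047.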