IOException Job status not available when mapreduce job exits successfully

Article ID: 294675

Updated On:

Products

Services Suite

Issue/Introduction

Symptoms:

A MapReduce job completes successfully, but the job client throws a java.io.IOException when it queries the current job status.

13/12/26 13:12:58 INFO mapreduce.Job: The url to track the job:http://hdm1.hadoop.local:8088/proxy/application_1388082686190_0004/
13/12/26 13:12:58 INFO danl.WordCount: job is still cranking away...
13/12/26 13:17:58 INFO danl.WordCount: job is still cranking away...
13/12/26 13:22:59 INFO ipc.Client: Retrying connect to server: hdw1.hadoop.local/192.168.3.201:55559. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1 SECONDS)
13/12/26 13:22:59 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
Exception in thread "main" java.io.IOException: Job status not available
    at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:317)
    at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:594)
    at pivotal.eng.danl.WordCount.main(WordCount.java:96)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

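Use case example with the org.apache.hadoop.mapreduce API (the API shown in the stack trace above)

// job is an org.apache.hadoop.mapreduce.Job
job.submit();
while (!job.isComplete()) {
   LOG.info("job is still cranking away...");
   Thread.sleep(interval);
}

Use case example with the older org.apache.hadoop.mapred API

// jobClient is an org.apache.hadoop.mapred.JobClient, conf is a JobConf
RunningJob rj = jobClient.submitJob(conf);
while (!rj.isComplete()) {
   LOG.info("job is still cranking away...");
   Thread.sleep(interval);
}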

In most cases, this problem will be observed intermittently because the exception is only thrown if the job client requests the job status after the application master has exited successfully.

While the MapReduce application is running, the job client gets all status updates directly from the application master. Once the application master has finished, the job client is redirected to the MapReduce history server to collect the final job status, as the following log line shows:

13/12/26 13:22:59 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server 

If the MapReduce history server is not running, or if yarn-site.xml was changed without restarting the history server, the history server cannot find the status of the given JobID for the job client.

In addition, in PHD 1.1.0 yarn-default.xml does not define the following job history parameters. If they are not set explicitly in yarn-site.xml, the MapReduce history server does not pick up these values, and without them the job history service cannot locate the job information:

  • mapreduce.jobhistory.intermediate-done-dir
  • mapreduce.jobhistory.done-dir
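
Because the exception only surfaces on a status poll made after the application master has exited, a client can guard its polling loop against a history server that is briefly unavailable. The following is a minimal sketch, not a fix for the configuration problem described above. JobPoller, CompletionCheck, and waitForCompletion are hypothetical names introduced for illustration; the interface stands in for Job.isComplete() or RunningJob.isComplete(), both of which can throw IOException.

```java
import java.io.IOException;

class JobPoller {

    /** Hypothetical stand-in for Job.isComplete() or RunningJob.isComplete(). */
    @FunctionalInterface
    public interface CompletionCheck {
        boolean isComplete() throws IOException;
    }

    /**
     * Polls until the check reports completion. An IOException from the
     * check is tolerated until maxFailures consecutive polls have failed,
     * at which point the last exception is rethrown.
     */
    public static void waitForCompletion(CompletionCheck check,
                                         long intervalMillis,
                                         int maxFailures)
            throws IOException, InterruptedException {
        int failures = 0;
        while (true) {
            try {
                if (check.isComplete()) {
                    return;
                }
                failures = 0; // a successful poll resets the failure count
            } catch (IOException e) {
                failures++;
                if (failures >= maxFailures) {
                    throw e; // status is still unavailable: surface the error
                }
            }
            Thread.sleep(intervalMillis);
        }
    }
}
```

With a submitted job, the loop from the use case examples could then be written as JobPoller.waitForCompletion(job::isComplete, interval, 3). If the history server is down or misconfigured, the retries will be exhausted and the original IOException is still thrown, so the steps in the Resolution section remain necessary.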

Environment


Resolution

1. Make sure yarn-site.xml and mapred-site.xml are properly configured. Start the mapreduce history service if it is not up. 

service hadoop-mapreduce-historyserver start  

2. Log in to the MapReduce history server web interface and verify the following parameter values.

http://<MR history server IP or hostname>:19888/conf

Parameters to check:

  • mapreduce.jobhistory.intermediate-done-dir
  • mapreduce.jobhistory.done-dir
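
The check in step 2 can also be scripted. The sketch below assumes that the /conf page returns the configuration as Hadoop-style XML (property elements with name and value children); HistoryConfCheck and its methods are hypothetical names introduced for illustration. It reports which of the two parameters are absent or empty.

```java
import java.io.InputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HistoryConfCheck {

    /** The two parameters named in this article. */
    static final String[] REQUIRED = {
        "mapreduce.jobhistory.intermediate-done-dir",
        "mapreduce.jobhistory.done-dir"
    };

    /** Parses Hadoop-style configuration XML into a name-to-value map. */
    public static Map<String, String> parse(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(in);
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0) {
                props.put(names.item(0).getTextContent().trim(),
                          values.item(0).getTextContent().trim());
            }
        }
        return props;
    }

    /** Returns the required parameters that are absent or empty. */
    public static List<String> missing(Map<String, String> props) {
        List<String> out = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.get(key);
            if (value == null || value.isEmpty()) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the /conf URL of the history server (assumed to serve XML)
        try (InputStream in = URI.create(args[0]).toURL().openStream()) {
            List<String> absent = missing(parse(in));
            if (absent.isEmpty()) {
                System.out.println("Both job history directories are set.");
            } else {
                System.out.println("Missing or empty: " + absent);
            }
        }
    }
}
```

Run it with the /conf URL from step 2 as the only argument. An empty result only shows that the parameters are set, not that the directories they name exist or are readable by the history server.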