Error Message:
log4j:ERROR Failed to flush writer,
java.io.IOException: Bad file descriptor
Note: This article is meant for a CCB administrator only.
A CCB batch job gets stuck with the thread pool workers and is unable to proceed. The following messages will be seen in the job report:
Actual:
- 2016-12-15 22:40:09,064 [pool-1-thread-4] INFO (support.cluster.MemberLeftThread) Removing member id: 46 from cache because it left cluster
- 2016-12-15 22:40:09,066 [pool-1-thread-4] INFO (support.cluster.ClusteredNode) Removing member 46 from the cluster cache
- 2016-12-15 22:41:58,998 [pool-1-thread-1] INFO (support.cluster.MemberLeftThread) Removing member id: 106 from cache because it left cluster
- 2016-12-15 22:41:59,000 [pool-1-thread-1] INFO (support.cluster.ClusteredNode) Removing member 106 from the cluster cache
Expected:
- 2016-12-15 00:02:33,153 [Main Thread] INFO (api.batch.BatchRunStatusHelper) Ending BRT values - batch nbr: 225, rerun nbr: 0, status: 40
- 2016-12-15 00:02:33,155 [Main Thread] INFO (api.batch.BatchRunStatusHelper) Batch Number: 225
- 2016-12-15 00:02:34,167 [Main Thread] INFO (api.batch.SubmitBatch) Run ended successfully with exit code 0
Return Code 0
Investigation:
Check to ensure the job is running through the thread pool worker by following these steps:
- From the job report, you'll see something similar to this:
submitjob.sh -t 1 -g NNNN -l ENG -u SYSUSER -d 2016-12-15 -c 1 -b CM1MITD -p TPW_CCBPROD -x METER_SIZE_CHAR_TYPE="MTR SIZE",ITEM_TYPE="MTR1IN",EXCLUSION_CHAR_TYPE="CM-EXMTR",DISTRICT_CHAR_TYPE="DISTRICT",TO_DO_TYPE="CM1MITD"
Be sure to run the command from the command line in a PuTTY session on the CCB batch server. This is critical because it shows whether the job works correctly outside of Automic.
- Run the job as the 'cissys' user. First, check whether the environment is set:
echo $SPLENVIRON
If the output is blank, the environment is not set. You will need to go back to the UI and copy the following value:
CCB > CONFIG > VARA.CCB.SETTINGS > SPLENVIRON_SCRIPT > copy the value to notepad.
Example:
. /u01/ccbprod/middleware/spl/CCBPROD/bin/splenviron.sh -e CCBPROD -c /bin/true
Running this line sources splenviron.sh and sets the environment. The environment must be set before any CCB job can run successfully.
Note: You may want to change the -d <date parameter>, if applicable.
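The environment check described above can be sketched as a small shell helper. This is only an illustration: the function name check_splenviron is hypothetical and not part of CCB or Automic, and it assumes, as stated above, that a blank SPLENVIRON means the environment has not been sourced.

```shell
# Hypothetical helper (not part of CCB): report whether the CCB
# environment has been sourced in the current shell session.
check_splenviron() {
    if [ -z "${SPLENVIRON:-}" ]; then
        # Blank value: splenviron.sh has not been sourced yet.
        echo "SPLENVIRON is not set; source splenviron.sh first"
        return 1
    fi
    echo "SPLENVIRON=$SPLENVIRON"
}
```

If the function returns non-zero, source the SPLENVIRON_SCRIPT line copied from the UI and run the check again before submitting the job.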
- The CCB admin can check the thread pool workers as follows:
Command to check:
. /u01/ccbprod/middleware/spl/CCBPROD/bin/splenviron.sh -e CCBPROD -c /bin/true
This is a standard Oracle command. It has nothing to do with Automic.
cd $SPLOUTPUT
ls -lrt
If it takes a while to return a list of files, then there are too many (possibly old) CCB files. Our recommendation (for the CCB admin) would be to perform clean-up on a regular basis.
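To gauge how much clean-up is needed, the number of old files can be counted before anything is removed. The following is a minimal sketch: the function name count_old_files is hypothetical, and the 30-day threshold in the usage comment is an assumed example, not an Oracle or Automic recommendation.

```shell
# Hypothetical helper: count regular files under a directory that were
# last modified more than N days ago. It only counts; it deletes nothing.
count_old_files() {
    dir=$1
    days=$2
    # tr strips the padding that some wc implementations add.
    find "$dir" -type f -mtime +"$days" | wc -l | tr -d ' '
}

# Usage (threshold is an assumed example):
#   count_old_files "$SPLOUTPUT" 30
```

A large count suggests that old output files are slowing down the directory listing and that a regular clean-up should be scheduled.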
Note: Running the job from batch is different from running it from the front end. It is crucial to confirm that the job can run from batch. If a CCB developer believes there are no issues with the job, confirm that they are running it from batch and not from the front end.
Objective:
You want to check each thread pool worker to ensure that it picked up that job.
Commands:
grep CM1MITD threadpoolworker.TPW_CCBPROD.2016121515*
grep -n CM1MITD threadpoolworker.TPW_CCBPROD.2016121515*
The thread pool worker needs to acknowledge and pick up the job.
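When there are many thread pool worker logs, it can help to see at a glance which files mention the job at all. The sketch below wraps grep -c for that purpose; the function name logs_mentioning_job is hypothetical, and the log file pattern in the usage comment is the one shown above.

```shell
# Hypothetical helper: print "file:count" for each log file that
# mentions the given batch code at least once.
logs_mentioning_job() {
    job=$1
    shift
    # /dev/null forces grep to prefix file names even for a single log;
    # files with zero matches (including /dev/null) are filtered out.
    grep -c -- "$job" "$@" /dev/null | grep -v ':0$'
}

# Usage:
#   logs_mentioning_job CM1MITD threadpoolworker.TPW_CCBPROD.2016121515*
```

If the function prints nothing, none of the listed logs mention the job, which suggests that no thread pool worker picked it up.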
Conclusion:
If it does not work from batch, then the issue is external and has nothing to do with Automic.
Recommendation:
At the end of the batch day, Oracle recommends killing and restarting the thread pool workers.
From the CCB side:
- Enable tracing in the command ("-g NNNN")
The CCB dev can set the parameter to "-g YYYY", which will provide additional information since this enables trace for all parameters.
- Validate the parameters
- Kill the thread pool workers
To verify:
ps -ef | grep submit
This will check to ensure there are TPW processes running.
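The ps output above usually also lists the grep process itself, which can be misleading when checking whether anything is still running. The following sketch filters that line out; the function name filter_processes is hypothetical.

```shell
# Hypothetical helper: read ps output on stdin and keep only lines that
# match the pattern, dropping the grep process itself.
filter_processes() {
    grep -- "$1" | grep -v 'grep'
}

# Usage:
#   ps -ef | filter_processes submit
```

An empty result means no matching processes were found at the time the command ran.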
Oracle Resources for Best Practices:
Batch Best Practices for Oracle Utilities Application Framework based products (Doc Id: 836362.1)
Production Environment Configuration Guidelines (Doc Id: 1068958.1)
Solution:
Within Automic: Kill the thread pool workers and restart them.
Outside of Automic: Submit the job from the CCB front end.