
Long running CCB job stuck with thread pool workers


Article ID: 84419


Updated On:


CA Automic Workload Automation - Automation Engine


Error Message :
log4j:ERROR Failed to flush writer, Bad file descriptor

Note: This article is meant for a CCB administrator only.

A CCB batch job gets stuck with the thread pool workers and cannot proceed. The job report shows messages like the following:

 -  2016-12-15 22:40:09,064 [pool-1-thread-4] INFO  (support.cluster.MemberLeftThread) Removing member id: 46 from cache because it left cluster
 -  2016-12-15 22:40:09,066 [pool-1-thread-4] INFO  (support.cluster.ClusteredNode) Removing member 46 from the cluster cache
 -  2016-12-15 22:41:58,998 [pool-1-thread-1] INFO  (support.cluster.MemberLeftThread) Removing member id: 106 from cache because it left cluster
 -  2016-12-15 22:41:59,000 [pool-1-thread-1] INFO  (support.cluster.ClusteredNode) Removing member 106 from the cluster cache

 -  2016-12-15 00:02:33,153 [Main Thread] INFO  (api.batch.BatchRunStatusHelper) Ending BRT values - batch nbr: 225, rerun nbr: 0, status: 40
 -  2016-12-15 00:02:33,155 [Main Thread] INFO  (api.batch.BatchRunStatusHelper) Batch Number: 225
 -  2016-12-15 00:02:34,167 [Main Thread] INFO  (api.batch.SubmitBatch) Run ended successfully with exit code 0
Return Code 0
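
To find these messages quickly in a saved job report, a small shell helper can pull out the IDs of the members that left the cluster. This is a minimal sketch and not part of Automic or CCB: the function name left_members and the report file name are hypothetical, and it assumes the report lines have the same format as the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: print the ID of every cluster member that the job
# report says was removed because it left the cluster.
# Reads the named files, or stdin if no file is given.
left_members() {
    # -n suppresses default output; the trailing p prints only the lines
    # where the substitution matched, reduced to the captured numeric ID.
    sed -n 's/.*(support\.cluster\.MemberLeftThread) Removing member id: \([0-9][0-9]*\) from cache.*/\1/p' "$@"
}

# Example (the file name is an assumption):
#   left_members job_report.txt
```

For the excerpt above, this would print 46 and 106, one per line.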

Check to ensure the job is running through the thread pool worker by following these steps: 

  1. In the job report, find the submit parameters. They look similar to this: -t 1 -g NNNN -l ENG -u SYSUSER -d 2016-12-15 -c 1 -b CM1MITD -p TPW_CCBPROD -x METER_SIZE_CHAR_TYPE="MTR SIZE",ITEM_TYPE="MTR1IN",EXCLUSION_CHAR_TYPE="CM-EXMTR",DISTRICT_CHAR_TYPE="DISTRICT",TO_DO_TYPE="CM1MITD"
Be sure to run the command from the command line in a PuTTY session on the CCB batch server. This is critical because it shows whether the job works correctly outside of Automic.
  2. Run the job as the 'cissys' user.
If the environment is blank, then nothing is set. Go back to the UI and copy the following value:
CCB > CONFIG > VARA.CCB.SETTINGS > SPLENVIRON_SCRIPT > copy the value to notepad.
. /u01/ccbprod/middleware/spl/CCBPROD/bin/ -e CCBPROD -c /bin/true
This sources the SPLENVIRON line and sets the environment. The environment must be set up before any CCB job can run successfully.
Note: You may want to change the -d <date parameter>, if applicable.
  3. The CCB admin can check the thread pool workers as follows:
Command to check: 
. /u01/ccbprod/middleware/spl/CCBPROD/bin/ -e CCBPROD -c /bin/true
This is a standard Oracle command. It has nothing to do with Automic.
ls -lrt
If it takes a while to return a list of files, then there are too many (possibly old) CCB files.  Our recommendation (for the CCB admin) would be to perform clean-up on a regular basis.
Note: Running the job from batch is different from running it from the front end. It is crucial that the job can run from batch. If a CCB developer believes there are no issues with the job, confirm that they are running it from batch and not from the front end.
You want to check each thread pool worker to ensure that it picked up that job.
grep CM1MITD threadpoolworker.TPW_CCBPROD.2016121515*
grep -n CM1MITD threadpoolworker.TPW_CCBPROD.2016121515*
The thread pool worker needs to acknowledge and pick up the job.
If it does not work from batch, then the issue is external and has nothing to do with Automic.
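
The per-worker check above can be wrapped in a small function that reports how many lines in each thread pool worker log mention the batch code. This is a minimal sketch, assuming the logs are plain text files. The name tpw_job_hits is hypothetical and is not an Oracle or Automic command.

```shell
#!/bin/sh
# Hypothetical helper: for each log file, print "<matching lines> <file>".
# Usage: tpw_job_hits BATCH_CODE LOG_FILE...
tpw_job_hits() {
    batch_code=$1
    shift
    for log in "$@"; do
        # Skip anything that is not a regular file (for example an unmatched glob).
        [ -f "$log" ] || continue
        # -c counts matching lines, -F treats the batch code as a fixed string.
        # grep exits non-zero when the count is 0, so "|| true" keeps going.
        hits=$(grep -c -F -- "$batch_code" "$log" || true)
        printf '%s %s\n' "$hits" "$log"
    done
}

# Example, using the batch code and log names from this article:
#   tpw_job_hits CM1MITD threadpoolworker.TPW_CCBPROD.2016121515*
```

A log that reports 0 never mentioned the job, which suggests that worker did not pick it up.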

At the end of the batch day, Oracle recommends killing and restarting the thread pool workers.
From the CCB side:
  1.  Enable tracing in the command “-g NNNN”
    The CCB dev can set the command to “-g YYYY”, which will provide additional information since this command will enable trace for all parameters.
  2.  Validate the parameters
  3. Kill the thread pool workers

To verify:
ps -ef | grep submit

This checks that TPW processes are running.
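
When this check is run by hand, the grep process itself usually appears in the list and can be mistaken for a running submit process. A common shell idiom, sketched here as a hypothetical helper named submit_procs, wraps one character of the pattern in brackets so that grep's own command line no longer matches.

```shell
#!/bin/sh
# Hypothetical helper: filter `ps -ef` output (read from stdin) down to
# the lines that mention "submit". The pattern [s]ubmit matches the text
# "submit", but grep's own command line contains the literal "[s]ubmit",
# which the pattern does not match, so grep does not list itself.
submit_procs() {
    grep '[s]ubmit'
}

# Usage:
#   ps -ef | submit_procs
```

The function returns a non-zero exit status when no matching process is found, so it can also be used in an if test.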
Oracle Resources for Best Practices:
Batch Best Practices for Oracle Utilities Application Framework based products (Doc Id: 836362.1)
Production Environment Configuration Guidelines (Doc Id: 1068958.1)

Within Automic: Kill the thread pool workers and restart them.
Outside of Automic: Submit the job from the CCB front end.


Cause type:
Root Cause: This is not an Automic issue. It is an issue with Oracle Utilities Customer Care and Billing.


OS: Unix



Fix Status: No Fix

Additional Information

Workaround :