CAUAJM_E_40111 Unable to fetch job details using internal job identifier

Article ID: 198806

Products

CA Workload Automation AE - Business Agents (AutoSys)
CA Workload Automation AE - System Agent (AutoSys)
CA Workload Automation AE - Scheduler (AutoSys)
CA Workload Automation Agent
CA Workload Automation AE

Issue/Introduction

A job repeatedly tries to update the AutoSys database and fails.

Running autorep against the job returns no results:

$ autorep -j T9604CC_PDG_Contflow_DXMLResponse_PRJ1_TEST_D

$ autoflags -a
1589 LINUX ORA 11.3.6 SP7 2a0af2b3 

[08/31/2020 02:46:31]      CAUAJM_E_40225 Trouble processing Event [DEV0bjz6fc00]!
[08/31/2020 02:46:31]      CAUAJM_E_10283 Exhausted maximum number of retries for scheduler operation [get_sched_info<423318>]
[08/31/2020 02:46:31]      CAUAJM_E_40111 Unable to fetch job details using internal job identifier <423,318>. Event processing for DEV0bjz6fd00 aborted.
[08/31/2020 02:46:31]      CAUAJM_I_40245 EVENT: ALARM            ALARM: EVENT_HDLR_ERROR JOB: T9604CC_PDG_Contflow_DXMLResponse_PRJ1_TEST_D
[08/31/2020 02:46:31]      <An error occurred while processing event <DEV0bjz6ey00> for job [T9604CC_PDG_Contflow_DXMLResponse_PRJ1_TEST_D 423318.206068755.0].>
[08/31/2020 02:46:31]      CAUAJM_E_40225 Trouble processing Event [DEV0bjz6fd00]!
[08/31/2020 02:46:31]      CAUAJM_E_10283 Exhausted maximum number of retries for scheduler operation [get_sched_info<423318>]
[08/31/2020 02:46:31]      CAUAJM_E_40111 Unable to fetch job details using internal job identifier <423,318>. Event processing for DEV0bjz6fe00 aborted.
[08/31/2020 02:46:31]      CAUAJM_I_40245 EVENT: ALARM            ALARM: EVENT_HDLR_ERROR JOB: T9604CC_PDG_Contflow_DXMLResponse_PRJ1_TEST_D
[08/31/2020 02:46:31]      <An error occurred while processing event <DEV0bjz6ez00> for job [T9604CC_PDG_Contflow_DXMLResponse_PRJ1_TEST_D 423318.206068755.0].>
[08/31/2020 02:46:31]      CAUAJM_E_40225 Trouble processing Event [DEV0bjz6fe00]!
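
The CAUAJM_E_40111 message prints the internal job identifier (joid) with a thousands separator (<423,318>), while the job run identifier in the same log shows it as 423318. The following is a minimal sketch, not a vendor-supplied tool, for pulling the distinct joids out of a scheduler log and stripping the separator. The function name extract_joids is hypothetical, and the log is read from standard input so that no log path is assumed.

```shell
#!/bin/sh
# Sketch: list the distinct joids reported in CAUAJM_E_40111 messages.
# extract_joids is a hypothetical helper name; it reads the log on stdin.
extract_joids() {
    # Keep only the text between < and > that follows "internal job identifier",
    # then drop the thousands separator and de-duplicate.
    sed -n 's/.*CAUAJM_E_40111 .*internal job identifier <\([0-9,]*\)>.*/\1/p' \
        | tr -d ',' \
        | sort -u
}
```

A possible invocation is `extract_joids < scheduler.log`, where scheduler.log stands in for wherever the scheduler log is kept on the system in question.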

Environment

Release : 11.3.6

Component : CA Workload Automation AE (AutoSys)

Resolution

For unknown reasons, the job's definition was damaged in the database.
Its rows were missing from tables such as ujo_job_tree and ujo_job_status.
When an agent sent a completion event for the job, the scheduler could not process it because the ujo_job_status table had no entry for the joid (423318).
As a result, the scheduler raised EVENT_HDLR_ERROR alarms continuously.
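
Before changing anything, it can help to see which tables still hold rows for the affected joid. The sketch below only prints row-count queries; it does not connect to the database. The table list is taken from the cleanup statements in this article, the function name print_joid_checks is hypothetical, and the generated statements would need to be run with whatever SQL client the site uses.

```shell
#!/bin/sh
# Sketch: print one row-count query per table for a given joid.
# print_joid_checks is a hypothetical helper; it only writes SQL text to stdout.
print_joid_checks() {
    joid=$1
    # Accept only a plain non-empty integer so the value is safe to embed in SQL.
    case $joid in
        ''|*[!0-9]*) echo "joid must be numeric" >&2; return 1 ;;
    esac
    # Tables named in this article's cleanup statements.
    for table in ujo_job ujo_job_status ujo_job_tree ujo_job_cond ujo_sched_info ujo_event; do
        printf 'select count(*) from %s where joid = %s;\n' "$table" "$joid"
    done
}
```

For the job in this article, `print_joid_checks 423318` emits six select statements. A count of zero for ujo_job_status alongside a non-zero count for ujo_job would be consistent with the damage described above.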

The resolution was:

Stop the scheduler.
In the database, run the following statements:

delete from ujo_job where joid = 423318;
delete from ujo_job_status where joid = 423318;
delete from ujo_job_tree where joid = 423318;
delete from ujo_job_cond where joid = 423318;
delete from ujo_sched_info where joid = 423318;
delete from ujo_event where joid = 423318;

Perform a cold start of the agent on which the job runs. A cold start means stopping the agent, deleting the agent's database, log, and spool directories, and then restarting the agent.

Restart the scheduler.

Reinsert the job's JIL definition.