When a CA Workload Automation AE job raises the error below, what does it mean?
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
[*** ALARM ***]
AUTO_PING
<COMM_ERR_14 Agent on machine [machine_name] has not acknowledged this job request. Please investigate the status of this job.>
Release: 11.3.6, 12.x
When a machine is unavailable, an AutoSys job submitted to that machine goes into the pending machine status (PEND_MACH).
The pending job is started once the machine comes back online.
This error can occur when a job is sent to the machine successfully but no acknowledgement is received from the Agent in a timely manner.
When the Agent does not immediately acknowledge the job start event, the CA WA AE Scheduler raises the AUTO_PING alarm with the COMM_ERR_14 message to warn that manual intervention may be required for the job.
If RUNNING and SUCCESS statuses are later received for the job, the alarm can be ignored and no manual intervention is required.
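The check described above can be sketched in Python. This is only an illustration, not part of the product: the `JobEvent` type and the `alarm_can_be_ignored` function are hypothetical names, and the events are assumed to have been collected by hand from the job's run report rather than parsed from real output.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass
class JobEvent:
    """One status event for a job (hypothetical structure for this sketch)."""
    status: str      # e.g. "STARTING", "RUNNING", "SUCCESS"
    time: datetime   # when the Scheduler recorded the event


def alarm_can_be_ignored(events: Iterable[JobEvent], alarm_time: datetime) -> bool:
    """Return True if both RUNNING and SUCCESS were recorded at or after the alarm."""
    later_statuses = {e.status for e in events if e.time >= alarm_time}
    return {"RUNNING", "SUCCESS"} <= later_statuses


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    alarm = datetime(2024, 1, 1, 10, 0, 5)
    events = [
        JobEvent("STARTING", datetime(2024, 1, 1, 10, 0, 0)),
        JobEvent("RUNNING", datetime(2024, 1, 1, 10, 0, 20)),
        JobEvent("SUCCESS", datetime(2024, 1, 1, 10, 1, 0)),
    ]
    print(alarm_can_be_ignored(events, alarm))
```

If the function returns False, for example because only STARTING was recorded, the job still needs to be investigated manually.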
The most common causes of this situation are:
- A network glitch between the Scheduler and Agent, which might cause the acknowledgement from the Agent to be missed.
- The Agent on that machine is busy processing a large number of jobs and does not reply quickly enough.
- The Agent machine is under heavy CPU load from other applications running on it, so the Agent does not have enough CPU resources to process requests in a timely manner.