
Questions on Agent response to Scheduler failover


Article ID: 257203




CA Workload Automation AE


Question 1: How does the shadow scheduler know that the primary scheduler is active and processing events?

Question 2: If the shadow scheduler performs a heartbeat check to validate the status of the primary scheduler, where can I find the logs that prove the shadow is doing the check?

Question 3: Where can I find the configuration setting for the heartbeat interval?

Question 4: Suppose the primary scheduler is processing events and submitting jobs on MACHINE_A. How is the traffic handled?
The outbound traffic is the primary scheduler submitting jobs on MACHINE_A. Once a job has been processed, how is its status updated in the AutoSys database?
I am aware that this goes through the scheduler, but how does the agent decide whether the job status traffic is sent to the primary or the shadow scheduler?

If the job was submitted by the primary scheduler, presumably the status is also sent to the primary scheduler.
But suppose the primary scheduler goes down after the job has been submitted to the agent, and the shadow takes over event processing. How does the agent know that the primary is down, and how does it know to send the status to the shadow when the job was submitted by the primary scheduler?

Question 5: Where can we find the primary and secondary scheduler details in AutoSys?
How does communication take place between the primary and shadow schedulers?

Question 6: How does the agent know how many schedulers it is talking to?


Autosys 12.x


Answer 1: Both the primary and shadow schedulers update the AutoSys database (Event Server) with heartbeat information.


Answer 2: In the event_demon shadow log, each time a heartbeat is sent there is a line such as:
[01/04/2023 10:25:00]     ----------------------------------------
[01/04/2023 10:26:00]     ----------------------------------------
[01/04/2023 10:27:00]     ----------------------------------------

To get more detail you would need to increase the logging level, which Support does not recommend.
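To check how regularly those lines appear, a small script can extract the timestamps and compute the gap between consecutive entries. The following is a minimal sketch, not a supported tool. It assumes the timestamp format is MM/DD/YYYY HH:MM:SS and that a heartbeat entry is a timestamp followed only by a run of dashes, as in the sample above. The function names are hypothetical.

```python
import re
from datetime import datetime

# Assumption: a heartbeat entry is a bracketed timestamp followed only by dashes,
# e.g. "[01/04/2023 10:25:00]     ----------------------------------------"
HEARTBEAT_LINE = re.compile(
    r"^\[(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\]\s+-{10,}\s*$"
)


def heartbeat_times(lines):
    """Return the timestamps of all lines that look like heartbeat separators."""
    times = []
    for line in lines:
        match = HEARTBEAT_LINE.match(line)
        if match:
            # Assumption: the date is written month first (MM/DD/YYYY).
            times.append(datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S"))
    return times


def heartbeat_gaps(times):
    """Return the number of seconds between each pair of consecutive timestamps."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


sample = [
    "[01/04/2023 10:25:00]     ----------------------------------------",
    "[01/04/2023 10:26:00]     ----------------------------------------",
    "[01/04/2023 10:27:00]     ----------------------------------------",
]

print(heartbeat_gaps(heartbeat_times(sample)))  # prints [60.0, 60.0]
```

A gap much larger than the usual spacing would suggest a period in which the shadow was not writing these entries.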

Answer 3: The heartbeat interval is controlled by an environment variable that can be set.


Answer 4: After the scheduler fails over from primary to shadow, the active scheduler sends notifications to all the active agents.
The agent then sends all further job status updates to the active scheduler.

An AutoSys instance is identified in the agent config by the "communication.managerid_n" parameter.
It will always be in the format of "XXX_SCH" where XXX is the AutoSys instance name.
For each one of those entries in the agentparm.txt, there is a corresponding "communication.manageraddress_n" parameter, which is set to the host where the active scheduler is currently running for that instance.
When there is a scheduler failover in an AutoSys instance, the "communication.manageraddress_n" parameter for that instance on each active agent is automatically updated to the active scheduler's hostname.

When an AutoSys job runs on an agent, the agent associates that job with a managerid.
Each time the agent needs to send a message back to the scheduler for that job, it looks up the current value of the manageraddress for that managerid.
Therefore, if that manageraddress changes while a job is running, the completion event will be sent to the current active scheduler at the time the job completes.
No manual agent config update is required.
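The lookup described above can be modeled in a few lines. This is only an illustrative sketch of the behavior, not the agent's actual implementation. The class, method names, instance name ACE, and hostnames are all hypothetical.

```python
class ManagerTable:
    """Toy model of an agent's managerid to manageraddress mapping."""

    def __init__(self):
        self._address_by_manager = {}  # managerid -> current manageraddress
        self._manager_by_job = {}      # job name -> managerid

    def set_address(self, manager_id, address):
        # Models the automatic update that follows a scheduler failover.
        self._address_by_manager[manager_id] = address

    def start_job(self, job_name, manager_id):
        # A running job is tied to a managerid, not to a hostname.
        self._manager_by_job[job_name] = manager_id

    def status_destination(self, job_name):
        # The address is resolved at send time, so it reflects any failover
        # that happened while the job was running.
        manager_id = self._manager_by_job[job_name]
        return self._address_by_manager[manager_id]


table = ManagerTable()
table.set_address("ACE_SCH", "primary-host")   # hypothetical instance ACE
table.start_job("nightly_load", "ACE_SCH")     # job submitted by the primary

table.set_address("ACE_SCH", "shadow-host")    # failover while the job runs

print(table.status_destination("nightly_load"))  # prints shadow-host
```

Because the job stores the managerid rather than the hostname, the completion event follows whichever scheduler is active when it is sent.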

Answer 5: No communication takes place directly between the primary and shadow scheduler. This is all handled within the database.


Answer 6: An agent can talk to multiple schedulers at the same time as long as their instance IDs are different.
This is different from an HA operation.
The agent keeps track of all the schedulers it communicates with in agentparm.txt.
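To see which schedulers an agent currently knows about, the relevant parameters can be pulled out of agentparm.txt. The sketch below assumes the file uses simple `key=value` lines with `#` for comments. The sample content, instance names, and hostnames are hypothetical; only the parameter names come from this article.

```python
import re

# Matches communication.managerid_<n> and communication.manageraddress_<n>.
PARAM = re.compile(r"^communication\.(managerid|manageraddress)_(\d+)\s*=\s*(.*)$")


def list_managers(text):
    """Return {managerid: manageraddress} for each numbered manager entry."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' starts a comment
            continue
        match = PARAM.match(line)
        if match:
            kind, index, value = match.group(1), int(match.group(2)), match.group(3)
            entries.setdefault(index, {})[kind] = value.strip()
    return {
        entry["managerid"]: entry.get("manageraddress")
        for _, entry in sorted(entries.items())
        if "managerid" in entry
    }


# Hypothetical excerpt of an agentparm.txt serving two AutoSys instances.
sample_agentparm = """
# manager entries
communication.managerid_1=ACE_SCH
communication.manageraddress_1=sched-host-a
communication.managerid_2=DEV_SCH
communication.manageraddress_2=sched-host-b
"""

print(list_managers(sample_agentparm))
# prints {'ACE_SCH': 'sched-host-a', 'DEV_SCH': 'sched-host-b'}
```

Each distinct managerid corresponds to one AutoSys instance, and the address next to it is the host the agent currently treats as that instance's active scheduler.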
