Failover process

Article ID: 221556

Products

CA Workload Automation AE - Scheduler (AutoSys)

Issue/Introduction

Hello Team,

We patch the OS on the AutoSys scheduler servers weekly. Below is the process we follow to isolate the services.

1. Fail over the primary scheduler to the shadow by running the command "sendevent -E STOP_DEMON -v FAILOVER".

2. Wait about 2 minutes for the shadow scheduler to take over and process events.

3. Bring down the whole instance by running "sendevent -E STOP_DEMON -v ROLE=PAST".

4. After confirming that all services are down, we swap the RoleDesignator values on the primary and shadow so that the server under maintenance (server A) has RoleDesignator set to "2".

Note: This is a precaution. If the scheduler comes up automatically after the reboot, it will start only as a shadow scheduler and will not cause any impact.

5. We then bring up the primary scheduler process on server B and let it run until maintenance ends on server A.

We issue the failover command in Step 1 so that the agents are informed of the currently active scheduler. When it runs, we see a message like this:

"Contacting active agents to send updated scheduler communication attributes"

Now, the question is: "Do we really need to issue the failover command and follow the steps above, or can we bring down the whole instance directly and swap the roles?"

Environment

Release : 11.3.6

Component : CA Workload Automation AE (AutoSys)

Resolution

Your logic is sound: if you are stopping everything anyway, there is little reason to initiate the failover via sendevent first.
You should be able to simply stop everything, toggle the RoleDesignator values, save the config files, and restart the scheduler you have chosen as the new primary.
When that primary starts up, it reaches out to all the agents to let them know "I am the primary, send me your statuses."
Then, when maintenance is done, you can again stop everything, toggle the RoleDesignator values, save the config files, and restart everything.
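
The "toggle the RoleDesignator values" step can be scripted to avoid hand-editing mistakes. The following is only a minimal sketch, assuming the setting appears in the config file as a plain "RoleDesignator=<number>" line; the function name swap_role_designator is hypothetical and is not part of AutoSys. It works on the file contents as text, so reading and saving the actual config file is left to you, and it should be run only while all scheduler services are stopped.

```python
import re

# Assumption: the config file holds a line of the form "RoleDesignator=<number>".
# swap_role_designator is a hypothetical helper, not an AutoSys utility.
_ROLE_LINE = re.compile(r"(?m)^([ \t]*RoleDesignator[ \t]*=[ \t]*)(\d+)")

# 1 and 2 are interchanged; any other value is left untouched.
_SWAP = {"1": "2", "2": "1"}


def swap_role_designator(config_text):
    """Return config_text with RoleDesignator values 1 and 2 interchanged."""

    def _replace(match):
        prefix, value = match.group(1), match.group(2)
        return prefix + _SWAP.get(value, value)

    return _ROLE_LINE.sub(_replace, config_text)


if __name__ == "__main__":
    # Illustrative input only; a real config file contains many other settings.
    before = "RoleDesignator=1\n"
    print(swap_role_designator(before), end="")
```

Running the function once on each server's config text swaps the roles before maintenance; running it again afterwards restores the original values. Lines that are commented out (for example, starting with "#") are not matched and are left as they are.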