Delay in job processing during AutoSys failover scheduler rollover

Article ID: 428078

Updated On:

Products

Autosys Workload Automation

Issue/Introduction

When performing a Scheduler failover using the sendevent command, the process takes several minutes to complete.
This delay occurs before the shadow scheduler fully takes over and resumes job processing.
You may observe a lag time longer than expected during testing.

SYMPTOMS:

  • Failover takes several minutes to complete

  • Lag time observed during Event Processor (EP) rollover

CONTEXT: Testing Scheduler Failover with command: sendevent -e stop_demon -v failover

Environment

 

  • AutoSys Workload Automation (AutoSys) 12.x, 24.x

  • Operating System: [Platform Independent]

 

Resolution

EXPLANATION:
During a failover, the shadow scheduler must check the status of all defined agents.
It sends an update to every agent to inform them that it is the new active scheduler.
This process typically takes a few minutes to propagate.

If the environment contains many agents that are offline, missing, or unreachable, the process slows down significantly due to connection timeouts and retries.

STEPS:

  1. CLEAN UP MACHINE DEFINITIONS

    Review the current machine definitions in the environment.

    Identify agents that are:

    • Decommissioned

    • Permanently offline

    • Unreachable

    Remove or update these definitions to ensure the scheduler only attempts to contact active agents.

    EXPECTED: Reduced failover time as the scheduler contacts fewer unreachable agents.
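
    To build a candidate list for the cleanup, each defined machine can be probed before the next failover test. The following is a minimal sketch, not a supported tool. It assumes the autoping client utility is on the PATH and that a zero exit status from "autoping -m <machine>" means the agent answered; verify both against your release. The function names and the timeout value are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List


def agent_responds(machine: str, timeout_s: int = 30) -> bool:
    """Probe one agent with `autoping -m <machine>`.

    Assumption: exit status 0 means the agent answered. A missing
    autoping binary or a probe that exceeds timeout_s counts as
    "did not respond".
    """
    try:
        result = subprocess.run(
            ["autoping", "-m", machine],
            capture_output=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def find_unreachable(
    machines: Iterable[str],
    probe: Callable[[str], bool] = agent_responds,
) -> List[str]:
    """Return the machine names whose probe failed, in input order.

    Blank names are skipped. The probe is injectable so the logic can
    be exercised without a live AutoSys instance.
    """
    unreachable = []
    for name in machines:
        name = name.strip()
        if name and not probe(name):
            unreachable.append(name)
    return unreachable
```

    The machine names would come from your own inventory of machine definitions (for example, the names reported by autorep -M ALL). Treat the result as a list to review, not a list to delete: an agent that is down for maintenance is not necessarily decommissioned.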

  2. CAPTURE DEBUG LOGS (IF DELAY PERSISTS)

    If the delay remains excessive after cleanup, enable debug mode logging on both the primary and shadow schedulers during a failover test.

    Review logs to identify specific timeouts or bottlenecks.
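
    A first pass over a large debug log can be automated with a simple keyword filter. The sketch below is illustrative only: the keyword list is an assumption, not a catalog of actual AutoSys message text, and should be adjusted to the messages you see in your own scheduler logs. The function name is hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

# Assumed keywords; replace with the wording found in your own logs.
KEYWORDS = ("timeout", "timed out", "unreachable", "retry")


def suspicious_lines(
    lines: Iterable[str],
    keywords: Sequence[str] = KEYWORDS,
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for lines containing any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    lowered_keywords = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(k in lowered for k in lowered_keywords):
            hits.append((number, line.rstrip("\n")))
    return hits
```

    To use it, pass an open file object for the scheduler log captured during the failover test, for example suspicious_lines(open(path, errors="replace")), where path points to the log file on the primary or shadow scheduler host. Comparing the timestamps of the matching lines against the time the failover was issued helps show where the delay accumulates.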