Scheduler process hung due to One Automation Event Management Integration


Article ID: 232169


Products

Autosys Workload Automation

Issue/Introduction

The scheduler fails to complete startup. The process hangs after logging the messages below in the event_demon log and does not proceed any further.
event_demon log ($AUTOUSER/out/event_demon.$AUTOSERV):
CAUAJM_I_40275 Log Rollover level set to <MIDNIGHT,SIZE(100),PURGE(365)>.
CAUAJM_I_40249 Debug level set to <LIGHT,HEAVY,DBQUERY,GBE>.
CAUAJM_I_40378 One Automation Event Management messaging interface initialized successfully
CAUAJM_I_40244 EnableIPCaching value set to <1>.
CAUAJM_I_40244 AggregateStatistics value set to <1>.
CAUAJM_I_10654 System is running in dual server mode. Event server 1: AUTODBPRI. Event server 2: AUTODBSHD.
CAUAJM_W_00173 System appears to have a running primary scheduler. Checking for primary activity.
CAUAJM_W_00176 The primary is not active. Startup proceeding.
CAUAJM_W_40330 The shadow is not active. Startup proceeding.
----
The process does not respond to a graceful shutdown (unisrvcntr stop waae_sched.$AUTOSERV) and keeps running. The snippet below is the pstack output of the scheduler process after the stop command was issued:
#pstack <scheduler_pid>

Thread 4 (Thread 0x7f498d1dc6c0 (LWP 106115)):
#0 0x00007f499c690e9d in nanosleep () from /lib64/libpthread.so.0
#1 0x00007f498448eb71 in kpucpincrtime () from /home/oracle/product/19c/client/lib/libclntsh.so
#2 0x00007f499c689ea5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f499b9009fd in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f498f3b5700 (LWP 106157)):
#0 0x00007f499c68da35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f499c43d0aa in msgMutex::ConditionedWait(bool) () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#2 0x00007f499c43e9f2 in ProtectedInt::Wait() () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#3 0x00007f499d570009 in DbMonitor::Iterate() () from /opt/CA/WAAER12/autosys/lib/libcxmgr.so
#4 0x00007f499c43db97 in ThreadTE () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#5 0x00007f499c689ea5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f499b9009fd in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f498d1bb700 (LWP 106158)):
#0 0x00007f499b8f5ccd in poll () from /lib64/libc.so.6
#1 0x00007f498fd05c13 in Curl_poll () from /opt/CA/WAAER12/autosys/lib/libcacurl.so
#2 0x00007f498fd03db9 in curl_multi_wait () from /opt/CA/WAAER12/autosys/lib/libcacurl.so
#3 0x00007f498fcfc42e in curl_easy_perform () from /opt/CA/WAAER12/autosys/lib/libcacurl.so
#4 0x000000000065e144 in AsLogEDDAAppender::callRestAPI(AsString, CURLoption, AsString, AsString) ()
#5 0x000000000065efa0 in AsLogEDDAAppender::publishEddaEvent() ()
#6 0x000000000065f5af in AsLogEDDAAppender::write(char const*) ()
#7 0x00007f499c43516b in AsLogger::writetoappenders(char const*, char const*, bool, int) () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#8 0x00007f499c435ff4 in AsLogger::log(AsString const&, int) () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#9 0x00007f499c43cc38 in AsLogger::log() () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#10 0x00007f499c436140 in aslog::endl(AsLogStrm&) () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#11 0x00007f499c45ac88 in operator<<(AsLogStrm&, AsLogStrm& (&)(AsLogStrm&)) () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#12 0x000000000056df7e in EventHandlerThreadHelper::RegisterNext(char const*, int, int, EhxEvent&) ()
#13 0x000000000058540f in EpxThread::RequeEvents() ()
#14 0x0000000000680f72 in GlobalMethods::HdxPostStartup(Hads::eHADesignator, ProtectedBool*, bool, bool) ()
#15 0x0000000000681232 in HdxPostStartup(Hads::eHADesignator, ProtectedBool*, bool, bool) ()
#16 0x000000000059bdbb in HdxThread::Iterate() ()
#17 0x00007f499c43db97 in ThreadTE () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#18 0x00007f499c689ea5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f499b9009fd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f499e5aa740 (LWP 104022)):
#0 0x00007f499c68dde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f499c43d610 in msgMutex::ConditionedTimedWait(int, bool) () from /opt/CA/WAAER12/autosys/lib/libasutil.so
#2 0x000000000059ab8a in HdxThread::HdxWait() ()
#3 0x000000000067b59d in GlobalMethods::createHdxThread() ()
#4 0x0000000000681561 in GlobalMethods::RunScheduler(eRolloverAuthority) ()
#5 0x0000000000684152 in main ()

 

Environment

Release : 12.0, 12.01

Component : CA Workload Automation AE (AutoSys)

Cause

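The pstack extract (Thread 2) shows the scheduler blocked in curl_easy_perform, called from AsLogEDDAAppender::callRestAPI. This confirms that Autosys is still waiting for a response from One Automation Event Management after the integration was initialized.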
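The same check can be run against a captured pstack output. The following is a minimal sketch, not a vendor-supplied tool: the function name is_blocked_on_event_mgmt is hypothetical, and it only tests whether the two frames seen in Thread 2 above appear anywhere in the output, not that they belong to the same thread.

```shell
# Hypothetical helper: reads pstack output on stdin and succeeds (exit 0)
# only if both frames from the hung thread shown above are present.
is_blocked_on_event_mgmt() {
    stack=$(cat)
    printf '%s\n' "$stack" | grep -q 'AsLogEDDAAppender::callRestAPI' &&
        printf '%s\n' "$stack" | grep -q 'curl_easy_perform'
}
```

Example use: pstack <scheduler_pid> | is_blocked_on_event_mgmt && echo "scheduler is waiting on the Event Management REST call"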

Resolution

Once the integration is enabled, Autosys requires One Automation Event Management to be up and running.

By design, in environments where Autosys is integrated with One Automation Event Management for event alerts, the scheduler validates the connection by sending an API request as part of its startup procedure, and it waits for that request to return before it starts processing jobs. In this case, One Automation Event Management does not return a response to the API request, so the Autosys scheduler process hangs.

To resolve the problem, ensure that One Automation Event Management is up and running. Send a test API request and confirm that it returns successfully.

Contact One Automation experts to get the problem resolved.
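A test request with a hard timeout shows quickly whether the endpoint responds at all. The following is a minimal sketch using curl. The function name check_event_mgmt_endpoint is hypothetical, and the URL must be supplied by you, since this article does not state the One Automation Event Management endpoint. Authentication headers, if your installation needs them, are not covered.

```shell
# Hypothetical helper: sends one GET request with a timeout.
# Returns 0 on an HTTP 2xx response, 1 on any other result, 2 on bad usage.
check_event_mgmt_endpoint() {
    url="$1"
    timeout_secs="${2:-10}"   # assumed default of 10 seconds
    if [ -z "$url" ]; then
        echo "usage: check_event_mgmt_endpoint <url> [timeout_seconds]" >&2
        return 2
    fi
    http_code=$(curl --silent --show-error --output /dev/null \
        --connect-timeout "$timeout_secs" --max-time "$timeout_secs" \
        --write-out '%{http_code}' "$url")
    rc=$?
    if [ "$rc" -ne 0 ]; then
        # curl exit code 28 means the request timed out
        echo "No usable response (curl exit code $rc)" >&2
        return 1
    fi
    echo "HTTP status: $http_code"
    case "$http_code" in
        2*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

A timeout here (curl exit code 28) would be consistent with the hang seen in the scheduler, which waits on the same kind of request.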

To work around the problem, disable the One Automation Event Management integration with Autosys:

Log in to the Autosys server.

Kill the hung scheduler process.

Take a backup of the Autosys configuration file ($AUTOUSER/config.$AUTOSERV).

Set the value of the OneAutomationEvents property to "0":

OneAutomationEvents=0

Save the file and restart the scheduler.
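The backup and configuration change from the steps above can be sketched as a small shell function. This is an illustration only: the function name disable_oae_integration and the backup file naming are hypothetical, and it assumes the property appears in the file as a single OneAutomationEvents=<value> line, as shown above. Killing the hung process and restarting the scheduler are left as manual steps.

```shell
# Hypothetical helper: backs up the given config file, then sets
# OneAutomationEvents=0. Returns 2 on bad usage, 1 on any failure.
disable_oae_integration() {
    config_file="$1"
    if [ -z "$config_file" ] || [ ! -f "$config_file" ]; then
        echo "usage: disable_oae_integration <config_file>" >&2
        return 2
    fi
    if ! grep -q '^[[:space:]]*OneAutomationEvents[[:space:]]*=' "$config_file"; then
        echo "OneAutomationEvents not found in $config_file" >&2
        return 1
    fi
    # Timestamped backup next to the original (naming is an assumption)
    backup_file="${config_file}.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$config_file" "$backup_file" || return 1
    tmp_file="${config_file}.tmp.$$"
    sed 's/^[[:space:]]*OneAutomationEvents[[:space:]]*=.*/OneAutomationEvents=0/' \
        "$config_file" > "$tmp_file" || return 1
    # Write back in place so the file keeps its owner and permissions
    cat "$tmp_file" > "$config_file" || return 1
    rm -f "$tmp_file"
    echo "Backup saved as $backup_file"
}
```

Example use, after the hung process has been killed: disable_oae_integration "$AUTOUSER/config.$AUTOSERV", then restart the scheduler.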