Primary scheduler and Application Server have crashed/stopped - Shadow has taken over


Article ID: 257581

Updated On:

Products

Autosys Workload Automation

Issue/Introduction

The primary scheduler and the application server have crashed or stopped, and the shadow scheduler has taken over processing.

 

[01/11/2023 00:00:00]      CAUAJM_E_00035 Unable to open file </opt/CA/WorkloadAutomationAE/autouser.ACE/out/event_demon.ACE>
[01/11/2023 00:00:00]      CAUAJM_E_00036 Function failed: <open()> Errno: <13> Error: <Permission denied>
ORA-24550: signal received: [si_signo=11] [si_errno=0] [si_code=1] [si_int=0] [si_ptr=(nil)] [si_addr=(nil)]
kpedbg_dmp_stack()+297<-kpeDbgCrash()+75<-kpeDbgSignalHandler()+107<-skgesig_sigactionHandler()+219<-fileno_unlocked()+17<-_Z18EPPostFileRolloveri()+163<-_ZN25AsLogRolloverFileAppender12rolloverfileEv()+1023<-_ZN35AsLogRolloverFileAppenderTickThread7IterateEv()+617<-ThreadTE()+71<-start_thread()+201<-clone()+94

 

 

 

Environment

Release :

Cause

The problem was caused by a change to the ownership or permissions of the $AUTOUSER/out directory, the files in it, and a few other folders.

At exactly midnight, when the log rollover happened, the scheduler was denied permission to write to the event_demon.ACE file:

[01/11/2023 00:00:00]      CAUAJM_E_00035 Unable to open file </opt/CA/WorkloadAutomationAE/autouser.ACE/out/event_demon.ACE>
[01/11/2023 00:00:00]      CAUAJM_E_00036 Function failed: <open()> Errno: <13> Error: <Permission denied>

The same error occurred for the as_server log:

[01/11/2023 00:00:00]      CAUAJM_E_00035 Unable to open file </opt/CA/WorkloadAutomationAE/autouser.ACE/out/as_server.ACE>
[01/11/2023 00:00:00]      CAUAJM_E_00036 Function failed: <open()> Errno: <13> Error: <Permission denied>

 

 

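That led to the crash immediately afterwards: the stack trace in the Issue section shows signal 11 raised in the log rollover thread (EPPostFileRollover) right after the failed open().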
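
To confirm the same pattern in another log, the following is a minimal Python sketch that scans log lines for a CAUAJM_E_00035 message followed directly by a CAUAJM_E_00036 message with Errno 13. The function name find_permission_failures and the regular expressions are illustrative assumptions based only on the log lines quoted in this article; they are not part of the product.

```python
import re

# Patterns derived from the two log messages quoted in this article (assumed format).
OPEN_FAIL = re.compile(r"CAUAJM_E_00035 Unable to open file <(?P<path>[^>]+)>")
FUNC_FAIL = re.compile(
    r"CAUAJM_E_00036 Function failed: <(?P<func>[^>]+)> Errno: <(?P<errno>\d+)>"
)

EACCES = 13  # "Permission denied", as reported in the log


def find_permission_failures(lines):
    """Return the file paths whose open failure was followed by Errno 13.

    Hypothetical helper: it pairs each CAUAJM_E_00035 line with the
    CAUAJM_E_00036 line that comes directly after it.
    """
    failures = []
    pending_path = None
    for line in lines:
        opened = OPEN_FAIL.search(line)
        if opened:
            pending_path = opened.group("path")
            continue
        failed = FUNC_FAIL.search(line)
        if failed and pending_path is not None:
            if int(failed.group("errno")) == EACCES:
                failures.append(pending_path)
        # Any other line breaks the pairing.
        pending_path = None
    return failures
```

Passing the lines of a scheduler or application server log to find_permission_failures returns the list of files that could not be opened because of permissions, which points to the directories whose ownership or mode should be checked.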

Resolution

The customer identified that a system administrator had changed the ownership and permissions on these folders and files, which caused the issue. Once the ownership and permissions were corrected, the issue was resolved.
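
As a quick way to verify the fix before the next midnight rollover, the following is a minimal Python sketch that lists entries in a directory that the current user cannot write. The function name unwritable_entries is a hypothetical helper, not a product utility, and the sketch assumes it is run as the account that owns and runs the scheduler and application server.

```python
import os


def unwritable_entries(directory):
    """Return the directory and any files directly in it that are not writable.

    Hypothetical helper. os.access() checks against the real user ID of the
    calling process, so run this as the account that runs the scheduler.
    """
    problems = []
    # Creating a new log file at rollover needs write and search (execute)
    # permission on the directory itself.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append(directory)
    try:
        names = sorted(os.listdir(directory))
    except PermissionError:
        # The directory cannot even be listed; it is already reported above.
        return problems
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not os.access(path, os.W_OK):
            problems.append(path)
    return problems
```

Calling unwritable_entries on the directory that $AUTOUSER/out resolves to should return an empty list once ownership and permissions are correct. A non-empty result names the directory or log files that still need to be fixed.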