How can we be notified if AutoSys stops processing events for any reason?
Release : 11.3.6
Component : CA Workload Automation AE (AutoSys)
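A common approach, used by many customers, is to set up a 'canary' job: a trivial job that runs on a short schedule, so that a run counter that stops advancing indicates that events are no longer being processed.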
1) Create a new job on the server machine.
Call this job canary.
2) Set this canary job to run every 2 minutes.
This job should only run the command "sleep 2".
3) Create a separate script outside of AutoSys on the server machine.
4) Set this script to run indefinitely.
In this script, query the AutoSys database every 5 minutes to check the run_num for the canary job in the ujo_job_status table (the example below uses the ujo_jobst view).
SQL Example
select run_num from aedbadmin.ujo_jobst where job_name='canary';
5) The script should check that run_num has increased since the previous query.
If it has not, either the scheduler is not processing events or the canary job is not running.
The script should then send an email or SNMP trap, or otherwise notify someone, that the scheduler might be down or hung and needs to be investigated.
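
The polling logic in steps 4 and 5 could be sketched roughly as follows in Python. This is only an illustration, not a supported script: `fetch_run_num` and `notify` are hypothetical callables that you would supply yourself (the first runs the SQL above with whatever database driver matches your AutoSys database and returns the run_num or None, the second sends the email or SNMP trap). The names `is_advancing` and `watch` are also made up for this example.

```python
import time
from typing import Callable, Optional

# Query from the SQL example above; fetch_run_num is expected to run it.
CANARY_QUERY = "select run_num from aedbadmin.ujo_jobst where job_name='canary'"
POLL_SECONDS = 300  # check every 5 minutes, as in step 4


def is_advancing(previous: Optional[int], current: Optional[int]) -> bool:
    """Return True if run_num increased since the previous poll."""
    if current is None:
        return False  # no row for the canary job, or the query failed
    if previous is None:
        return True   # first poll: nothing to compare against yet
    return current > previous


def watch(fetch_run_num: Callable[[], Optional[int]],
          notify: Callable[[str], None],
          polls: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll run_num and call notify() whenever it fails to advance.

    fetch_run_num and notify are assumptions (site-specific helpers).
    polls=None loops indefinitely; a number limits the loop for testing.
    """
    previous: Optional[int] = None
    count = 0
    while polls is None or count < polls:
        try:
            current = fetch_run_num()
        except Exception:
            current = None  # treat a failed query as "no progress"
        if not is_advancing(previous, current):
            notify(
                "canary run_num did not advance "
                f"(previous={previous}, current={current}); "
                "the scheduler may be down or hung"
            )
        if current is not None:
            previous = current
        count += 1
        if polls is None or count < polls:
            sleep(POLL_SECONDS)
```

To run it indefinitely, call `watch(fetch_run_num, notify)` with your own two functions. Passing the query and notification steps in as arguments keeps the comparison logic independent of the database type and the alerting method used at your site.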