Issue:
Is there a way to query Workload Automation AE to get a listing of all jobs that have failed?
Environment:
Workload Automation AE 11.3.6
Database: Oracle
Platform: any
Resolution:
Workload Automation AE keeps track of jobs and their runs in the database. By default, 7 days of historical runs and events are kept. To see the run of a specific job, issue a command such as: autorep -J jobname -d. This shows the details of the current run. To look at a specific run number, add the -r <#> option. Example: autorep -J jobname -d -r 1234.
If the goal is to generate a complete list of all jobs that have failed, the monbro command can be used.
First, create the monbro definitions in JIL. Here are two examples:
insert_monbro: track_failure
mode: b
failure: y
currun: n
insert_monbro: failure_monitor
mode: m
failure: y
Then run them:
monbro -N failure_monitor
monbro -N track_failure
In the example above, failure_monitor (mode m, a monitor) watches for new incoming failure events. track_failure (mode b, a browser) queries the database and lists all past failure events.
Sample output
The same type of information can be extracted directly from the database, but the times are returned as UNIX epoch values.
Example:
select j.job_name, r.startime, r.endtime, r.run_num, r.ntry, r.status, r.exit_code
from aedbadmin.ujo_job_runs r, aedbadmin.ujo_jobst j
where j.joid=r.joid
and r.status in (5)  -- 5 = FAILURE
order by j.job_name,r.endtime;
NOTE: Use "time0 -a <#>" to convert the times into a more human-friendly display.
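If time0 is not available where the query results are being reviewed, the epoch values can also be converted with a short script. The following is a minimal sketch in Python, assuming the startime and endtime columns hold seconds since 1970-01-01 UTC. The function name epoch_to_utc and the sample value are hypothetical and used only for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Format a UNIX epoch value (in seconds) as a UTC timestamp string."""
    # Assumption: the value is in seconds, not milliseconds.
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    # Hypothetical endtime value copied from a query result row.
    print(epoch_to_utc(1500000000))
```

The sketch prints the time in UTC. Adjust the time zone if local time is preferred.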
Additional Information:
See the Workload Automation AE reference guide or wiki site for more details on monbro.