We generated an AutoSys Job Run Report and an AutoSys Event Report for failed jobs, using the same timeframe for both reports.
The two reports show different failure counts. Why do the job failure counts differ?
iDash Workload Automation
The following example explains why iDash behaves this way:
1. Create a box with a job inside it.
The job has a bad command, so it will fail to run.
/* ----------------- test_box ----------------- */
insert_job: test_box job_type: BOX
owner: root@machine_name
permission:
date_conditions: 1
days_of_week: all
start_times: "10:17"
alarm_if_fail: 1
/* ----------------- test_job ----------------- */
insert_job: test_job job_type: CMD
box_name: test_box
command: bad 60
machine: machine_name
owner: root@machine_name
permission:
date_conditions: 0
alarm_if_fail: 1
2. At 10:17, both the box and the job run and end in FAILURE status.
a) AutoSys autorep command should show both the box and job in FAILURE status
autorep -j test_box
Job Name                                                         Last Start           Last End             ST/Ex Run/Ntry Pri/Xit
________________________________________________________________ ____________________ ____________________ _____ ________ _______
test_box                                                         11/02/2020 10:17:00  11/02/2020 10:17:02  FA    414/1    1
test_job                                                         11/02/2020 10:17:01  11/02/2020 10:17:01  FA    414/1    127
b) iDash AutoSys Run Report should show the box with FAILURE status
c) AutoSys's ujo_job_runs table should contain a row for this box with status code 5 (for FAILURE)
3. Now correct the job's command using AutoSys JIL:
update_job: test_job
command: sleep 60
4. Run the job again
sendevent -E FORCE_STARTJOB -j test_job
Both the box and job should now have SUCCESS status.
a) AutoSys autorep command should show both the box and job in SUCCESS status
autorep -j test_box
Job Name                                                         Last Start           Last End             ST/Ex Run/Ntry Pri/Xit
________________________________________________________________ ____________________ ____________________ _____ ________ _______
test_box                                                         11/02/2020 10:17:00  11/02/2020 10:27:15  SU    414/1    0
test_job                                                         11/02/2020 10:26:15  11/02/2020 10:27:15  SU    414/2    0
b) Note that for the box, only the "Last End", the status, and the exit code have changed.
The "Last Start" stays the same, indicating that it is the same run.
In AutoSys's ujo_job_runs table, there is still only one row for this box, which now has a status code of 4 (for SUCCESS).
In iDash, we follow the AutoSys behavior by updating the box status from FAILURE to SUCCESS.
This is why, when you run the AutoSys Job Run Report, you see only the SUCCESS status of the box and not the FAILURE status.
Note: For events, AutoSys does not replace old events, so you see both the event for the box changing to FAILURE and the event for it changing to SUCCESS.
The same is reflected in the iDash Event Report, so the Event Report counts a failure that the Job Run Report no longer shows.
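The difference between the two reports can be illustrated with a small simulation. This is only a sketch of the behavior described above, not AutoSys or iDash code: the class `SchedulerModel` and its methods are hypothetical names, and the real ujo_job_runs table and event storage hold more fields than are modeled here. It assumes the run record is keyed by job name and run number and overwritten on each status change, while events are only ever appended.

```python
from dataclasses import dataclass, field

FAILURE = "FAILURE"
SUCCESS = "SUCCESS"


@dataclass
class SchedulerModel:
    """Hypothetical model of the two data sources behind the reports."""

    # One record per (job name, run number); overwritten on a status change.
    job_runs: dict = field(default_factory=dict)
    # Append-only history; an old event is never replaced.
    events: list = field(default_factory=list)

    def change_status(self, job: str, run_num: int, status: str) -> None:
        self.job_runs[(job, run_num)] = status      # update in place
        self.events.append((job, run_num, status))  # always add a new event

    def run_report_failures(self) -> int:
        """Failures as a run-based report would count them."""
        return sum(1 for s in self.job_runs.values() if s == FAILURE)

    def event_report_failures(self) -> int:
        """Failures as an event-based report would count them."""
        return sum(1 for _, _, s in self.events if s == FAILURE)


model = SchedulerModel()
# 10:17 - box run 414 fails because test_job has a bad command.
model.change_status("test_box", 414, FAILURE)
# 10:27 - test_job is force-started and succeeds; the same box run ends in SUCCESS.
model.change_status("test_box", 414, SUCCESS)

print("Run report failures:  ", model.run_report_failures())
print("Event report failures:", model.event_report_failures())
```

In this sketch the run-based count is 0 and the event-based count is 1 for the same box run, which mirrors the mismatch between the Job Run Report and the Event Report.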