Issue with job that posted QUEUEDJOB_STARTFAIL status

Article ID: 209378

Products

CA Workload Automation AE - Scheduler (AutoSys)

Issue/Introduction

A job went from QUE_WAIT to INACTIVE after raising a QUEUEDJOB_STARTFAIL alarm.

Why does this happen, and how can it be avoided?

Example:

job123                                -----                -----                IN    0/0

  Status/[Event]  Time                 Ntry ES  ProcessTime           Machine
  --------------  --------------------- --  --  --------------------- ----------------------------------------
  [STATE_CHANGE]  02/20/2021 10:00:07    0  PD  02/20/2021 10:00:07
    <Job <job123> cannot start due to higher priority queued QUE_WAIT job(s). Job placed in QUE_WAIT status.>
  [*** ALARM ***]
    QUEUEDJOB_STARTFAIL  02/20/2021 10:00:11    0  PD  02/20/2021 10:00:11
    <Job <job123> has been placed in an INACTIVE state and may require manual intervention.>
  [*** ALARM ***]
    QUEUEDJOB_STARTFAIL  02/20/2021 10:00:11    0  PD  02/20/2021 10:00:11
    <Job <job123> has been placed in an INACTIVE state and may require manual intervention.>

$ jr job123 -q


/* ----------------- job123 ----------------- */

insert_job: job123   job_type: CMD
command: echo 123
machine: machine123
condition: e(job234) = 0
std_out_file: "/tmp/job123.out"
std_err_file: "/tmp/job123.err"
job_load: 10
priority: 50
auto_delete: 0

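The predecessor job in the starting condition (job234) completed and was deleted before job123 could start. The scheduler log shows job234 reaching SUCCESS at 10:00:03 and being deleted at 10:00:08, after job123 entered QUE_WAIT (10:00:07) and before the QUEUEDJOB_STARTFAIL alarm (10:00:11):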

[02/20/2021 10:00:03] CAUAJM_I_40245 EVENT: CHANGE_STATUS STATUS: SUCCESS JOB: job234 MACHINE: machine123 EXITCODE: 0

[02/20/2021 10:00:08] CAUAJM_I_40245 EVENT: DELETEJOB JOB: job234 MACHINE: machine123

 

Environment

Release : 12

Resolution

There are two ways to avoid this issue:

1 - Do not delete the predecessor job until the downstream job that depends on it has completed.
 Consider setting "auto_delete: 24" on the predecessor job, which allows 24 hours to pass before the job is deleted.
 Or specify another value that is long enough to ensure the downstream job runs first.
 For more details, see:
 https://techdocs.broadcom.com/us/en/ca-enterprise-software/intelligent-automation/autosys-workload-automation/12-0-01/reference/ae-job-information-language/jil-job-definitions/auto-delete-attribute-automatically-delete-a-job-on-completion.html

2 - Consider setting EvaluateQueuedJobStarts=0.
 This is a scheduler setting in $AUTOUSER/config.$AUTOSERV.
 It determines whether the scheduler re-evaluates a job's starting conditions when bringing the job out of QUE_WAIT.
 This setting can have a big impact on how jobs run, so read the documentation and test fully before changing it.
 For more details, see:
 https://techdocs.broadcom.com/us/en/ca-enterprise-software/intelligent-automation/autosys-workload-automation/12-0-01/administrating/ae-administration/configure-a-scheduler/evaluatequeuedjobstarts.html
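
As a minimal sketch of option 1, the predecessor job could be updated with JIL along the following lines. The article does not show the definition of job234, so this assumes job234 is currently set to auto-delete soon after it completes; the 24-hour value is only the example figure mentioned above and should be sized to your own schedule.

```jil
/* Sketch only: assumes job234 currently auto-deletes right after completion. */
/* Keep the predecessor for 24 hours so job123 can still evaluate e(job234).  */
update_job: job234
auto_delete: 24
```

Saved to a file (for example, a hypothetical job234_update.jil), this could be loaded with "jil < job234_update.jil". Running "autorep -J job234 -q" afterwards should show the new auto_delete value.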