The customer has more than 1,000 production jobs stuck in RESWAIT status.
The resources these jobs are waiting on are completely free, yet the jobs do not proceed to processing.
Adjusting the priority of a job and then force starting it allowed that job to process normally.
Component : CA Workload Automation AE (AutoSys)
The customer changed some job definitions to reference a resource that had previously been decommissioned. The change went unnoticed, and the number of jobs waiting on other resources grew as a result.
The customer's application team found a handful of jobs that were requesting a decommissioned resource (that is, the resource name still exists, but its AVAILABLE units were set to 0).
That change was backed out, so the jobs that had the incorrect machine name now have the correct machine name (one with available units). This cleared the backlog: a large number of jobs moved from RESWAIT to RUNNING status and completed within a few minutes.
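
The check that uncovered the root cause (jobs requesting a resource whose available units are 0) can be sketched as a small offline audit. This is only an illustration, not a product feature: the resource names, job names, and the two dictionaries below are hypothetical and would need to be filled in by hand or by a site-specific export of your resource and job definitions. The function name find_unsatisfiable_jobs is also an assumption made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory: resource name -> units defined as available.
# A decommissioned resource still has a name but 0 units.
resource_units: Dict[str, int] = {
    "db_conn_pool": 10,
    "old_app_license": 0,  # decommissioned
}

# Hypothetical job definitions: job name -> [(resource, units requested)].
job_resources: Dict[str, List[Tuple[str, int]]] = {
    "load_orders": [("db_conn_pool", 1)],
    "legacy_export": [("old_app_license", 1)],
    "nightly_merge": [("db_conn_pool", 2), ("old_app_license", 1)],
}


def find_unsatisfiable_jobs(
    jobs: Dict[str, List[Tuple[str, int]]],
    units: Dict[str, int],
) -> Dict[str, List[str]]:
    """Return {job: [reasons]} for jobs whose request can never be met."""
    problems: Dict[str, List[str]] = {}
    for job, needs in jobs.items():
        for res, qty in needs:
            defined = units.get(res)
            if defined is None:
                # The job references a resource that is not in the inventory.
                problems.setdefault(job, []).append(f"{res}: not defined")
            elif qty > defined:
                # The job asks for more units than the resource ever has.
                problems.setdefault(job, []).append(
                    f"{res}: needs {qty}, only {defined} defined"
                )
    return problems


if __name__ == "__main__":
    found = find_unsatisfiable_jobs(job_resources, resource_units)
    for job, reasons in sorted(found.items()):
        print(job, "->", "; ".join(reasons))
```

Any job flagged by such an audit is a candidate for the kind of definition change that was backed out here. Whether a flagged job is actually what holds other jobs in RESWAIT still has to be confirmed in the scheduler itself.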