Recently we have had a rash of JES3 DM146 (“subtask abend”) SVC dumps related to Input Service processing. Some batch ACID gets suspended, and when JES3 attempts to process a job submitted under it, the resulting subtask abend percolates up to the point where JES3 itself abends and takes an SVC dump. I don’t know whether our TSS Admins are cracking down on some of these ACIDs and causing them to be suspended more often, but this is happening far more often than it ever did. It is also becoming quite annoying: every time it happens, we get an OPS alert and a nuisance SVC dump (unless DAE suppresses it).
Here’s an example of the basic sequence of events:
IAT9127 JOB (JOB20667) IS xxxxxxxx FROM yyyyyyyy (zzzzzzz )
IEF196I TSS7141E Use of Accessor ID Suspended
IEF196I TSS7053I Default ACID <uuuuuuuu> Assigned
IEF196I TSS7251E Access Denied to JESSPOOL
IEF196I <xxxxxx.yyyyyyyy.JOB61536.D0000002.JESMSGLG>
IEA045I AN SVC DUMP HAS STARTED AT TIME=23.45.01 DATE=05/29/2018 944
FOR ASID (004A)
ERROR ID = SEQ00182 CPU00 ASID004A TIME23.45.01.4
QUIESCE = NO
IEA794I SVC DUMP HAS CAPTURED: 945
DUMPID=001 REQUESTED BY JOB (JES3 )
DUMP TITLE=COMPON=JES3 GENERALIZED SUBTASK CTL,COMPID=SC1BA,ISSUER=IATGSC1-GSC1900
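To gauge how often this sequence actually occurs, a saved syslog extract could be scanned for a TSS7141E message followed shortly by IEA045I. The following is only a minimal sketch in Python: the function name `find_suspension_dumps` and the `window` parameter are hypothetical, the 20-line window is an arbitrary assumption, and the script assumes one console message per line. The message IDs are the ones shown in the excerpt above.

```python
import re

# Message IDs taken from the console excerpt above.
SUSPENDED = "TSS7141E"     # Use of Accessor ID Suspended
DUMP_STARTED = "IEA045I"   # AN SVC DUMP HAS STARTED
JOB_RE = re.compile(r"IAT9127 JOB \((JOB\d+)\)")


def find_suspension_dumps(lines, window=20):
    """Return (job_id, dump_line) pairs where an SVC dump message follows
    a suspended-ACID message within `window` lines.

    `window` is an arbitrary assumption, not a JES3 or TSS value.
    """
    hits = []
    job_id = None    # most recent job seen on an IAT9127 message
    pending = None   # lines left in which a dump counts as related
    for line in lines:
        text = line.strip()
        match = JOB_RE.search(text)
        if match:
            job_id = match.group(1)
        if SUSPENDED in text:
            pending = window
            continue
        if pending is not None:
            if DUMP_STARTED in text:
                hits.append((job_id, text))
                pending = None
            else:
                pending -= 1
                if pending <= 0:
                    pending = None
    return hits


sample = [
    "IAT9127 JOB (JOB20667) IS xxxxxxxx FROM yyyyyyyy (zzzzzzz )",
    "IEF196I TSS7141E Use of Accessor ID Suspended",
    "IEF196I TSS7053I Default ACID <uuuuuuuu> Assigned",
    "IEF196I TSS7251E Access Denied to JESSPOOL",
    "IEA045I AN SVC DUMP HAS STARTED AT TIME=23.45.01 DATE=05/29/2018 944",
]

for job, dump in find_suspension_dumps(sample):
    print(job, "->", dump)
```

Counting the hits per day before and after a maintenance change would show whether the frequency really shifted. Proximity in the log is only a heuristic here; it does not prove that a given dump was caused by the suspension.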
I previously opened a PMR with IBM JES3 support and essentially, they are telling me that this is working as designed. That may be… but *something* in our environment must have changed because we are getting a handful of these failures a week… sometimes a handful in a day… and I don’t recall this ever being an issue in the past.
Here’s the high-level summary of what IBM JES3 support had to offer:
“…No good answer here. I am confident that the JES3 code is behaving as we intend; relative to what we do when we get the RC8 from TSS on the IATXSEC call, surfacing the DM146, recycling the INTRDR. I don’t feel strongly enough to truly push TSS about it - in the sense that the RC8 is a valid RC and probably warranted in their eyes if the userID/ACID was suspended.
So I do suppose the conclusion I reach is that it is an odd timing scenario, relative to when the suspension occurs versus the submission of the job. You could float something out to TSS support to see if they have any suggestions how to avoid the situation, but I am not sure that TSS has a "problem".”
So… while I’m not sure that I can say that JES3 or CA-TSS has a code defect here, it seems like there should be a better way to “fail” these jobs… perhaps earlier in the submission process…? I know that this failure is happening rather early on in the job submission process, but is there possibly an even earlier point where this condition could be “reported back” to JES3 or failed even further “upstream” prior to JES3 Input Service? Could we change the JESSPOOL rules in some way that would make this “failure” a little less dramatic?
Environment
Release:
Component: TSSMVS
Resolution
The probable cause for seeing this more often recently is the installation of Top Secret solution SO01968, 'SUSPENDED ACID USED IN A JOB DOES NOT FAIL'. Prior to that solution, we were NOT failing a batch job that attempted to run under a suspended ACID. That behavior was incorrect, and we fixed it with SO01968. So you are now seeing more of what is a correct scenario: a batch job attempts to run under a suspended ACID, the attempt is flagged, and the job is forced to run under the default batch facility ACID uuuuuuuu. Since this ACID does not have access to protected JESSPOOL resources, we correctly flag the security violation, and that causes JES3 to handle an unexpected condition.
There is not much we can do about this. As a workaround, you could try giving uuuuuuuu access to the required JESSPOOL resources, using a masked ACID name to cover all users that could be suspended, and granting READ access only so that no changes to the job spool can be made. From the IBM response, it sounds like the job just needs to get past the submission portion; it will then fail on resource access in a way that would not affect the operation of JES3.