Restart run with Smart/RESTART and Xpediter fails to find checkpoint with message SRSI117W No DDNAME of CKPT


Article ID: 219947



Smart Restart


A restart run under Xpediter fails to find the checkpoint dataset and is therefore treated as a new run,
even when a restart is requested and expected.

The SRSPRINT output shows:
SRSI117W - No DDNAME of CKPT was found - will assume an initial run   
SRSI226I - New checkpoint dataset:                                   
SRSI226I - Data Set Name xxxxxxx.xxxxxxxx.xxxxxx.xxxxxxxx.xxxxxxxx 
SRSI226I - has been successfully defined and allocated to DDname CKPT 

It appears the issue is with the Xpediter preprocessor which changes the step name.
This causes the default checkpoint dataset not to be found.

After manually updating the step name, the checkpoint dataset is found, but the restart still fails with a
message that the job id does not match.
Coding a specific JOBID() together with VERIFY_JOBID(ON) in both the original and the restart job works around this.

The debug display of runtime options shows AJID.P in use, which we assume means jobname, step name, and program name.

How can we make Xpediter runs eligible for restart automatically and not require explicit parameter changes?


Release : 12.1.2
Component : Smart/Restart


The JOBID is critical to allowing a restart run to locate the status record created
by the previous failing run.  That record holds, among other things, the name of the checkpoint dataset,
which is critical and may have been dynamically allocated by the original run.

The JOBID can be derived in several ways, giving you the flexibility to generate an id that is
guaranteed unique across all restart-enabled jobs whose windows between original job start and
final successful completion could overlap.  Note that if restarts are attempted manually,
these windows can be long, possibly days.
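
The role of the JOBID as a lookup key can be illustrated with a small Python model. This is only a sketch of the
idea described above, not how Smart/RESTART is implemented: the dictionary standing in for the status records and
the function names start_run and complete_run are hypothetical.

```python
# Hypothetical stand-in for the status records that restart runs search.
# The key is the JOBID; the value remembers the checkpoint dataset name.
status_records = {}


def start_run(jobid, new_ckpt_dsn):
    """Return ('restart', dsn) if a record exists for jobid, else ('initial', dsn)."""
    record = status_records.get(jobid)
    if record is None:
        # No record under this JOBID: treated as an initial run,
        # and a new checkpoint dataset name is remembered.
        status_records[jobid] = {"ckpt": new_ckpt_dsn}
        return ("initial", new_ckpt_dsn)
    # A record exists: the run is a restart and reuses the recorded dataset.
    return ("restart", record["ckpt"])


def complete_run(jobid):
    """Successful completion frees the JOBID for reuse (assumption of this model)."""
    status_records.pop(jobid, None)
```

In this model, a second run that presents the same JOBID before the first has completed is treated as a restart
of it, which is why the id must be unique for the whole window between original start and final completion, and
why a restart that presents a different JOBID looks like a new job.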

There are two parameters you can provide: AUTO_JOBID() and JOBID().
AUTO_JOBID (alias AUTO_JID) is described in the Smart/RESTART Reference, chapter 14.
JOBID is described in the Smart/RESTART Reference, chapter 12.

AUTO_JOBID() can be set to:
TIMESTAMP:  (AJID=T) timestamp
JOBNAME:  (AJID=J) jobname
SYSTEMANDJOBNAME:   (AJID=S) system name + jobname
JOB_STEP_AND_PGMNAME:  (AJID=P)  jobname + step name + main step program name
JOB_STEP_AND_TASKPGM:  (AJID=K) jobname + step name + application TCB initial program name

In this example you did not code AUTO_JOBID(), and the Profile print shows that your default is AJID=P.

JOBID, if coded, allows you to make up your own jobid with up to 32 characters.  If you are running a test job
just pick something unique to this test run and you don't have to worry about colliding with any other work.
Just be sure to use the same setting on any restart job that its original job used.

Since AJID=P, the jobid for this example is jobname+stepname+programname.
The first run, without Xpediter, produces:   'SXP0007A' + 'STEP10  ' + 'AE621   '
while the restart run with Xpediter has:     'SXP0007A' + 'STEP10  ' + 'XPTSO   '
They don't match, so the restart run thinks it is a new job.
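
The mismatch can be reproduced with a short Python sketch. It only models the concatenation shown in the example
(three names, each blank-padded to 8 characters); the function derive_jobid and its single-letter setting argument
are hypothetical, and the product's real internal format may differ.

```python
def pad8(name):
    """Left-justify and blank-pad a name to 8 characters, as in the example."""
    return name.ljust(8)[:8]


def derive_jobid(setting, jobname, stepname, pgmname):
    """Model two AUTO_JOBID settings: 'J' (jobname) and 'P' (job + step + program)."""
    if setting == "J":
        return pad8(jobname)
    if setting == "P":
        return pad8(jobname) + pad8(stepname) + pad8(pgmname)
    raise ValueError("setting not modeled in this sketch: " + setting)


# Original run: the step executes the application program directly.
original = derive_jobid("P", "SXP0007A", "STEP10", "AE621")
# Restart run under Xpediter: the main step program is now XPTSO.
restart = derive_jobid("P", "SXP0007A", "STEP10", "XPTSO")

print(original == restart)  # False
# With a jobname-only id, the program name no longer matters.
print(derive_jobid("J", "SXP0007A", "STEP10", "AE621")
      == derive_jobid("J", "SXP0007A", "STEP10", "XPTSO"))  # True
```

Any component that differs between the two runs (here the main step program name) changes the derived id, while
a setting that leaves that component out keeps the id stable.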

By manually coding a JOBID you ensured the initial and restart runs matched, as long as
someone else did not run a job with the same jobid value before you finished.

Note also that if you code a checkpoint dataset DD in the job JCL, this largely becomes a non-issue,
because any restart job is guaranteed to be given the same checkpoint dataset.
This is essentially what the AJID=T (timestamp) setting is for: it generates a guaranteed unique key for the
global restart jobs table, and the restart job locates the checkpoint dataset through its JCL.

How to make this simpler depends on your process.
Are the jobnames unique, meaning you would never run another job with the same name until after the
current one has completed successfully, one way or another?
If so, you can just set AUTO_JOBID(JOBNAME) and you should be good, with or without Xpediter.

If not, how about jobname+stepname?

If not, consider setting JOBID() to something unique for this particular job, like jobname + application name.
You would not have to change all your jobs.  But any time you needed to test with Xpediter you could
change that job to its unique id and then leave it there afterwards.  NOTE: this will not work if you submit
multiple copies of the same job in parallel unless you arrange for each to have a unique JOBID.

If not, you need to set AUTO_JOBID to generate something unique regardless of Xpediter, so if
JOB_STEP_AND_PGMNAME does not work, how about JOB_STEP_AND_TASKPGM?
The latter might work, but it depends on how Xpediter works and would take some experimentation.

If all else fails there is an option to create a user exit that will decide what the JOBID should be. 
Just write a little program to make up an appropriate jobid token, place the program in the steplib, and add its name to
the RAINPUT (or system defaults) using JOBID_EXIT(name). 
Appendix A of the Smart Jobstream Series Installation Guide provides what you need to get started.  There is also
a sample exit in DCASAMP called SRSXJBID.