What happens when the primary Scheduler is down and misses Autoscan

Products

Scheduler Job Management

Issue/Introduction

Questions:

Primary Scheduler CPUA runs its daily 'Schedule Scan' at 08:00 AM (CHINIT parm: AUTOTIM=0800). Suppose the CPUA Scheduler is down at 08:00 AM (for example, a CPUA outage for an IPL) and comes up later (for example, at 09:00 AM or after).

1a. Does the CPUA Scheduler see that AUTOSCAN has not run for the day, and then run AUTOSCAN at whatever time it comes up?

1b. Do jobs that were supposed to be submitted between 08:00 and 09:00 then get submitted, just running late, along with the other schedules?

1c. If the primary Scheduler CPUA misses the AUTOSCAN time, does it catch up whenever Scheduler comes up?

2. Consider jobs that are running on other remote LPARs when CPUA comes down. Are the results for those jobs queued in each remote LPAR's ENF? When CPUA ENF and the CPUA Scheduler come back up, is the connection between the remote and primary LPARs' ENFs re-established, with the remote job results then sent from the remote ENF to the primary ENF, to the primary Scheduler, and to the primary ADXXSTRT, where they are posted to the CPUA Scheduler? (The CPUA Scheduler uses the external Datacom ADXXSTRT for its DB. Each LPAR's ENF uses an internal Datacom DB.)

3. Primary Scheduler CPUA and remote Scheduler CPUE share a JES spool. Is CPUE still considered a remote LPAR, so that results for jobs that run there go into CPUE ENF and then to CPUA ENF? If CPUA is down, are the CPUE job results queued in CPUE ENF until CPUA ENF comes back up?

Resolution

Answers:

1a. When CPUA Scheduler comes up, it will see that the 8 AM AUTOSCAN has not been done, and will run AUTOSCAN at that time.

1b. Jobs and schedules that should have been submitted when AUTOSCAN completed, but were not because the CPUA Scheduler was down, are submitted when the late AUTOSCAN completes.

1c. Yes. The primary Scheduler (CPUA) catches up with job submission and posting if the AUTOSCAN time is missed or if Scheduler is down.
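The catch-up behavior described in answers 1a to 1c can be pictured with a small sketch. This is only a simplified model for illustration, not the product's internal logic: the function names `most_recent_scan_point` and `autoscan_due` and the idea of a stored "last AUTOSCAN" timestamp are assumptions made for the example. Only the 08:00 scan time comes from the AUTOTIM=0800 setting quoted above.

```python
from datetime import datetime, time, timedelta
from typing import Optional

# Scan time taken from the article's AUTOTIM=0800 example.
AUTOTIM = time(8, 0)


def most_recent_scan_point(now: datetime) -> datetime:
    """Return the latest scheduled AUTOSCAN instant that is not after `now`."""
    today_scan = datetime.combine(now.date(), AUTOTIM)
    if now >= today_scan:
        return today_scan
    return today_scan - timedelta(days=1)


def autoscan_due(now: datetime, last_autoscan: Optional[datetime]) -> bool:
    """Hypothetical startup check: has a scheduled scan point been missed?

    `last_autoscan` is an assumed record of when AUTOSCAN last ran.
    """
    if last_autoscan is None:
        return True  # no scan on record, so one is due
    return last_autoscan < most_recent_scan_point(now)
```

In this model, a Scheduler that last scanned at 08:00 yesterday and starts at 09:00 today finds a scan due and runs it immediately. One that starts at 07:00 today does not, because the next scan point has not yet been reached.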

2. A job is submitted from NODE A, LPAR A, and is sent through NJE to NODE B, LPAR B. LPAR A is down. The job runs successfully on LPAR B. ENF on B gives the events to Scheduler on B, which stores them in Inter-Node records (INR, a Scheduler table) until the connection with LPAR A is re-established. When Scheduler on A comes back up, the status of the job in Scheduler on the submitting node is updated, and the INR records are then deleted.

3. Essentially the same thing happens with LPARs A and E on the same node (shared JES). The job is submitted, then Scheduler on A goes down while the job runs on E. Scheduler on E gets the events from ENF on E in real time, then stores them in Inter-CPU records until Scheduler on A comes back up. The status of the job in Scheduler (on the submitting node) is updated when Scheduler on A comes back up. If the Shadow MUF is implemented, Scheduler on E can take over automatically and become the primary Scheduler when Scheduler on A goes away. With the Shadow MUF in use, the job status updates immediately.
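Answers 2 and 3 both describe a store-and-forward pattern: the remote Scheduler holds job events until the primary is reachable, then forwards them and deletes the held records. A minimal sketch of that pattern follows. The class and method names (`JobEvent`, `PrimaryScheduler`, `RemoteScheduler`, `on_enf_event`, `flush`) are hypothetical and do not correspond to any product API, and the in-memory queue merely stands in for the Inter-Node or Inter-CPU records. Shadow MUF takeover is not modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class JobEvent:
    job: str
    status: str


@dataclass
class PrimaryScheduler:
    """Stand-in for Scheduler on the submitting system (A)."""
    up: bool = True
    job_status: dict = field(default_factory=dict)

    def post(self, event: JobEvent) -> None:
        # Record the latest known status for the job.
        self.job_status[event.job] = event.status


@dataclass
class RemoteScheduler:
    """Stand-in for Scheduler on the system where the job runs (B or E)."""
    primary: PrimaryScheduler
    held: deque = field(default_factory=deque)  # models the held records

    def on_enf_event(self, event: JobEvent) -> None:
        # Always store the event first, then try to forward it.
        self.held.append(event)
        self.flush()

    def flush(self) -> None:
        # Forward in arrival order; delete a record only after it is posted.
        while self.primary.up and self.held:
            self.primary.post(self.held[0])
            self.held.popleft()
```

With `primary.up` set to `False`, calls to `on_enf_event` accumulate events in `held`. Setting `primary.up` back to `True` and calling `flush()` posts them in order and leaves `held` empty, mirroring the deletion of the records after the status update.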

Additional Information

Scheduler Bookshelf

- Scheduler Systems Programmer Guide:

  - SHUT DOWN CAIENF on page 75

  - UNKNOWN STATUS on page 76