Challenges when ESP is down for many hours


Article ID: 16547


Products

ESP Workload Automation

Issue/Introduction

After a disaster, ESP will not be started on the DR site until 24 to 48 hours later.

What pitfalls should be expected when ESP attempts to schedule up to 48 hours of missed processing?

Environment

Component: ESP Workload Automation
Release: ALL

Resolution

Here are the possible challenges:
1. Events that were scheduled or triggered for a future time, from the time of the disaster until the next scan time, will all be executed at once after ESP is up;

2. Correct processing also depends on application logic (a job that should run on Monday should have RUN MONDAY coded and not RUN ANY and be dependent on the event being scheduled on Monday);

3. Workload that was active (submitted or executing) at the time of the disaster will be marked as SYSTEM ERROR;

4. A CPU spike is expected when ESP is brought up, and the event initiators will be in use until the backlog of missed events has cleared, which might delay workload scheduled soon after ESP is up;

5. Operational data sets might run out of space or show high utilization: CKPT, APPLFILE/TRAKFILE, COMMQ, HISTFILE, QUEUE (JOBINDEX and JOBSTATS might be affected as well if new job names were introduced);

6. TCELL and DSTCELL buffers might overflow, which can lead to loss of tracking or data set trigger data. This depends on the amount of workload being tracked at the same time and on the TCELL and DSTRIG parameter settings in ESPPARM;

7. Variables in active and new applications that are not based on ESPS variables can be set with wrong values; 

8. Variables and resources can be set with wrong values if the related commands are processed out of sequence; 

9. If a file (such as the checkpoint or queue file) fills up, ESP may abend and be unable to stay up, or the affected files may need to be reformatted first.
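The difference between RUN MONDAY and RUN ANY in item 2 can be illustrated with a small Python model. This is only a sketch, not ESP code: the job names, the dates, and the `selected_jobs` helper are hypothetical, and it assumes that run criteria are checked against the scheduled date of each replayed event rather than the date ESP actually comes back up.

```python
from datetime import date, timedelta

WEEKDAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
            "FRIDAY", "SATURDAY", "SUNDAY"]

def selected_jobs(scheduled_date, jobs):
    """Return the jobs whose run criterion matches the event's scheduled date."""
    weekday = WEEKDAYS[scheduled_date.weekday()]
    return [name for name, run in jobs.items() if run in ("ANY", weekday)]

# Hypothetical application: job name -> run criterion
jobs = {"WEEKLY_RPT": "MONDAY", "LOOSE_JOB": "ANY"}

# Two missed daily events (scheduled Sunday and Monday) that are all
# executed together once ESP is restarted.
disaster = date(2024, 1, 7)  # a Sunday
missed = [disaster + timedelta(days=i) for i in range(2)]

replay = {d.isoformat(): selected_jobs(d, jobs) for d in missed}
for day, names in replay.items():
    print(day, names)
```

In this model the job coded with ANY is selected for every replayed event, while the job coded with MONDAY is selected only for the event whose scheduled date is a Monday.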

Note: To avoid most of the problems above, consider using TIMEREF on TIMEZONE with LOCAL in ESPPARM, which starts ESP with a past date/time. Then issue the TIMEREF command from page mode to move ESP forward.
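The effect of starting with a past date/time and then moving forward can be pictured with a conceptual Python model. It does not use or reproduce any ESP command syntax. The hourly backlog, the 6-hour step, and the `release_batches` helper are assumptions for illustration, including the assumption that the clock is moved forward in several steps rather than in one jump.

```python
from datetime import datetime, timedelta

def release_batches(event_times, start, end, step):
    """Advance a reference clock from start to end in steps and
    return the events that become due during each step."""
    batches = []
    clock = start
    while clock < end:
        nxt = min(clock + step, end)
        batches.append(sorted(t for t in event_times if clock < t <= nxt))
        clock = nxt
    return batches

outage_start = datetime(2024, 1, 7, 0, 0)
outage_end = outage_start + timedelta(hours=48)

# Hypothetical backlog: one missed event per hour of the outage
events = [outage_start + timedelta(hours=h) for h in range(1, 49)]

# One jump to the current time versus moving forward 6 hours at a time
all_at_once = release_batches(events, outage_start, outage_end, timedelta(hours=48))
staged = release_batches(events, outage_start, outage_end, timedelta(hours=6))

peak_all = max(len(b) for b in all_at_once)
peak_staged = max(len(b) for b in staged)
print(peak_all, peak_staged, len(staged))
```

Under these assumptions, the same 48 events are released in both cases, but the staged approach spreads them over 8 smaller batches in chronological order instead of one burst.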
