WPs loop when DST switch and C_PERIOD execution occur at the same time

Article ID: 273010

Products

CA Automic Workload Automation - Automation Engine
CA Automic One Automation

Issue/Introduction

WPs loop when the DST switch happens at the same time as the next C_PERIOD execution.

Users faced an AE production outage in which all WPs were stuck in an irreversible loop.

The PWP no longer responds, all processes disconnect, and the system becomes unresponsive.

In this case, the Daylight Saving Time (DST) switch happened at the same time as the next C_PERIOD execution.

If a C_PERIOD execution falls on the same minute as the DST switch, the AE gets stuck and the engine is no longer operational.

It must be restarted, sometimes with a cold start.

A message sequence similar to the following may appear in the WP logs when the problem occurs:

2023-03-15 20:26:28 - U00011333 Container for periodical task 'SCRIPT_OBP_SWITCH' has successfully started.
2023-03-15 20:30:02 - U00011089 Task 'SCRIPT_SWITCH' starting ...
.....
2023-03-16 04:30:08 - U00011091 Task 'SCRIPT_SWITCH' (RunID '0001094215') ended.
2023-03-16 05:30:08 - U00011089 Task 'SCRIPT_SWITCH' starting ...
2023-03-16 05:30:08 - U00011090 Task 'SCRIPT_SWITCH' is now active (RunID '0001093077').
2023-03-16 05:30:08 - U00011091 Task 'SCRIPT_SWITCH' (RunID '0001093077') ended.
2023-03-16 13:12:17 - U00011329 Task 'SCRIPT_SWITCH' did not start on '16.03.2023 at '01:00' due to Server downtime.
.....
2023-03-16 13:12:17 - U00011329 Task 'SCRIPT_SWITCH' did not start on '16.03.2023 at '07:00' due to Server downtime.
2023-03-16 13:12:33 - U00011625 Logging was changed.
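
When reviewing WP logs after such an outage, it can help to list the skipped starts automatically. The following is a minimal Python sketch, assuming the log lines follow the "timestamp - message number text" layout of the sample above. The function name `missed_starts` is hypothetical and not part of any Automic tooling. Note that U00011329 messages only show that starts were skipped during downtime; on their own they do not prove that this defect was the cause.

```python
import re

# Layout assumed from the sample above: "<date> <time> - <message number> <text>"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - (U\d{8}) (.*)$")
TASK_RE = re.compile(r"Task '([^']+)'")


def missed_starts(lines):
    """Return (timestamp, task name) for every U00011329 message in the given lines."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group(2) != "U00011329":
            continue
        task = TASK_RE.search(m.group(3))
        hits.append((m.group(1), task.group(1) if task else None))
    return hits


sample = [
    "2023-03-16 05:30:08 - U00011091 Task 'SCRIPT_SWITCH' (RunID '0001093077') ended.",
    "2023-03-16 13:12:17 - U00011329 Task 'SCRIPT_SWITCH' did not start on "
    "'16.03.2023 at '01:00' due to Server downtime.",
]
print(missed_starts(sample))
```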

Environment

Automation Engine 12.3.9 and 21.0.5

Cause

This is a defect in the Automation Engine.

Resolution

Workaround: Restart the Automation Engine; a cold start is sometimes required. If a cold start is absolutely not an option, please open a case with Support, although a cold start may still prove necessary.

Resolution: A problem has been fixed where WPs looped when the DST switch happened at the same time as the next C_PERIOD execution. This only occurred in time zones of the western hemisphere.

This bug is fixed. The fix will be delivered in Automation Engine version 21.0.8 HF1. Please note that updating the Automation Engine to 21.0.8 HF1 also requires updating the initial data, utilities, and AWI to 21.0.8 HF1.

Additional Information

What can be verified in order to proactively detect whether a system may be affected by this problem, and what can be done to prevent it?

Abstract:

The issue can happen if a recurring task (C_PERIOD) is scheduled to execute at the exact minute the time change happens.
For example, at the upcoming time change in October (3am -> 2am CET), an execution is triggered exactly on the hour (:00).
There are three important preconditions:
1. The System and/or Client uses a time zone in the western hemisphere (a negative UTC offset).
2. During the time change, the time 'goes back'.
3. The execution of the task falls exactly on the minute (e.g. 2:00) when the time change happens.
During tests, this was not always reproducible; in some cases the combination with the original time zone (Client 0, server time) played a role as well.
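
The preconditions listed above can be expressed as a small check. This is only a sketch in Python, not a way to predict the engine's behaviour: the function name `matches_preconditions` and its parameters are hypothetical, the planned start time and the UTC offsets must be supplied by the caller, and the example figures (US Eastern time, 5 November 2023, 02:00 EDT -> 01:00 EST, which is 06:00 UTC) are illustrative values that do not come from the affected system.

```python
from datetime import datetime, timedelta, timezone


def matches_preconditions(next_start_utc, switch_utc, offset_before, offset_after):
    """Check the listed preconditions for one planned start and one time change."""
    western = offset_after < timedelta(0)      # 1. negative UTC offset
    goes_back = offset_after < offset_before   # 2. clocks are set back
    same_minute = (                            # 3. start falls in the minute of the switch
        next_start_utc.replace(second=0, microsecond=0)
        == switch_utc.replace(second=0, microsecond=0)
    )
    return western and goes_back and same_minute


# Illustrative figures (assumption): US Eastern time, 5 November 2023
switch = datetime(2023, 11, 5, 6, 0, tzinfo=timezone.utc)
edt, est = timedelta(hours=-4), timedelta(hours=-5)

at_switch = datetime(2023, 11, 5, 6, 0, 8, tzinfo=timezone.utc)
after_switch = datetime(2023, 11, 5, 6, 5, tzinfo=timezone.utc)

print(matches_preconditions(at_switch, switch, edt, est))     # True
print(matches_preconditions(after_switch, switch, edt, est))  # False
```

A start planned five minutes after the switch no longer satisfies the third condition, which is the idea behind shifting the alignment of a task away from the full hour.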

Any customer with recurring tasks that fit the above criteria might run into the issue, but the risk is still fairly low.

IMPORTANT: It is not possible to identify affected tasks exactly with an SQL query or similar means, because what matters is the time of the next execution.
This is calculated only AFTER the previous execution ends.

If the next check falls at the moment of the time change, the problem might occur.

Workarounds on AE systems that are not running 21.0.8 HF1:

1)
Because the C_PERIOD triggers a new execution at the exact time of the change, anything that prevents a new execution from being triggered at that moment is a valid approach.
The easiest option is to suspend the C_PERIOD object before the time change and re-enable it afterwards.
In Process Monitoring, select all active C_PERIOD tasks (Filter for Period Container) and Right Click -> Modify -> Suspend.
After the time change has happened, re-activate them:
In Process Monitoring, select all active C_PERIOD tasks (Filter for Period Container) and Right Click -> Modify -> Go.
Any starts that were supposed to happen while the tasks were suspended might have to be started manually, depending on the batch processing logic.

2)
Fine-tuning of the alignment:
Most tasks run aligned to the hour, always at :00. This can be changed so that the tasks start a few minutes after the hour.
However, this might not be feasible for tasks that run at shorter minute intervals.

3)
Another option is to convert the C_PERIOD container into a schedule:
Example: If the task is supposed to run hourly, create a schedule with the task set at every hour, with 24 entries, one for each start.
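
The last two workarounds come down to simple minute arithmetic, which can be sketched in Python as follows. Both helper names (`lands_on_full_hour`, `schedule_entries`) are hypothetical and only illustrate the planning step; they do not interact with the Automation Engine, and the resulting start times would still have to be entered in the schedule by hand.

```python
def lands_on_full_hour(offset_min, interval_min, horizon_min=24 * 60):
    """True if a task that first starts offset_min minutes after midnight and
    repeats every interval_min minutes ever starts at minute :00 within the horizon."""
    return any(m % 60 == 0 for m in range(offset_min, horizon_min, interval_min))


def schedule_entries(interval_min, offset_min=0):
    """Fixed start times (HH:MM) covering one day, one entry per start."""
    return [f"{m // 60:02d}:{m % 60:02d}" for m in range(offset_min, 24 * 60, interval_min)]


print(lands_on_full_hour(0, 60))   # True: hourly task aligned at :00
print(lands_on_full_hour(5, 60))   # False: hourly task shifted to :05
print(lands_on_full_hour(1, 1))    # True: a one-minute interval always hits :00
print(len(schedule_entries(60)))   # 24
print(schedule_entries(60)[:3])    # ['00:00', '01:00', '02:00']
```

The third call shows why shifting the alignment does not help for very short intervals: a task that runs every minute reaches :00 no matter where it starts.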

Bug ID: AE-31904
Bug Title: WP's Loop when DST switch happens at the same time as next C_PERIOD execution