Supervisor engine stops checking job conditions

Article ID: 87204

Products

CA Automic Dollar Universe

Issue/Introduction

During a period of high load, with many jobs in status Event Wait, the following warning message appears in universe.log:
|WARN |X|IO |pid=pid.threadid| o_module_sur_cycle | Interrupting supervisor job condition checks because the time limit was exceeded. The job runs conditions will be check less often than specified until further notice.

The Supervisor engine stops checking resources, so jobs waiting for a Resource remain in status Event Wait.

IMPORTANT: when this issue occurs, the following INFO message does not appear in the log in the minutes after the WARN message:
|INFO |X|IO |pid=pid.threadid| o_module_sur_cycle | Supervisor cycle execution time back to acceptable. The condition checks are back to being done at normal interval. 

If this INFO message does appear in the log a few minutes later, there is nothing to worry about: the Supervisor should be checking resources in a timely manner again.

When the issue occurs, the Supervisor does not check resources again until the engine is manually stopped and started.

Right after the Supervisor engine is restarted, the INFO message appears in universe.log: "Supervisor cycle execution time back to acceptable. The condition checks are back to being done at normal interval"
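
The check described above (a WARN message that is never followed by the matching INFO message) can be automated. The following is a minimal sketch in Python, not a vendor-supplied tool. It assumes that the log is readable as text and that the two messages appear with the wording quoted in this article. The function name supervisor_checks_degraded and the default file name are illustrative choices. The sketch only checks the order of the messages and does not measure how many minutes have passed.

```python
from typing import Iterable

# Message fragments as quoted in this article (assumed to match the log verbatim).
WARN_TEXT = "Interrupting supervisor job condition checks because the time limit was exceeded"
INFO_TEXT = "Supervisor cycle execution time back to acceptable"


def supervisor_checks_degraded(lines: Iterable[str]) -> bool:
    """Return True if the most recent of the two messages is the WARN.

    True means the WARN was logged and no recovery INFO has followed it yet.
    """
    degraded = False
    for line in lines:
        if WARN_TEXT in line:
            degraded = True
        elif INFO_TEXT in line:
            degraded = False
    return degraded


if __name__ == "__main__":
    import sys

    # Hypothetical default path; pass the real location of universe.log as an argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "universe.log"
    with open(path, encoding="utf-8", errors="replace") as log:
        if supervisor_checks_degraded(log):
            print("WARN message is not followed by the recovery INFO message")
        else:
            print("No unresolved supervisor WARN message found")
```

If the script reports an unresolved WARN message and the situation persists for more than a few minutes, the symptoms match the issue described in this article.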

Cause

Cause type: Defect
Root Cause: The Supervisor did not finish all the checks before the end of its cycle, causing an endless cycle.

Environment

OS: All
OS Version: N/A

Resolution

Workaround:
Stop and start the Supervisor engine.

Update to a fix version listed below, or to a newer version if available.


Fix Status: Released

Fix Version(s):
Component(s): Application.Server

Dollar Universe 6.8.21 - Available