In 12.3, worker processes (WPs) are reported as active but do not restart; a new process is started instead.

Article ID: 142100

Products

CA Automic Workload Automation - Automation Engine

Issue/Introduction

If you attempt to restart a worker process (WP) that ended abnormally, a new WP starts instead. If you attempt to start the original process again, the system reports that it has been started, but it does not actually start; a new numbered or named WP starts instead.

For example, suppose the following are running:
WP1, WP2, WP3, WP4, WP5

Action: WP3 is stopped abnormally or suddenly, and the WP process is started again immediately.
Result: The WP process connects as WP6.

Cause

This is a change in behavior introduced in 12.3.

WP processes no longer bind to a specific socket. This means that, in cases of abnormal end, a new WP# is assigned when the process is restarted.

When the service manager is used, the new process takes over the service manager entry of the previous process. As a result, the system registers that process as active when you attempt to start it again.

Environment

Release : 12.3

Component : AUTOMATION ENGINE

Resolution

This behavior is part of the design changes in 12.3 and works as designed.

There are two workarounds available:

1.) Wait 10 minutes before restarting an abnormally ended process, which allows the original WP# to become available again.
-- This prevents the issue, because the WP should no longer be assigned a new number.

2.) Periodically refresh the service manager link in the Automic Web Interface.
-- This realigns the link to the service manager, so that processes can be started and stopped.
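
For workaround 1, a restart script could check how much of the 10-minute wait is still outstanding before it starts the process again. The following is a minimal sketch only: the function name `remaining_wait` and the constant `RESTART_WAIT` are hypothetical, and the time of the abnormal end is assumed to be known from your own monitoring. It does not call any Automic API.

```python
from datetime import datetime, timedelta

# Wait time taken from workaround 1; adjust it to your environment.
RESTART_WAIT = timedelta(minutes=10)


def remaining_wait(ended_at: datetime, now: datetime) -> timedelta:
    """Return how much longer to wait before restarting the WP (zero if ready)."""
    elapsed = now - ended_at
    return max(RESTART_WAIT - elapsed, timedelta(0))


if __name__ == "__main__":
    ended = datetime(2024, 1, 1, 12, 0)
    # Four minutes after the abnormal end, part of the wait is still outstanding.
    print(remaining_wait(ended, datetime(2024, 1, 1, 12, 4)))
```

A result of zero means the wait has elapsed and the restart can be attempted.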

From the scenario earlier in this article, the following can be expected:

The following are running:
WP1, WP2, WP3, WP4, WP5

Then one of the following situations occurs:

Action: WP3 is stopped abnormally or suddenly, and the WP process is started again immediately.
Result: The WP process connects as WP6.

Action: WP3 is stopped abnormally or suddenly, and after a wait of 10-15 minutes the WP process is started again.
Result: Most of the time, the WP process connects as WP3. The wait allows the MQSRV table to be "cleaned up" of any WPs that have been disconnected.

Action: WP3 is stopped from the service manager with "End Service" -> "Immediately single process". Once the process shows as stopped in the service manager, it is started again.
Result: The WP process should connect as WP3.
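
The three scenarios above can be reproduced with a small toy model. It is a sketch under stated assumptions, not a description of the Automation Engine internals. It assumes that a new WP takes the lowest number not present in a registry, that an abnormally ended WP keeps its entry until a cleanup interval of about 10 minutes has passed, and that an orderly stop releases the number at once. The class `WpRegistry` and all of its methods are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field

CLEANUP_SECONDS = 600  # assumed: roughly the 10-minute wait described above


@dataclass
class WpRegistry:
    """Toy stand-in for the table that tracks connected WPs (hypothetical model)."""

    # Maps WP number -> time of abnormal end (None while the WP is connected).
    entries: dict = field(default_factory=dict)

    def connect(self, now):
        """Register a new WP under the lowest number not present in the table."""
        self._cleanup(now)
        number = 1
        while number in self.entries:
            number += 1
        self.entries[number] = None
        return f"WP{number}"

    def abnormal_end(self, number, now):
        """The process dies, but its entry lingers until cleanup."""
        self.entries[number] = now

    def clean_stop(self, number):
        """An orderly stop releases the number right away."""
        del self.entries[number]

    def _cleanup(self, now):
        """Drop entries of WPs that ended at least CLEANUP_SECONDS ago."""
        stale = [n for n, ended in self.entries.items()
                 if ended is not None and now - ended >= CLEANUP_SECONDS]
        for n in stale:
            del self.entries[n]


def fresh_registry():
    """Return a registry with WP1 through WP5 connected at time 0."""
    registry = WpRegistry()
    for _ in range(5):
        registry.connect(now=0)
    return registry


if __name__ == "__main__":
    r = fresh_registry()
    r.abnormal_end(3, now=0)
    print(r.connect(now=5))    # immediate restart: WP6

    r = fresh_registry()
    r.abnormal_end(3, now=0)
    print(r.connect(now=900))  # restart after 15 minutes: WP3

    r = fresh_registry()
    r.clean_stop(3)
    print(r.connect(now=5))    # restart after an orderly stop: WP3
```

In this model the stale entry left behind by the abnormal end is what pushes an immediate restart to WP6. Once the entry has been cleaned up or released, WP3 is free again.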