After a period of Agent disconnects, the PWP begins to show a large increase in utilization and logs multiple lines similar to:
20200608/090951.299 - U00011183 Client '0101', RunID '0762422582'. Message 'EXAKTJ' for Agent 'AGENT_NAME1' was acknowledged with error code 63 (not connected). Job status will be changed to READY_FOR_RUN.
20200608/090951.330 - U00011183 Client '0101', RunID '0761590350'. Message 'EXAKTJ' for Agent 'AGENT_NAME2' was acknowledged with error code 63 (not connected). Job status will be changed to READY_FOR_RUN.
The symptoms typically look like this:
1.) The Agent generates a core dump or ends. The WPs do not appear to reflect the issue.
2.) WP shows JOB_INF / losconn processing.
3.) PWP then goes into the following loop:
20200706/234942.912 - U00000063 Partner 'AGENT_NAME1' is not connected to the server.
20200706/234942.932 - U00011183 Client '4000', RunID '0081848970'. Message 'EXAKTJ' for Agent 'AGENT_NAME1' was acknowledged with error code 63 (not connected). Job status will be changed to READY_FOR_RUN.
20200706/234942.953 - U00011183 Client '4000', RunID '0081848975'. Message 'EXAKTJ' for Agent 'AGENT_NAME1' was acknowledged with error code 63 (not connected). Job status will be changed to READY_FOR_RUN.
20200706/234942.985 - U00011183 Client '4000', RunID '0081848971'. Message 'EXAKTJ' for Agent 'AGENT_NAME1' was acknowledged with error code 63 (not connected). Job status will be changed to READY_FOR_RUN.
20200706/234942.997 - U00000063 Partner 'AGENT_NAME1' is not connected to the server.
4.) PWP will also show many lines like this:
U00011175 Negative JOB_INF was sent from Agent 'AGENT_NAME1'. Job name 'JOB_NAME' (RunID '0081868437'), old job status 'Start initiated', new job status 'Unknown'
5.) The system becomes increasingly unresponsive, the MQPWP count climbs to over 500k, and jobs get stuck in preparing/generating against that Agent.
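To see which Agents are producing the messages listed in the symptoms above, the log lines can be tallied per Agent. The following is a minimal Python sketch that runs outside of Automic; it is not an Automic-provided tool. The function names, the default threshold of 3, and the idea of passing a log file path on the command line are assumptions made for illustration. The regular expressions are based only on the U00011183 and U00011175 message formats quoted in this article.

```python
import re
import sys
from collections import Counter

# Patterns derived from the log excerpts quoted in this article.
NOT_CONNECTED = re.compile(
    r"U00011183 Client '(?P<client>\d+)', RunID '(?P<runid>\d+)'\. "
    r"Message '\w+' for Agent '(?P<agent>[^']+)' "
    r"was acknowledged with error code 63"
)
NEGATIVE_JOB_INF = re.compile(
    r"U00011175 Negative JOB_INF was sent from Agent '(?P<agent>[^']+)'"
)


def count_affected_agents(log_lines):
    """Count U00011183 (error code 63) and U00011175 lines per Agent name."""
    counts = Counter()
    for line in log_lines:
        match = NOT_CONNECTED.search(line) or NEGATIVE_JOB_INF.search(line)
        if match:
            counts[match.group("agent")] += 1
    return counts


def agents_over_threshold(counts, threshold=3):
    """Return Agent names seen at least `threshold` times (hypothetical cutoff)."""
    return sorted(agent for agent, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Hypothetical usage: python tally_agents.py <path-to-PWP-log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        tally = count_affected_agents(handle)
    for name in agents_over_threshold(tally):
        print(f"{name}: {tally[name]} matching lines")
```

Agents reported by such a tally are candidates for the restart described in the workaround. The threshold should be tuned to the environment, since an occasional error code 63 during a normal Agent restart does not necessarily indicate this issue.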
Environment
Release: 12.3
Component: AUTOMATION ENGINE
Cause
Agents did not complete their connection to the AE correctly, which causes the AE to mishandle the communication for job activation.
Resolution
Fixed in AutomationEngine 12.3.6+hf3 or higher
Workaround
1) Note the name of every Agent listed with error code 63 or with U00011175 Negative JOB_INF messages, and restart those Agents. After each restart, run a job on the Agent and confirm that these messages are no longer being written to the PWP.
2) A possible workaround is to monitor for the U00011183 message code ("U code"). Note that this is only an example, and putting it in place requires the help of someone knowledgeable in Automic scripting: