When the agent's logging folder has no free space left (file system full), the agent begins to flood the MQ process tables.
The agent floods one or another of the MQ tables with repetitive messages.
Sometimes the message in the MQ table contains the name of the agent and:
This can be checked in the MQ*PWP table by reading the MQPWP_MSG column with a query similar to the following (use MQ2PWP or MQ1PWP, depending on your system):
SELECT MQPWP_MSG FROM MQ2PWP WHERE MQPWP_FAddr='NAME_OF_THE_IMPACTED_AGENT'
If the MQPWP table is flooded, this can cause the system to become unresponsive (logins are no longer possible and jobs are no longer processed).
In other cases, the flooded table was MQOWP or an MQ*CP00* table.
In this instance, once the problem was detected, traces were activated on the WPs and the agent was stopped.
The agent process traces (ucxjlx6_t00.txt) are filled with messages such as:
logging =../15 - Logging int Logging interru logging
In the PWP trace file (WPsrv_trc_001_00.txt), we detected suspicious unknown messages of this kind:
20200923/094059.546 - process_message_queue(uc4_error_t *) <-- (deadlock or nodata, do it later)
If the agent is stopped one way or another, the problem disappears: the impacted queue is emptied within a few minutes and the AE system then works normally again.
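Because the flooding is triggered by a full file system under the agent's logging folder, it may help to watch the free space there before the condition occurs. The following is only a minimal sketch in Python: the function names, the example path, and the 100 MiB threshold are illustrative assumptions, not part of the product.

```python
import shutil


def free_bytes(path: str) -> int:
    """Return the free space, in bytes, on the file system that holds `path`."""
    return shutil.disk_usage(path).free


def log_dir_low_on_space(path: str, min_free_bytes: int = 100 * 1024 * 1024) -> bool:
    """Return True when the file system holding `path` has less free space
    than `min_free_bytes` (hypothetical default: 100 MiB)."""
    return free_bytes(path) < min_free_bytes


if __name__ == "__main__":
    # Hypothetical location; replace with the agent's real logging folder.
    agent_log_dir = "."
    if log_dir_low_on_space(agent_log_dir):
        print(f"WARNING: low free space in {agent_log_dir}")
```

Such a check could be run periodically (for example from cron) on each Unix/Linux agent host, so that a nearly full logging folder is noticed before the agent starts flooding the MQ tables.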
Component: Unix/Linux Agent
Versions affected: 12.2.2 and later, and 12.3.3 and later
A problem has been fixed where the Automation Engine becomes slow and unresponsive if a Unix/Linux agent has no free space left to write its agent logs.
Update to a fix version listed below or a newer version if available.
Component(s): Unix/Linux Agent
Automation.Engine 12.2.8 - Available
Automation.Engine 12.3.4HF1 - Available
Automation.Engine 12.3.5 - Available
The bug is in the agent, so upgrading the agent is sufficient to fix the problem.
Details of the bug fix:
To avoid flooding the PWP queue with CHGLOGR messages, before sending a change log request the Unix agent now checks the log file descriptor to verify that a log change is possible.
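The idea of the fix described above can be illustrated with a short sketch. This is not the agent's actual code: the function names, the free-space criterion, and the `send_request` callback are assumptions made for illustration only. It shows the general pattern of validating the log file descriptor first and sending the change log request only when a log change looks possible.

```python
import os
from typing import Callable


def log_change_possible(fd: int, min_free_bytes: int = 1) -> bool:
    """Return True if `fd` is a valid descriptor and its file system has at
    least `min_free_bytes` available (assumed criterion, Unix only)."""
    try:
        os.fstat(fd)              # raises OSError if the descriptor is invalid
        stats = os.fstatvfs(fd)   # file system statistics for the descriptor
    except OSError:
        return False
    return stats.f_bavail * stats.f_frsize >= min_free_bytes


def maybe_request_log_change(fd: int,
                             send_request: Callable[[], None],
                             min_free_bytes: int = 1) -> bool:
    """Call `send_request` (a hypothetical stand-in for sending the change
    log request) only when the check passes. Return True if it was sent."""
    if not log_change_possible(fd, min_free_bytes):
        return False  # skip the request instead of queueing it repeatedly
    send_request()
    return True
```

With a guard of this kind, a full file system results in the request being skipped locally rather than in a repeated stream of messages to the server queue.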