Increased CP usage (99% of 100%) causes delayed logins and jobs not starting

Article ID: 84527

Products

CA Automic Workload Automation - Automation Engine
AUTOMIC WORKLOAD AUTOMATION

Issue/Introduction

Error messages:
U0003434 Server routine 'UCMAIN_R/MQREORG' required '7' minutes and '37,217' seconds for processing.
U0011647 Host variable 'UC_EX_VERSION ' set. Value: '8.00A022-901 '.
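
To judge how often and how long the MQREORG routine runs, log lines in the U0003434 format shown above can be scanned and converted to a duration. The following is a minimal Python sketch, not an Automic tool. The function name is hypothetical, and it assumes the comma in '37,217' is a decimal separator.

```python
import re

# Pattern for the U0003434 message format shown above.
ROUTINE_RE = re.compile(
    r"U0003434 Server routine '(?P<routine>[^']+)' required "
    r"'(?P<minutes>\d+)' minutes and '(?P<seconds>[\d,.]+)' seconds"
)


def parse_routine_duration(line):
    """Return (routine, total_seconds) for a U0003434 line, else None."""
    match = ROUTINE_RE.search(line)
    if match is None:
        return None
    # Assumption: the comma in '37,217' is a decimal separator.
    seconds = float(match.group("seconds").replace(",", "."))
    total = int(match.group("minutes")) * 60 + seconds
    return match.group("routine"), total


line = ("U0003434 Server routine 'UCMAIN_R/MQREORG' required "
        "'7' minutes and '37,217' seconds for processing.")
print(parse_routine_duration(line))
```

Applied to every line of a server log, the same function gives a list of MQREORG runs whose count and durations can be compared over time.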

This issue was observed on an Automation Engine (AE) system running version 9.00A517-7C1 on Unix with an Oracle database.

One CP is pegged at 100% usage, while the other CPs are not.

Example:

<Please see attached file for image>


Cause

Cause type: Configuration

Root Cause:
1. The automatic MQREORG is running far too often.
2. Network communication slows down when Nagle's algorithm is enabled, because it allows only one small unacknowledged packet to be in transit per connection at any given time.
3. V8 SQL agents were found caught in a connection loop.
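
For context on root cause 2, Nagle's algorithm is controlled per socket through the standard TCP_NODELAY option. The following is a generic Python sketch of what that option does at the socket level. It is not Automation Engine code, and the function name is hypothetical.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the OS to send small writes immediately
    # instead of holding them until earlier data is acknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# getsockopt returns a non-zero value when the option is set.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
print("Nagle disabled:", nodelay_enabled)
sock.close()
```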

Environment

OS: Unix

Resolution

Update to a fix version listed below or a newer version if available.

Fix Status: Released

Fix Version(s):
Automation Engine 12.0.x - Available
Automation Engine 11.2.x - Available

Additional Information

Workaround:
1. To stop AE from triggering the MQ* reorganization automatically, go to Client 0 > UC_SYSTEM_SETTINGS and disable MQ_CHECK_TIME by setting its value to 0.
 
Have the DBA implement the UC_REORG stored procedure (for Oracle DB v9, this query can be found in the uc_dll.sql file located in the db > oracle folder of the AE installation image). Run it outside of Automic, on the database side, at intervals during off-peak hours so that it does not interfere with production jobs.

Note: A 'no data found' exception is expected, because the v9 query does not account for this situation. It is up to the DBA to handle that exception and modify the v9 query as they see fit.
 
2. To address the network traffic issue, tcp_nodelay must be enabled in the Automation Engine (AE) system. This requires changes to the following files.
 
sqlnet.ora:
TCP.NODELAY=yes
  Reference: https://docs.oracle.com/cd/E11882_01/network.112/e10835/sqlnet.htm#NETRF239
SQLNET.EXPIRE_TIME=60
  Reference: https://docs.oracle.com/cd/E11882_01/network.112/e10835/sqlnet.htm#NETRF209

tnsnames.ora:
SERVER = DEDICATED
  Reference: http://docs.oracle.com/cd/E11882_01/network.112/e10835/tnsnames.htm#NETRF286

AE's ucsrv.ini:
[CPMsgTypes]
srvquery=1
[TCP/IP]
tcp_nodelay=1
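
After editing, the two ucsrv.ini values listed above can be checked with a script. The following is a minimal Python sketch, not an Automic utility. The function name is hypothetical, and it assumes the file parses as a plain INI file with Python's configparser.

```python
import configparser

# Expected values, taken from the ucsrv.ini settings listed above.
EXPECTED = {
    ("CPMsgTypes", "srvquery"): "1",
    ("TCP/IP", "tcp_nodelay"): "1",
}


def find_mismatches(ini_text):
    """Return {(section, key): actual_or_None} for settings that differ."""
    # Assumption: the file is plain INI. strict=False tolerates duplicate
    # keys, and interpolation is off so '%' characters are read literally.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    mismatches = {}
    for (section, key), wanted in EXPECTED.items():
        # fallback=None covers both a missing section and a missing key.
        actual = parser.get(section, key, fallback=None)
        if actual is None or actual.strip() != wanted:
            mismatches[(section, key)] = actual
    return mismatches


sample = """
[CPMsgTypes]
srvquery=1
[TCP/IP]
tcp_nodelay=0
"""
print(find_mismatches(sample))  # {('TCP/IP', 'tcp_nodelay'): '0'}
```

An empty result means both settings are present with the expected values. To check a real file, read its contents into a string and pass that to the function.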

3. Version 8 of any component is out of support, so any V8 components should be upgraded. The permanent fix is to upgrade to the latest version.

Attachments

1558704400867000084527_sktwi1f5rjvs16lkh.png