JCP loops on reconnect when database reconnection is slow

Article ID: 225530

Updated On:

Products

CA Automic One Automation

Issue/Introduction

This issue can present with several symptoms:
Users cannot log in to the system, even though the WPs show everything running.

The JCP logs will show something like:
20210824/154523.432 - 51 U00003524 UCUDB: ===> Time critical DB call! OPC: 'EXEC' time: '80115ms'
20210824/154523.433 - 51 U00003525 UCUDB: ===> 'UPDATE MQSRV SET MQSRV_LastUpdate = ? WHERE MQSRV_Name = ?'
After this, many threads begin to disconnect and reconnect, as shown by messages such as:
20210824/155144.740 - 57 U00003545 UCUDB: Opening database ...
The final messages in the log before the customer restarted the JCP are:
20210824/165823.492 - 51 U00045014 Exception 'com.automic.kernel.osgi.NoServiceException: "No registered service for 'com.automic.network.api.NetworkConnections', filter:null"' at 'com.automic.kernel.osgi.OSGIRegistryFunctions.lookup():44'.
20210824/165823.493 - 51 U00003620 Routine 'com.automic.kernel.impl.DefaultExceptionHandler' forces trace because of error.
20210824/165823.496 - 51 U00003450 The TRACE file was opened with the switches '0000000000000000'.
20210824/165826.416 - 51 U00003449 Output to the TRACE file is finished.

These messages appear on multiple threads (the thread number follows the date/timestamp and precedes the U code).
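To check whether a JCP log shows this pattern, the log can be scanned for slow database calls and for repeated "Opening database" messages per thread. The following is a minimal sketch, not an Automic tool: the line layout is inferred only from the sample lines above, and the function names and the 10000 ms threshold are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Layout assumed from the sample lines above:
# "YYYYMMDD/HHMMSS.mmm - <thread> <U code> <text>"
LINE_RE = re.compile(r"^(\d{8}/\d{6}\.\d{3}) - (\d+)\s+(U\d{8}) (.*)$")
# Duration as reported in U00003524, e.g. time: '80115ms'
DURATION_RE = re.compile(r"time: '(\d+)ms'")


def parse_line(line):
    """Return (timestamp, thread, code, text), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    timestamp, thread, code, text = m.groups()
    return timestamp, int(thread), code, text


def summarize(lines, slow_ms=10000):
    """Collect slow DB calls (U00003524) and count DB reopen messages (U00003545) per thread.

    slow_ms is an illustrative threshold, not a documented Automic value.
    """
    slow_calls = []
    reopens = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines that do not follow the assumed layout
        timestamp, thread, code, text = parsed
        if code == "U00003524":
            d = DURATION_RE.search(text)
            if d and int(d.group(1)) >= slow_ms:
                slow_calls.append((timestamp, thread, int(d.group(1))))
        elif code == "U00003545":
            reopens[thread] += 1
    return slow_calls, reopens
```

Calling `summarize(open(path))` on a JCP log file (the path is site-specific) returns the slow calls and a per-thread count of database reopen messages. Many threads with high reopen counts shortly after a slow call would be consistent with the symptoms described here.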

A JCP restart resolves the issue and allows for logins.

Environment

Release : 12.3

Component : AUTOMATION ENGINE

Resolution

The root cause is a network slowdown between the JCP and the database, or a slowdown on the database itself, that prevents the JCP from reconnecting in a timely manner. The delays compound and result in a reconnect loop. The looping will be resolved in a future release of 12.3, but the underlying slowdown must be identified by the network, database, and other system administrators at the site where it occurs.
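As a starting point for the network side of that investigation, connection latency from the JCP host to the database listener can be sampled. The sketch below is an assumption-laden illustration, not part of the product: the host and port are placeholders supplied by the caller, and it measures only the TCP connect time, not the database login or query time.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=5.0):
    """Return the time in ms to open a TCP connection, or None on failure or timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection opened; close it immediately
    except OSError:
        return None
    return (time.monotonic() - start) * 1000.0


def probe(host, port, samples=5, interval=1.0, timeout=5.0):
    """Take several connect-time samples, pausing `interval` seconds between them."""
    results = []
    for i in range(samples):
        results.append(tcp_connect_ms(host, port, timeout))
        if i < samples - 1:
            time.sleep(interval)
    return results
```

Running `probe(db_host, db_port)` from the JCP host during a slowdown gives a rough indication of whether the delay is at the network level. Normal connect times alongside slow database calls would point toward the database itself.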

The workaround is to restart the affected JCP.