Execution Server going offline even though services are running fine



Article ID: 226329


Products

CA Release Automation - Release Operations Center (Nolio)

Issue/Introduction

Execution servers show as unreachable in the ROC UI even though the service is up and running. Restarting the service after clearing the ActiveMQ data folder brings the execution server back online, but it goes offline again after 5-10 minutes.

Environment

Release : 6.7.X

Component : CA RELEASE AUTOMATION CORE

Database: ORACLE

Cause

A review of the nolio_dm_all.log file(s) showed the error below, which indicates a unique constraint violation.

2021-10-14T09:39:40.338Z [org.springframework.jms.listener.DefaultMessageListenerContainer#3-1] WARN  (org.hibernate.engine.jdbc.spi.SqlExceptionHelper:143) - SQL Error: 1, SQLState: 23000
2021-10-14T09:39:40.338Z [org.springframework.jms.listener.DefaultMessageListenerContainer#3-1] ERROR (org.hibernate.engine.jdbc.spi.SqlExceptionHelper:144) - ORA-00001: unique constraint (NOLIO.SYS_C0021205) violated

2021-10-14T09:39:40.338Z [org.springframework.jms.listener.DefaultMessageListenerContainer#3-1] WARN  (org.hibernate.engine.jdbc.spi.SqlExceptionHelper:143) - SQL Error: 1, SQLState: 23000
2021-10-14T09:39:40.338Z [org.springframework.jms.listener.DefaultMessageListenerContainer#3-1] ERROR (org.hibernate.engine.jdbc.spi.SqlExceptionHelper:144) - ORA-00001: unique constraint (NOLIO.SYS_C0021205) violated

2021-10-14T09:39:40.339Z [org.springframework.jms.listener.DefaultMessageListenerContainer#3-1] ERROR (org.hibernate.engine.jdbc.batch.internal.BatchingBatch:119) - HHH000315: Exception executing batch [ORA-00001: unique constraint (NOLIO.SYS_C0021205) violated]

Troubleshooting Steps

  • To get details of the constraint and index named in the error above, run the queries below on the RA database:
SELECT *
  FROM USER_CONSTRAINTS U
 WHERE U.CONSTRAINT_NAME = 'SYS_C0021205';
SELECT *
  FROM USER_CONS_COLUMNS UC
 WHERE UC.CONSTRAINT_NAME = 'SYS_C0021205';
SELECT *
  FROM USER_INDEXES I
 WHERE I.index_name = 'SYS_C0021205';
SELECT *
  FROM USER_IND_COLUMNS IC
 WHERE IC.index_name = 'SYS_C0021205';

The results of the queries above showed that the index is created on the JXTA_NAME column of the SERVERS table.
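The same lookup can be condensed into a single query. This is a minimal sketch that joins the standard Oracle dictionary views USER_CONSTRAINTS and USER_CONS_COLUMNS. It assumes you are connected as the schema owner (NOLIO in the log above). The constraint name SYS_C0021205 is system-generated, so it will likely differ in other environments; substitute the name from your own log.

```sql
-- Resolve a violated constraint to its table and column(s) in one pass.
-- Replace 'SYS_C0021205' with the constraint name reported in your log.
SELECT c.constraint_name,
       c.constraint_type,   -- 'U' = unique, 'P' = primary key
       c.table_name,
       cc.column_name,
       cc.position
  FROM user_constraints c
  JOIN user_cons_columns cc
    ON cc.constraint_name = c.constraint_name
 WHERE c.constraint_name = 'SYS_C0021205'
 ORDER BY cc.position;
```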

  • Run the query below to get the details from the SERVERS table:
SELECT UPPER(s.jxta_name), COUNT(*) OVER (PARTITION BY UPPER(s.jxta_name)), s.*
  FROM servers s
 ORDER BY UPPER(s.jxta_name), s.id;
  • The result of the query above showed that there were misconfigured agents, i.e. agents with the same JXTA_NAME in the SERVERS table, which caused the unique constraint violation. The NESs go offline as soon as the misconfigured agents try to connect to the execution servers.
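On a large SERVERS table it may be easier to list only the names that occur more than once. The query below is a sketch under the same assumptions as the article's query: the table is SERVERS, the column is JXTA_NAME, and names are compared case-insensitively with UPPER. The column aliases are illustrative only.

```sql
-- List only the JXTA_NAME values that appear more than once
-- (case-insensitive, mirroring the UPPER() comparison used above).
SELECT UPPER(s.jxta_name) AS jxta_name_upper,
       COUNT(*)           AS occurrences
  FROM servers s
 GROUP BY UPPER(s.jxta_name)
HAVING COUNT(*) > 1
 ORDER BY occurrences DESC, jxta_name_upper;
```

Each name returned here can then be looked up in the full SERVERS output to identify the agents involved.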

Resolution

Identify the misconfigured agents that share the same JXTA_NAME, remove them from ROC UI -> Agent Management, and then restart the NESs.
