CAPM collector is not connected


Article ID: 242217



Products

CA Performance Management - Usage and Administration

Issue/Introduction

Hi Team,

One of our CAPM Data Collectors (lnx1acapm08p) is down. We restarted the dcmd service, but the collector still does not connect. Please provide steps to fix it.

In the DC Karaf.log:

2022-05-20T14:27:57,127 | ERROR | 1be-29b70669d29c | Framework                        | rnal.framework.BundleContextImpl  969 | 12 - org.apache.felix.configadmin - 1.9.22 |  | FrameworkEvent ERROR
org.osgi.framework.ServiceException: Exception in org.apache.karaf.service.guard.impl.GuardProxyCatalog$ProxyServiceFactory.ungetService()

Caused by: java.lang.IllegalStateException: BundleContext is no longer valid org.apache.felix.webconsole.plugins.ds_2.1.0 [127]

 

In the Shutdown.log:

WARN  | eadpool-thread-3 | shutdown                         | om.ca.im.osgishell.BundleManager   33 | 52 - com.ca.im.data-collection-manager.osgishell - 21.2.8.RELEASE-335 |  | Unable to connect to the Data Aggregator at host 172.16.13.255. Shutting  down the Data Collector. This may take up to 10 minutes.
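As a quick way to confirm this symptom, the sketch below searches a log file for the shutdown warning quoted above. The function name `dc_lost_da` is hypothetical, and the log path is supplied by the caller because the log location depends on the installation.

```shell
# Hypothetical helper: report whether a Data Collector log contains the
# "Unable to connect to the Data Aggregator" warning shown above.
# $1 is the path to the log file (for example, the DC Shutdown.log).
dc_lost_da() {
    log_file="$1"
    # -q: no output, exit status only; -F: match the text as a fixed string
    if grep -qF "Unable to connect to the Data Aggregator" "$log_file"; then
        echo "DA connection lost"
    else
        echo "no DA connection warning found"
    fi
}
```

Call it as `dc_lost_da /path/to/Shutdown.log`, replacing the path with the actual log location on the Data Collector.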

 

             

Cause

ActiveMQ reconnected, but the consumers on the Data Collector (DC) did not fully re-engage, so ActiveMQ on the Data Aggregator (DA) started backing up those responses.
The same queue is used for DC registration. When dcmd was restarted, it sent a registration request but never received the response because the queue was backed up.

Environment

Release: 21.2

Component: PMDCOL

Resolution

1. Check the ActiveMQ queues for the Data Collector by running the following on the Data Aggregator:

   cd /opt/CA/IMDataAggregator/scripts

   ./activemqstat | grep <DC_Hostname>

 

2. Stop activemq and dcmd on the Data Collector:

   systemctl stop activemq

   systemctl stop dcmd

 

3. Clear the ActiveMQ queues for the affected Data Collector by running the following from /opt/CA/IMDataAggregator/scripts on the Data Aggregator. The ID after the colon is from this example environment; use the queue names reported for your Data Collector in step 1.

   ./purgeOneQueue "DIM.requests.irep-<DC_Hostname>:28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"

   ./purgeOneQueue "DIP-req.responses.irep-<DC_Hostname>:28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"

   ./purgeOneQueue "DIP-poll.responses.irep-<DC_Hostname>:28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"

 

4. Start dcmd on the Data Collector:

   systemctl start dcmd
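The three purge commands in step 3 differ only in the queue prefix, so they can be generated from the Data Collector hostname and ID. The following is a minimal sketch that only prints the commands for review and does not run them. The function name `print_purge_commands` and its two parameters are hypothetical; the queue name pattern is taken from step 3.

```shell
# Hypothetical helper: print the three purgeOneQueue commands from step 3
# for one Data Collector. Nothing is executed; review the output, then run
# it from /opt/CA/IMDataAggregator/scripts on the Data Aggregator.
print_purge_commands() {
    dc_host="$1"   # Data Collector hostname, as used in the queue names
    dc_id="$2"     # ID that follows the colon in the queue names
    for prefix in DIM.requests DIP-req.responses DIP-poll.responses; do
        printf './purgeOneQueue "%s.irep-%s:%s"\n' "$prefix" "$dc_host" "$dc_id"
    done
}

# Example call using the placeholder hostname and the ID from step 3.
print_purge_commands "<DC_Hostname>" "28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"
```

Printing the commands first makes it easy to compare the generated queue names against the output of step 1 before anything is purged.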