Hi Team,
One of our CAPM collectors (hostname) is down. We restarted the dcmd service, but it still does not connect. Kindly provide steps to fix it.
In the DC Karaf.log:
2022-05-20T14:27:57,127 | ERROR | 1be-29b70669d29c | Framework | rnal.framework.BundleContextImpl 969 | 12 - org.apache.felix.configadmin - 1.9.22 | | FrameworkEvent ERROR
org.osgi.framework.ServiceException: Exception in org.apache.karaf.service.guard.impl.GuardProxyCatalog$ProxyServiceFactory.ungetService()
Caused by: java.lang.IllegalStateException: BundleContext is no longer valid org.apache.felix.webconsole.plugins.ds_2.1.0 [127]
In the Shutdown.log:
WARN | eadpool-thread-3 | shutdown | om.ca.im.osgishell.BundleManager 33 | 52 - com.ca.im.data-collection-manager.osgishell - 21.2.8.RELEASE-335 | | Unable to connect to the Data Aggregator at host xxx.xx.xx.xxx. Shutting down the Data Collector. This may take up to 10 minutes.
Release: 21.2
Component: PMDCOL
ActiveMQ reconnected, but the consumers on the Data Collector (DC) did not fully re-engage, so ActiveMQ on the Data Aggregator (DA) started backing up the responses destined for that DC.
The same queue is used for DC registration. When dcmd was restarted, it sent a registration request but never received the response because of the backed-up queue.
1. Checked the ActiveMQ queues for the Data Collector by running the following on the Data Aggregator:
cd /opt/CA/IMDataAggregator/scripts
./activemqstat | grep <DC_Hostname>
2. Stopped activemq and dcmd on the Data Collector:
systemctl stop activemq
systemctl stop dcmd
3. Cleared the ActiveMQ queues for the affected Data Collector by running the following on the Data Aggregator, from the same scripts directory as in step 1:
./purgeOneQueue "DIM.requests.irep-<DC_Hostname>:28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"
./purgeOneQueue "DIP-req.responses.irep-<DC_Hostname>:28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"
./purgeOneQueue "DIP-poll.responses.irep-<DC_Hostname>:28500308-6ba8-4b91-8f3d-5d9a4a55cc3B"
4. Started dcmd on the Data Collector:
systemctl start dcmd
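
The three queue names in step 3 share one pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not a vendor-supplied tool: the function names dc_queue_names and print_purge_commands are hypothetical, and it assumes the queue names on your system follow the same pattern as the example above (prefix, ".irep-", DC hostname, colon, ID). It only prints the purge commands and does not run them. Take the hostname and ID from the activemqstat output in step 1 and confirm the printed names match that output before purging anything.

```shell
#!/bin/sh
# Hypothetical helper: build the three queue names for one Data Collector.
# Assumption: names follow the pattern seen in step 3,
#   <prefix>.irep-<DC_Hostname>:<ID>
dc_queue_names() {
    dc_host="$1"   # DC hostname exactly as shown by ./activemqstat
    dc_id="$2"     # ID after the colon, copied from ./activemqstat output
    for prefix in DIM.requests DIP-req.responses DIP-poll.responses; do
        printf '%s.irep-%s:%s\n' "$prefix" "$dc_host" "$dc_id"
    done
}

# Dry run: print the purgeOneQueue commands instead of executing them,
# so they can be reviewed and then pasted on the Data Aggregator.
print_purge_commands() {
    dc_queue_names "$1" "$2" | while IFS= read -r queue; do
        printf './purgeOneQueue "%s"\n' "$queue"
    done
}
```

For example, after sourcing the functions, print_purge_commands "<DC_Hostname>" "<ID>" prints three ./purgeOneQueue lines that can be compared with step 3 and then run from /opt/CA/IMDataAggregator/scripts on the Data Aggregator.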