Data Collector does not connect to Data Aggregator

Article ID: 256934

Products

CA Performance Management - Usage and Administration
DX NetOps

Issue/Introduction

One of the Data Collectors (DC) is down and does not come back up after the dcmd service is restarted. The service status after the restart is shown below:

# systemctl status dcmd
● dcmd.service - Data Collector
   Loaded: loaded (/etc/systemd/system/dcmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-12-27 04:27:04 EST; 4s ago
  Process: 6958 ExecStop=/opt/CA/IMDataCollector/scripts/dcmd stop sysd (code=exited, status=0/SUCCESS)
  Process: 7351 ExecStart=/opt/CA/IMDataCollector/scripts/dcmd start sysd (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/dcmd.service
           ├─7384 /opt/CA/IMDataCollector/ICMPD/IcmpDaemon --start
           ├─7423 /bin/sh /opt/CA/IMDataCollector/apache-karaf-4.3.3/bin/karaf server
           └─7494 /opt/CA/IMDataCollector/jre/bin/java -Xms1024M -Xmx2033M -server -Xms1024M -Xmx2033M -XX:+UnlockDiagnosticVMOptions -XX:+UnsyncloadClass...

 

Environment

Release : 21.2

Cause

When a DC fails to connect like this, the usual causes are full DA queues, a time difference between the DA and the DC, or a problem with the data/cache on the DC.
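Of these causes, a time difference is quick to rule out. The following is a minimal sketch, assuming you can read epoch seconds on both hosts (for example with `date +%s`). The function name `clock_skew_seconds` and the host name `dc-host` in the usage comment are hypothetical and not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference, in seconds, between two epoch timestamps.
clock_skew_seconds() {
    first="$1"
    second="$2"
    diff=$((first - second))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    echo "$diff"
}

# Example, run on the DA (dc-host is a placeholder for your DC host name):
# clock_skew_seconds "$(date +%s)" "$(ssh dc-host date +%s)"
```

A result of more than a few seconds suggests checking time synchronization on both machines before looking at the queues.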

Resolution

Run the following command to confirm that you are on the active DA:

<installdir>/IMDataAggregator/scripts/dadaemon status

Then run the following command on the active DA to list the ActiveMQ queues:

<installdir>/IMDataAggregator/scripts/activemqstat

Name                                                                 Queue Size  Producer #  Consumer #   Enqueue #   Dequeue #   Forward #    Memory %
DIM.requests.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f             11753           1           0     2071851     2060098     2060098          17
DIP-poll.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f        1122           1           0     1714282     1713160     1713160          40
DIP-req.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f         6928           1           0       56206       49278       49278           6

These queues have a backlog and no consumer (Consumer # is 0), so they need to be cleared on the DA machine.

On DA machine:

cd <installdir>/IMDataAggregator/scripts
./purgeOneQueue "DIM.requests.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f"
./purgeOneQueue "DIP-poll.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f"
./purgeOneQueue "DIP-req.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f"
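When many queues are listed, picking out the ones to purge by eye is error-prone. The sketch below filters activemqstat-style output for queues with a backlog and no consumer, then prints the matching purgeOneQueue commands without running them. The column layout is assumed from the sample output above. The function names and the default threshold of 1000 are illustrative and not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: print the names of queues whose size is at least
# the given threshold and whose consumer count is 0.
# Reads activemqstat-style output on stdin (Name, Queue Size, Producer #, Consumer #, ...).
list_backlogged_queues() {
    min_size="${1:-1000}"   # illustrative threshold, not a product default
    awk -v min="$min_size" '
        # Data rows have numeric 2nd and 4th fields; this skips the header and blank lines.
        NF >= 4 && $2 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            if ($2 + 0 >= min + 0 && $4 + 0 == 0) print $1
        }'
}

# Dry run: print one purge command per queue name instead of executing it.
print_purge_commands() {
    while IFS= read -r queue; do
        if [ -n "$queue" ]; then
            printf './purgeOneQueue "%s"\n' "$queue"
        fi
    done
}

# Example, run from the DA scripts directory:
# ./activemqstat | list_backlogged_queues 1000 | print_purge_commands
```

Review the printed commands before running them, since purging a queue discards its pending messages.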

 

On DC machine:

1. systemctl stop dcmd
2. systemctl stop activemq
3. cd <installdir>/IMDataCollector/apache-karaf-*
4. mv data data.old
5. mkdir data
6. rm deploy/local-jms-broker.xml
7. systemctl start dcmd (this also starts activemq)

Then check that the service started:

systemctl status -l dcmd
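The file operations in the DC steps above can be sketched as a small function that takes the Karaf directory as an argument. This is a sketch under assumptions, not a product script: `reset_karaf_data` is a hypothetical name, and it must only be run while dcmd and activemq are stopped. It refuses to run if data.old already exists, because mv would otherwise move data inside the old backup.

```shell
#!/bin/sh
# Hypothetical helper: back up the Karaf data directory, recreate it empty,
# and remove the local JMS broker file from the deploy directory.
reset_karaf_data() {
    karaf_dir="$1"

    if [ ! -d "$karaf_dir/data" ]; then
        echo "no data directory under $karaf_dir" >&2
        return 1
    fi
    # Refuse to reuse an earlier backup: mv would nest data inside data.old.
    if [ -e "$karaf_dir/data.old" ]; then
        echo "$karaf_dir/data.old already exists; move it away first" >&2
        return 1
    fi

    mv "$karaf_dir/data" "$karaf_dir/data.old" || return 1
    mkdir "$karaf_dir/data" || return 1
    # -f: do not fail if the broker file is already gone.
    rm -f "$karaf_dir/deploy/local-jms-broker.xml"
}

# Example call (path taken from the systemctl output in this article):
# reset_karaf_data /opt/CA/IMDataCollector/apache-karaf-4.3.3
```

Passing the full directory avoids relying on the apache-karaf-* wildcard, which the shell does not expand for paths that do not exist yet.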

 

Additional Information

As of DC release 22.2.4, the Data Collector correctly creates the data directory and cleans up the deploy directory. It also has a new clean target that moves the existing data directory to data.bak.

 

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/performance-management/22-2/release-notes/fixed-issues.html

Symptom:
 Assignment of IP Domain to a data collector through NetOps Portal might not take effect, due to timing issues reading new values from the data collector config file. 
Resolution:
 With this fix, the data collector restart script now clears the Apache Karaf cache to ensure new values are read correctly. 
(22.2.4, DE548128)
