One of our Data Collectors (DC) is down and does not come back up after restarting the dcmd service. Could you help us fix the issue? The output below was captured while restarting the service.
# systemctl status dcmd
● dcmd.service - Data Collector
Loaded: loaded (/etc/systemd/system/dcmd.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2022-12-27 04:27:04 EST; 4s ago
Process: 6958 ExecStop=/opt/CA/IMDataCollector/scripts/dcmd stop sysd (code=exited, status=0/SUCCESS)
Process: 7351 ExecStart=/opt/CA/IMDataCollector/scripts/dcmd start sysd (code=exited, status=0/SUCCESS)
CGroup: /system.slice/dcmd.service
├─7384 /opt/CA/IMDataCollector/ICMPD/IcmpDaemon --start
├─7423 /bin/sh /opt/CA/IMDataCollector/apache-karaf-4.3.3/bin/karaf server
└─7494 /opt/CA/IMDataCollector/jre/bin/java -Xms1024M -Xmx2033M -server -Xms1024M -Xmx2033M -XX:+UnlockDiagnosticVMOptions -XX:+UnsyncloadClass...
Release : 21.2
With DC connection problems like this, the usual causes are full DA queues for that DC, a time difference between the DA and DC hosts, or a problem with the DC's data/cache.
Run the following command to confirm you are on the active DA:
<installdir>/IMDataAggregator/scripts/dadaemon status
Run the following command on the active DA to list the queues:
<installdir>/IMDataAggregator/scripts/activemqstat
Name                                                                 Queue Size  Producer #  Consumer #  Enqueue #  Dequeue #  Forward #  Memory %
DIM.requests.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f        11753       1           0           2071851    2060098    2060098    17
DIP-poll.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f  1122        1           0           1714282    1713160    1713160    40
DIP-req.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f   6928        1           0           56206      49278      49278      6
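Rows like these can be picked out mechanically. The following is a minimal sketch, not part of the product: flag_stuck_queues is a hypothetical helper name, and it assumes the column order shown in the header above (Name, Queue Size, Producer #, Consumer #, ...). It prints the name and size of every queue that holds messages but has no consumer.

```shell
#!/usr/bin/env bash
# flag_stuck_queues (hypothetical helper): read activemqstat-style rows on
# stdin and print "<queue name> <queue size>" for each queue that has a
# backlog (Queue Size > 0) and no consumer (Consumer # == 0).
# Assumes whitespace-separated columns in the order shown above; the header
# row is skipped because its second field is not a number.
flag_stuck_queues() {
    awk 'NF >= 8 && $2 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $2 > 0 && $4 == 0 { print $1, $2 }'
}
```

Typical use would be to pipe the activemqstat output from the active DA into flag_stuck_queues and review the listed queues before purging anything.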
All three queues for this DC show a backlog (non-zero Queue Size) with no consumer (Consumer # is 0), so these queues need to be cleared.
On the DA machine:
cd <installdir>/IMDataAggregator/scripts
./purgeOneQueue "DIM.requests.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f"
./purgeOneQueue "DIP-poll.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f"
./purgeOneQueue "DIP-req.responses.irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f"
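The three purge commands differ only in the queue prefix, so they can be generated from the DC's queue suffix. This is a sketch under assumptions: print_purge_commands is a hypothetical helper, and the three prefixes are simply the ones seen in the activemqstat output above; a given DC may have other queues. It only prints the commands so they can be reviewed before being run from the DA scripts directory.

```shell
#!/usr/bin/env bash
# print_purge_commands (hypothetical helper): print, but do not execute, one
# purgeOneQueue command per queue prefix for the given DC queue suffix.
# $1: the part of the queue name after the prefix, i.e. <dc-host>:<uuid>
print_purge_commands() {
    local dc_id="$1" prefix
    # Prefixes taken from the activemqstat output in this article.
    for prefix in DIM.requests DIP-poll.responses DIP-req.responses; do
        printf './purgeOneQueue "%s.%s"\n' "$prefix" "$dc_id"
    done
}
```

For the DC in this case the argument would be "irep-dcm08p:28500308-6ba8-4b91-8f3d-5d9a4a55cc2f".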
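On the DC machine, from the DC install directory (/opt/CA/IMDataCollector in the status output above). Replace apache-karaf-* with the actual versioned directory (apache-karaf-4.3.3 in the output above), because the shell cannot expand the wildcard for the data.old path that does not exist yet:
1. systemctl stop dcmd
2. systemctl stop activemq
3. mv apache-karaf-*/data apache-karaf-*/data.old
4. mkdir apache-karaf-*/data
5. rm apache-karaf-*/deploy/local-jms-broker.xml
6. systemctl start dcmd (this also starts activemq)
7. systemctl status -l dcmd (confirm that the service stays up)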
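The DC-side steps above use apache-karaf-* as shorthand for the versioned Karaf directory. As a sketch, the same sequence can be printed with a concrete path filled in so it can be reviewed before anything is run. dc_reset_plan is a hypothetical helper name, and the Karaf directory is passed in by the caller rather than guessed.

```shell
#!/usr/bin/env bash
# dc_reset_plan (hypothetical helper): print the DC reset steps from this
# article with the Karaf directory resolved. Nothing is executed.
# $1: full path of the Karaf directory under the DC install directory.
dc_reset_plan() {
    local karaf_dir="${1%/}"   # drop one trailing slash, if present
    printf '%s\n' \
        "systemctl stop dcmd" \
        "systemctl stop activemq" \
        "mv $karaf_dir/data $karaf_dir/data.old" \
        "mkdir $karaf_dir/data" \
        "rm $karaf_dir/deploy/local-jms-broker.xml" \
        "systemctl start dcmd" \
        "systemctl status -l dcmd"
}
```

For the host in this case the call would be dc_reset_plan /opt/CA/IMDataCollector/apache-karaf-4.3.3, using the path shown in the systemctl status output.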
The latest DC release, 22.2.4, creates the data directory and cleans up the deploy directory correctly, and adds a new clean target that moves the data directory to data.bak.
Symptom:
Assignment of IP Domain to a data collector through NetOps Portal might not take effect, due to timing issues reading new values from the data collector config file.
Resolution:
With this fix, the data collector restart script now clears the Apache Karaf cache to ensure new values are read correctly.
(22.2.4, DE548128)