Kahadb running the DA out of space

Article ID: 144807

Products

CA Infrastructure Management, CA Performance Management - Usage and Administration, DX NetOps

Issue/Introduction

The Data Aggregator (DA) is running out of disk space. Further investigation shows that the /opt/IMDataAggregator/data/broker/kahadb directory is consuming the space with a large number of db-*.log files.
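To confirm the symptom, the directory size and the number of journal files can be checked together. The following is a minimal sketch, not part of the product: the function name kahadb_usage is hypothetical, and the default path is the one named in this article.

```shell
#!/bin/sh
# Hypothetical helper: report the size of a KahaDB directory and how many
# db-*.log journal files it contains. Pass a path to override the default.
kahadb_usage() {
    dir="${1:-/opt/IMDataAggregator/data/broker/kahadb}"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Count journal files matching db-*.log at the top level of the directory.
    count=$(find "$dir" -maxdepth 1 -type f -name 'db-*.log' | wc -l | tr -d ' ')
    # Total size of the directory in kilobytes.
    size_kb=$(du -sk "$dir" | awk '{print $1}')
    echo "dir=$dir journal_files=$count size_kb=$size_kb"
}
```

Calling kahadb_usage with no argument inspects the default directory. A large journal_files count together with a large size_kb value matches the issue described here.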

Environment

Performance Management all versions

Cause

A possibly corrupt db-*.log file is preventing ActiveMQ from cleaning up the kahadb directory.

Resolution

To resolve the issue, do the following:

  1. Stop the activemq service:
    • Red Hat 7: systemctl stop activemq
    • Red Hat 6: service activemq stop
  2. Move the kahadb directory aside: mv /opt/IMDataAggregator/data/broker/kahadb /opt/IMDataAggregator/data/broker/kahadb.orig
  3. Start the activemq service and check its status:
    • Red Hat 7: systemctl start activemq / systemctl status activemq
    • Red Hat 6: service activemq start / service activemq status
  4. The kahadb directory is recreated on startup.
  5. After confirming that the activemq process is running (see step #3), remove the kahadb.orig directory by running rm -rf kahadb.orig from the /opt/IMDataAggregator/data/broker/ directory.
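The stop, move, and start steps above can be sketched as a small script. This is an illustration only: the names run, svc, and reset_kahadb, and the DRY_RUN variable, are hypothetical. It prints the commands by default instead of running them, and it deliberately leaves out the final rm -rf so that the removal stays a manual step after the service has been verified.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires root privileges).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Use systemctl when available (Red Hat 7), else the service command (Red Hat 6).
svc() {
    action="$1"
    if command -v systemctl >/dev/null 2>&1; then
        run systemctl "$action" activemq
    else
        run service activemq "$action"
    fi
}

# Stop ActiveMQ, move kahadb aside, start ActiveMQ, then show its status.
reset_kahadb() {
    broker_dir="${1:-/opt/IMDataAggregator/data/broker}"
    svc stop || return 1
    run mv "$broker_dir/kahadb" "$broker_dir/kahadb.orig" || return 1
    svc start || return 1
    svc status
}
```

Running reset_kahadb with DRY_RUN left at 1 shows the four commands that would be executed, which is a convenient way to review them before setting DRY_RUN=0.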

The disk space should now be reclaimed.

Additional Information

Also see the Solution section of the "Data Aggregator Disk Space is Decreasing" troubleshooting guide:

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/performance-management/23-3/troubleshooting/data-aggregator-disk-space-is-decreasing.html

Solution: Errors can cause orphaned rollup messages to build up in Apache ActiveMQ. These message files are large. Verify the presence of the messages and purge them if necessary.
 
According to Engineering, a file is most likely corrupted. The suggested fix is to stop ActiveMQ, move the kahadb directory aside, and then start ActiveMQ.

Engineering is not aware of a way to force ActiveMQ to find or repair the corrupted db-*.log file.

Port 8161 (http://DA_host:8161/admin/queues.jsp) no longer listens out of the box, for security reasons.

Instead, use the activemqstat script in the DA's scripts directory:
./activemqstat | awk '{print $1" queueSize="$2" Producer="$3" Consumer="$4" Enqueue="$5" Dequeue="$6" Forward="$7" MemoryUsage="$NF}' | grep ActiveMQ.DLQ
 
If queueSize is greater than 0, run the purgeOneQueue script:
./purgeOneQueue "ActiveMQ.DLQ"
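The check and the purge can be tied together so that the purge runs only when the dead-letter queue actually holds messages. The sketch below is an assumption-based illustration: the function names queue_size and dlq_needs_purge are hypothetical, and it assumes, as the awk command above implies, that the first column of the activemqstat output is the exact queue name and the second column is the queue size.

```shell
#!/bin/sh
# Read activemqstat output on stdin and print the size of the named queue.
# Assumption: column 1 is the queue name, column 2 is the queue size.
# Exits non-zero when the queue does not appear in the input.
queue_size() {
    awk -v q="$1" '$1 == q { print $2; found = 1 } END { if (!found) exit 1 }'
}

# Succeed only when ActiveMQ.DLQ is listed and its size is greater than 0.
dlq_needs_purge() {
    size=$(queue_size "ActiveMQ.DLQ") || return 2
    [ "$size" -gt 0 ]
}
```

From the scripts directory, the output of ./activemqstat can be piped into dlq_needs_purge, and the purgeOneQueue command shown above is then run only when the function succeeds.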

However, the preferred approach is to restart ActiveMQ. Engineering suggests that the best time to restart is between hh:45 and hh:00, when the rollup queues have already been read.

For example, the best time to restart ActiveMQ is from 14:46 (after hh:45) to 14:59 (before hh:00).
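A restart script can guard itself with a check against this window. The following is a minimal sketch with a hypothetical function name, in_restart_window. It treats minutes 46 through 59 of any hour as the allowed window, following the example above, and it accepts an optional minute argument so that it can be tried without waiting for the clock.

```shell
#!/bin/sh
# Succeed when the given minute (default: the current minute) is in 46..59,
# that is, after hh:45 and before the next hh:00.
in_restart_window() {
    minute="${1:-$(date +%M)}"
    [ "$minute" -ge 46 ] && [ "$minute" -le 59 ]
}
```

For example, a wrapper could call in_restart_window first and restart ActiveMQ only when it succeeds, otherwise printing a message and exiting.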