Does the Data Collector (DC) keep running when the Data Aggregator (DA) is down, and if so, what does the DC do with the polled data in that situation?
DX NetOps Performance Management, all versions
The Data Collector uses a disk cache and writes polled data to it when the DA is down or not responding due to network or other issues.
The default disk cache size is equal to 50% of the maximum Data Collector memory (IM_MAX_MEM). By default, this value is equal to 45 minutes, or 500K, of data when you use a 5-minute poll rate.
Once the cache size comes within 10% of that limit (50% of the maximum Data Collector memory), the DC starts rolling data off the back of the queue.
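The sizing rule above can be worked through with a few lines of arithmetic. This is only a sketch: it assumes "within 10%" means 90% of the cache limit, and the 4096 MB figure is a hypothetical IM_MAX_MEM value, not a product default.

```python
def cache_limits(im_max_mem_mb, cache_fraction=0.5, margin=0.10):
    """Return (cache_limit_mb, rolloff_threshold_mb).

    cache_fraction: share of IM_MAX_MEM used for the disk cache (50% per the article).
    margin: assumed meaning of "within 10%" -> roll-off begins at 90% of the limit.
    """
    cache_limit = im_max_mem_mb * cache_fraction
    rolloff_threshold = cache_limit * (1.0 - margin)
    return cache_limit, rolloff_threshold


# Hypothetical example: IM_MAX_MEM of 4096 MB.
limit_mb, threshold_mb = cache_limits(4096)
print(f"cache limit: {limit_mb:.1f} MB, roll-off starts near: {threshold_mb:.1f} MB")
```

Substitute the IM_MAX_MEM value from your own Data Collector to estimate where roll-off would begin on your system.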
Note the values will be different on your system. The DC uses disk space to cache poll messages (temporary messages).
Once the disk cache is full, it will start dropping the oldest polled messages, keeping the latest.
Example log message seen in the Data Collector karaf.log (/opt/IMDataCollector/apache-karaf/data/log/karaf.log) when this occurs:
DATE TIME | WARN | pool-15-thread-1 | PRQCleanupService | e.jms.health.PRQCleanupService$2 135 | 175 - com.ca.im.common.core.jms - X.X.X.RELEASE-XXX | | JMS Health: dropped 178895/178895 messages from PRQ (dropRate=10%, maxDiskUsage=2566M)
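To spot these drops automatically, the message can be matched with a regular expression. The sketch below is built only from the single sample line above, so the pattern is an assumption about the format rather than a documented contract. The field names are hypothetical, and the meaning of the two numbers in "178895/178895" is not stated here, so they are kept neutral.

```python
import re

# Pattern derived from the one sample message in this article (an assumption).
PRQ_DROP = re.compile(
    r"JMS Health: dropped (?P<numerator>\d+)/(?P<denominator>\d+) messages from PRQ "
    r"\(dropRate=(?P<drop_rate_pct>\d+)%, maxDiskUsage=(?P<max_disk_usage_m>\d+)M\)"
)


def parse_prq_drop(line):
    """Return the numeric fields of a PRQ drop message, or None if the line does not match."""
    match = PRQ_DROP.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


sample = (
    "DATE TIME | WARN | pool-15-thread-1 | PRQCleanupService | "
    "e.jms.health.PRQCleanupService$2 135 | 175 - com.ca.im.common.core.jms - "
    "X.X.X.RELEASE-XXX | | JMS Health: dropped 178895/178895 messages from PRQ "
    "(dropRate=10%, maxDiskUsage=2566M)"
)
print(parse_prq_drop(sample))
```

Running this over karaf.log line by line would give a quick count of how often the cache overflowed during an outage.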
Once the DA and DC reconnect, the DC starts sending the oldest cached messages first, so it may take some time before it has caught up and is showing live data.
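The drop-oldest and oldest-first replay behavior described above can be illustrated with a small in-memory model. This is not the Data Collector's implementation: the class name is hypothetical, and the cache is bounded by message count here, whereas the real cache is bounded by disk usage.

```python
from collections import deque


class PollCache:
    """Toy model of a bounded cache that keeps the newest messages."""

    def __init__(self, max_messages):
        self._queue = deque()
        self._max = max_messages
        self.dropped = 0

    def put(self, message):
        # When full, discard the oldest message to make room for the newest.
        if len(self._queue) >= self._max:
            self._queue.popleft()
            self.dropped += 1
        self._queue.append(message)

    def drain(self):
        # On reconnect, replay what is left, oldest first.
        while self._queue:
            yield self._queue.popleft()


cache = PollCache(max_messages=3)
for cycle in range(1, 6):          # five poll cycles while the DA is unreachable
    cache.put(f"poll-{cycle}")

print(list(cache.drain()))         # ['poll-3', 'poll-4', 'poll-5']
print(cache.dropped)               # 2
```

The model shows why live data appears last after a long outage: everything still in the cache is replayed in order before the newest polls are sent.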
To monitor the cache burndown, enable JMS health DEBUG logging in the Data Collector logging configuration.
In /opt/IMDataCollector/apache-karaf-2.4.3/etc/org.ops4j.pax.logging.cfg, add:
log4j.logger.com.ca.im.core.jms.health.JmsBrokerHealthAnalyser=DEBUG,sift
log4j.additivity.com.ca.im.core.jms.health.JmsBrokerHealthAnalyser=false
On installations that use /opt/IMDataCollector/apache-karaf/etc/org.ops4j.pax.logging.cfg instead, uncomment:
#
# JMS Health logging
#
log4j2.logger.JMSHealth.name = com.ca.im.core.jms.health
log4j2.logger.JMSHealth.level = DEBUG
log4j2.logger.JMSHealth.appenderRef.sift.ref = sift
This will create a log file named:
com.ca.im.common.core.jms.log
Under:
/opt/IMDataCollector/apache-karaf-*/data/log
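Once the log file exists, a short script can locate it and show its most recent entries. The sketch below uses the path pattern given above as its default. The helper names are hypothetical, and the script makes no assumptions about the format of the DEBUG lines: it only prints them.

```python
import glob
from collections import deque

# Path pattern taken from this article; adjust it if your install directory differs.
DEFAULT_PATTERN = "/opt/IMDataCollector/apache-karaf-*/data/log/com.ca.im.common.core.jms.log"


def find_jms_logs(pattern=DEFAULT_PATTERN):
    """Return all log files matching the pattern, sorted by path."""
    return sorted(glob.glob(pattern))


def last_lines(path, count=20):
    """Return the last `count` lines of a text file without loading it all into memory."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    for log_path in find_jms_logs():
        print(f"== {log_path}")
        for line in last_lines(log_path):
            print(line)
```

Running it periodically during recovery gives a rough view of whether the cache is still draining.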
Note that if ActiveMQ is restarted, the cached data is lost.