Does the Data Collector (DC) keep running when the Data Aggregator (DA) is down, and if so, what does the DC do with the polled data in that situation?
Where does the DC store polled data when the DA is down?
All supported DX NetOps Performance Management releases
The Data Collector caches data when the Data Aggregator is unavailable and delivers the cached messages once the Data Aggregator is back online. This cache mechanism is implemented on the ActiveMQ broker.
The Data Collector sends messages to the Data Aggregator with a non-persistent delivery mode. The broker caches them in memory while they are not consumed, which means they do not survive an ActiveMQ broker restart.
Note that messages with a persistent delivery mode are always stored on disk, but the Data Collector does not use this delivery mode.
To avoid overflowing the broker JVM memory, the broker begins to write cached messages to disk, under the {activemq.data}/dc_broker_<id>/tmp_storage folder, once their memory usage reaches the configured limit.
The memory limit is defined in the <memoryUsage/> section of the {activemq.conf}/activemq.xml configuration file.
By default, the broker sets that memory limit to 70% of the broker JVM maximum heap.
The broker also limits the amount of data written to disk. That limit is set in the <tempUsage/> section of {activemq.conf}/activemq.xml and defaults to 200 GB.
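To check which limits are configured on a given system, the two sections can be read from the file with a short script. The following is a minimal sketch, not a supported tool: the embedded XML is a hypothetical excerpt that follows the usual Apache ActiveMQ layout (percentOfJvmHeap and limit attributes), and the helper names are made up for illustration. The activemq.xml shipped with your release may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt; a real activemq.xml contains many more elements.
SAMPLE = """
<beans xmlns="http://www.springframework.org/schema/beans">
  <broker xmlns="http://activemq.apache.org/schema/core">
    <systemUsage>
      <systemUsage>
        <memoryUsage>
          <memoryUsage percentOfJvmHeap="70"/>
        </memoryUsage>
        <tempUsage>
          <tempUsage limit="200 gb"/>
        </tempUsage>
      </systemUsage>
    </systemUsage>
  </broker>
</beans>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_usage_limits(xml_text):
    """Return the attributes of the memoryUsage and tempUsage elements."""
    limits = {}
    for elem in ET.fromstring(xml_text.strip()).iter():
        name = local_name(elem.tag)
        # The outer element of each pair carries no attributes; skip it.
        if name in ("memoryUsage", "tempUsage") and elem.attrib:
            limits[name] = dict(elem.attrib)
    return limits


limits = read_usage_limits(SAMPLE)
print(limits)
```

To run it against a real file, read the file contents into a string and pass that to read_usage_limits instead of SAMPLE.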
In parallel, the Data Collector monitors disk usage through statistics that it receives from the broker every 30 seconds.
The default disk cache size is 50% of the maximum Data Collector memory (IM_MAX_MEM). By default, this corresponds to 45 minutes, or 500K, of data when you use a 5-minute poll rate.
Once cache usage is within 10% of that limit (50% of IM_MAX_MEM), the Data Collector starts rolling data off the back of the queue.
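The sizing rule can be worked through as simple arithmetic. This is an illustrative sketch only: the IM_MAX_MEM value of 4096 MB is hypothetical, the function names are made up, and reading "within 10%" as "usage reaches 90% of the cache limit" is an assumption rather than something this article confirms.

```python
def default_disk_cache_mb(im_max_mem_mb):
    """Default disk cache size: 50% of the Data Collector maximum memory."""
    return im_max_mem_mb * 50 // 100


def rolloff_threshold_mb(im_max_mem_mb, margin_percent=10):
    """Usage level at which roll-off would start.

    ASSUMPTION: 'within 10%' is read as 90% of the cache limit.
    """
    cache_mb = default_disk_cache_mb(im_max_mem_mb)
    return cache_mb * (100 - margin_percent) // 100


im_max_mem_mb = 4096  # hypothetical IM_MAX_MEM, expressed in MB
print(default_disk_cache_mb(im_max_mem_mb))  # 2048
print(rolloff_threshold_mb(im_max_mem_mb))   # 1843
```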
Note the values will be different on your system. The DC uses disk space to cache poll messages (temporary messages).
Once the disk cache is full, the DC starts dropping the oldest polled messages and keeps the latest.
Example log message seen in the Data Collector karaf.log (/opt/IMDataCollector/apache-karaf/data/log/karaf.log) when this occurs:
DATE TIME | WARN | pool-15-thread-1 | PRQCleanupService | e.jms.health.PRQCleanupService$2 135 | 175 - com.ca.im.common.core.jms - X.X.X.RELEASE-XXX | | JMS Health: dropped 178895/178895 messages from PRQ (dropRate=10%, maxDiskUsage=2566M)
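When scanning karaf.log for these warnings, a small parser can pull the numbers out of each line. The sketch below is based only on the example message above. The field names (dropped, total, drop_rate_percent, max_disk_usage_m) are labels chosen for illustration, and the meaning of the second number in "178895/178895" is assumed, not documented here.

```python
import re

# Pattern derived from the single example line above; other releases
# may format the message differently.
DROP_RE = re.compile(
    r"JMS Health: dropped (?P<dropped>\d+)/(?P<total>\d+) messages from PRQ "
    r"\(dropRate=(?P<drop_rate_percent>\d+)%, "
    r"maxDiskUsage=(?P<max_disk_usage_m>\d+)M\)"
)


def parse_drop_line(line):
    """Return the numeric fields of a PRQ drop warning, or None if no match."""
    match = DROP_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


line = (
    "DATE TIME | WARN | pool-15-thread-1 | PRQCleanupService | "
    "JMS Health: dropped 178895/178895 messages from PRQ "
    "(dropRate=10%, maxDiskUsage=2566M)"
)
print(parse_drop_line(line))
```

Lines that do not contain the warning return None, so the function can be applied to every line of the log and the None results filtered out.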
Once the DA and DC reconnect, the DC sends the oldest messages first, so it may take some time before it has caught up and is showing live data.
Note that if ActiveMQ is restarted, the cached data is lost.
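The drop-oldest and oldest-first behavior described above can be modeled with a bounded first-in, first-out queue. This is a conceptual sketch, not the product's implementation: the PollCache class is hypothetical, and it limits the cache by message count, whereas the real cache is limited by disk usage.

```python
from collections import deque


class PollCache:
    """Toy model of the poll cache: bounded, drops oldest, drains oldest first."""

    def __init__(self, max_messages):
        self._queue = deque()
        self._max = max_messages
        self.dropped = 0

    def add(self, message):
        # When full, discard the oldest message to make room for the newest.
        if len(self._queue) >= self._max:
            self._queue.popleft()
            self.dropped += 1
        self._queue.append(message)

    def drain(self):
        # On reconnect, deliver whatever is left, oldest first.
        while self._queue:
            yield self._queue.popleft()


cache = PollCache(max_messages=3)
for poll in ["poll-1", "poll-2", "poll-3", "poll-4", "poll-5"]:
    cache.add(poll)

delivered = list(cache.drain())
print(delivered)      # ['poll-3', 'poll-4', 'poll-5']
print(cache.dropped)  # 2
```

In this model the two oldest polls are lost during the outage, and the remaining three are delivered in their original order, which is why live data appears only after the backlog has been sent.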
See the Modify the External ActiveMQ Memory Limit documentation topic for more information on configuring the ActiveMQ memoryUsage value.
KB Article: Configure Data Collector disk space allocation for polled data cache during DA outage
KB Article: How to confirm that the DC is collecting data