Performance Management releases r3.7.4 and older
ActiveMQ OutOfMemory (OOM) errors can cause processing problems in the ActiveMQ queues without the ActiveMQ service being restarted. Once any Data Aggregator (DA) irep queue in ActiveMQ for a Data Collector (DC) stops being processed by that DC, the DA no longer consumes from DADistIrepManager, which causes DCs to fail to restart.
The OOM errors that trigger the problem are recorded in the activemq.log files on the Data Collectors. The logs are in the /opt/IMDataCollector/broker/apache-activemq-<version>/data directory (default path).
Sample of the errors seen:
2019-12-08 07:17:32,408 | ERROR | Checkpoint failed | org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal Checkpoint Worker
java.lang.OutOfMemoryError: Java heap space
2019-12-08 07:17:32,408 | INFO | Ignoring no space left exception, java.io.IOException: Java heap space | org.apache.activemq.util.DefaultIOExceptionHandler | ActiveMQ Journal Checkpoint Worker
java.io.IOException: Java heap space
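To check whether a Data Collector has logged these errors, the log can be searched for the error string. The following is a minimal sketch, not part of the product; the function name count_oom_errors is hypothetical and the log file path is passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in an ActiveMQ log file that
# contain a java.lang.OutOfMemoryError entry.
# Usage: count_oom_errors /path/to/activemq.log
count_oom_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function from reporting failure.
    grep -c 'java.lang.OutOfMemoryError' "$1" || true
}
```

Call it with the full path to activemq.log in the data directory named above, substituting the installed ActiveMQ version for <version>. A result greater than 0 means the log contains OOM errors.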
The r3.7.5 release contains new code, via defect DE409714, which enables an ActiveMQ service restart automatically on Data Collectors when ActiveMQ records an OOM error.
To get the same behavior before upgrading to release r3.7.5 or newer, make the following changes manually.
On the Data Collector, edit the activemq script found in the /opt/IMDataCollector/scripts directory (default path).
There are two changes we need to make in that file.
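Before editing, it may be worth keeping a copy of the original script so the change can be reverted. This is a general precaution rather than a documented step; the function name backup_script and the .orig suffix are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: save a copy of a script next to the original
# before it is edited by hand.
# Usage: backup_script /opt/IMDataCollector/scripts/activemq
backup_script() {
    # cp -p preserves the file mode and timestamps of the original.
    cp -p "$1" "$1.orig"
}
```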
A. The first change is near the top of the file. By default, the line looks like this:
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS_MEMORY -Djava.util.logging.config.file=logging.properties -Djava.security.auth.login.config=$activemqhome/conf/login.config";export ACTIVEMQ_OPTS
After the recommended changes, it looks like this:
# Next line is new to enable AMQ restart on OOM re: DE409714 fixed in r3.7.5 and newer releases
ACTIVEMQ_OPTS_OOM="-XX:OnOutOfMemoryError='$dchome/scripts/activemq restart'"
# Original line
# ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS_MEMORY -Djava.util.logging.config.file=logging.properties -Djava.security.auth.login.config=$activemqhome/conf/login.config";export ACTIVEMQ_OPTS
# New line to enable AMQ restart on OOM re: DE409714 fixed in r3.7.5 and newer releases
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS_MEMORY $ACTIVEMQ_OPTS_OOM -Djava.util.logging.config.file=logging.properties -Djava.security.auth.login.config=$activemqhome/conf/login.config";export ACTIVEMQ_OPTS
B. The second change is in the section that begins with:
start() {
echo "Starting ActiveMQ"
In the ACTIVEMQ_OPTS line that follows the done statement, add the ACTIVEMQ_OPTS_OOM reference defined at the top of the file. After editing, the line should be:
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS_MEMORY $WILY_OPTS $ACTIVEMQ_OPTS_OOM -Djava.util.logging.config.file=logging.properties -Djava.security.auth.login.config=$activemqhome/conf/login.config";export ACTIVEMQ_OPTS
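Once both edits are in place, the script should mention ACTIVEMQ_OPTS_OOM on at least three lines: the definition from change A and the two ACTIVEMQ_OPTS lines that reference it. A minimal sketch of a sanity check follows; the function name check_oom_edits is hypothetical, and the check only counts references, it does not validate the script.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the edited script references
# ACTIVEMQ_OPTS_OOM on at least three lines (one definition plus the
# two ACTIVEMQ_OPTS lines described in changes A and B).
# Usage: check_oom_edits /opt/IMDataCollector/scripts/activemq
check_oom_edits() {
    # grep -c counts matching lines; "|| true" absorbs its non-zero
    # exit status when there are no matches.
    count=$(grep -c 'ACTIVEMQ_OPTS_OOM' "$1" || true)
    # An unreadable file leaves count empty, which is treated as 0.
    [ "${count:-0}" -ge 3 ]
}
```

For example, `check_oom_edits /opt/IMDataCollector/scripts/activemq && echo "edits present"` prints the message only when all three references are found.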
To make the changes take effect, restart the ActiveMQ (AMQ) service on the Data Collector.
After the AMQ service is restarted, the new process listing should contain "-XX:OnOutOfMemoryError=/opt/IMDataCollector/scripts/activemq restart", which indicates that the change was made successfully.
After that, if the DCs encounter further OOM errors, the AMQ service will be restarted automatically.
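The process listing check can be scripted. The sketch below assumes a listing such as the output of ps -ef is piped in on standard input; the function name has_oom_restart_option is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a process listing on stdin and succeed if any
# line carries the -XX:OnOutOfMemoryError JVM option.
# Usage: ps -ef | has_oom_restart_option && echo "OOM restart option is active"
has_oom_restart_option() {
    # -e marks the next argument as the pattern, which is needed because
    # the pattern itself begins with a dash; -q suppresses output.
    grep -q -e '-XX:OnOutOfMemoryError='
}
```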
Did the ActiveMQ service restart on my Data Aggregator? If the auto-restart is used, there are a few things that will show it.