Two of our WILY agents are crashing.
The ABEND is caused by an out-of-memory condition.
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2020/12/02
Release : 10.5
Component : Cross-Enterprise Application Performance Management
The log shows that the core APM Isengard is running out of memory while transmitting either metrics or traces to the EM.
The customer needs to either reduce the metric load or increase the heap size.
Out-of-memory errors generally happen because a large number of queues produces too many MQ metrics.
Normally the Cross-Enterprise_APM_Dynamic.properties file is updated so that fewer queue managers and/or queues are monitored.
The configuration properties that control this are:
SYSVIEW.MQ.QMs.regex=*
SYSVIEW.MQ.Queues.regex=*
SYSVIEW.MQ.Alerts.QMs.regex=*
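As a rough illustration of how a narrower pattern reduces the number of monitored queues, the Python sketch below filters a list of names with a regular expression. The queue names and the pattern PAY\..* are hypothetical, and it is an assumption that the agent matches the pattern against the whole name; check the product documentation for the exact matching rules. The sketch uses .* as the match-everything pattern because that is the form Python's re module accepts.

```python
import re

def select_monitored(names, pattern):
    """Return the names that fully match the given regular expression."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

# Hypothetical queue names, for illustration only.
queues = ["PAY.REQUEST", "PAY.REPLY", "HR.EVENTS", "SYSTEM.DEAD.LETTER.QUEUE"]

# A match-everything pattern keeps all four queues.
print(select_monitored(queues, r".*"))

# A narrower (hypothetical) pattern keeps only the PAY queues.
print(select_monitored(queues, r"PAY\..*"))
```

Trying a candidate pattern against a list of real queue names in this way shows how many queues would still be monitored before the properties file is changed.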
Because the problem is intermittent and the customer is only on the edge of running short of memory, simply increasing the heap space may be enough.
The heap space is controlled by the STDENV.JCL member.
Increase the -Xmx option in STDENV.JCL to a higher value. For example, change:
JVM_OPTS="-Xmx512m -Xms256m"
to:
JVM_OPTS="-Xmx1024m -Xms256m"
The XAPI timeout errors are happening because the customer has the update interval set to 15 seconds,
so the XAPI timeout is also being set to 15 seconds.
The timeout errors can be reduced by setting the property below to a larger value, such as 60 seconds:
SYSVIEW.xapi.timeout.interval=60
It may be that the 15-second interval causes metrics to pile up faster than they are delivered to the EM.
Try setting SYSVIEW.update.interval to a larger value.
For example:
SYSVIEW.update.interval=60
This would cut the metrics traffic between SYSVIEW and CEAPM, and between CEAPM and the EM, to one quarter of its current volume.
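The one-quarter figure follows from the number of update cycles per hour. The short Python sketch below works through the arithmetic, assuming that each update cycle sends roughly the same number of metrics regardless of the interval; the function name is illustrative only.

```python
def updates_per_hour(interval_seconds):
    """Number of update cycles that fit into one hour."""
    return 3600 // interval_seconds

current = updates_per_hour(15)   # cycles per hour at a 15-second interval
proposed = updates_per_hour(60)  # cycles per hour at a 60-second interval

# Ratio of proposed to current traffic, assuming equal metrics per cycle.
print(current, proposed, proposed / current)
```

At 15 seconds there are 240 cycles per hour and at 60 seconds there are 60, which gives the ratio of 0.25.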