CE APM WILY agent JVM out of memory

Article ID: 204897

Products

Cross Enterprise Application Performance Management (APM)

Issue/Introduction

Two of our WILY agents are crashing.

The ABEND is caused by an out-of-memory condition:

JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2020/12/02

Environment

Release : 10.5

Component : Cross-Enterprise Application Performance Management

Cause

The log shows that the core APM Isengard component is running out of memory while transmitting either metrics or traces to the Enterprise Manager (EM).

Resolution

Either reduce the metric load or increase the JVM heap size.

Out-of-memory errors typically occur when a large number of queues generates too many MQ metrics.

Normally the Cross-Enterprise_APM_Dynamic.properties file is updated so that fewer queue managers and/or queues are monitored.
The configuration properties that control this are:
SYSVIEW.MQ.QMs.regex=*
SYSVIEW.MQ.Queues.regex=*
SYSVIEW.MQ.Alerts.QMs.regex=*
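
Before narrowing one of these patterns, it can help to check which names a candidate pattern would select. The following is a minimal Python sketch, not part of the product. The queue names are hypothetical, and it assumes the pattern must match the whole name. The agent's actual regex dialect and matching rules may differ, so confirm the syntax in the product documentation.

```python
import re


def select_names(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the pattern matches in full (assumed semantics)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Hypothetical queue names, for illustration only.
queues = ["PAY.REQUEST", "PAY.REPLY", "HR.REQUEST", "SYSTEM.DEAD.LETTER.QUEUE"]

# A match-everything pattern in standard regex syntax selects every queue.
everything = select_names(r".*", queues)

# A narrower pattern selects only the queues whose names start with "PAY.".
payments_only = select_names(r"PAY\..*", queues)

print(len(everything), "queues selected by the broad pattern")
print(len(payments_only), "queues selected by the narrow pattern:", payments_only)
```

Fewer selected queues means fewer MQ metrics for the agent to hold in memory and forward to the EM.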

Because the problem is intermittent, the agent is probably running close to its memory limit, so increasing the heap size alone may be enough.
The heap size is set in the STDENV.JCL member.
Increase the -Xmx value in STDENV.JCL. For example, change:
JVM_OPTS="-Xmx512m -Xms256m"

to:
JVM_OPTS="-Xmx1024m -Xms256m"

The XAPI timeout errors occur because the update interval is set to 15 seconds, so the XAPI timeout is also set to 15 seconds.
The timeout errors can be reduced by setting the following property to a larger value, such as 60 seconds:
SYSVIEW.xapi.timeout.interval=60

The 15-second interval may also be causing metrics to accumulate faster than they can be delivered to the EM.
Try setting SYSVIEW.update.interval to a larger value.
For example:
SYSVIEW.update.interval=60
This would reduce the metrics traffic between SYSVIEW and CE APM, and between CE APM and the EM, to about one quarter of its current volume.
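
The one-quarter figure follows from the number of collection cycles per hour. The short Python sketch below works through the arithmetic, assuming each cycle carries roughly the same number of metrics regardless of the interval. The function name is illustrative only.

```python
def cycles_per_hour(interval_seconds: int) -> float:
    """Number of collection cycles in one hour for a given update interval."""
    return 3600 / interval_seconds


current = cycles_per_hour(15)    # cycles per hour at the 15-second interval
proposed = cycles_per_hour(60)   # cycles per hour at the 60-second interval

# Ratio of proposed traffic to current traffic, assuming equal metrics per cycle.
ratio = proposed / current

print(f"{current:.0f} cycles/hour at 15s, {proposed:.0f} cycles/hour at 60s")
print(f"traffic ratio: {ratio:.2f}")
```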