
Data Aggregator heap rises steadily with APM agent enabled


Article ID: 230541


Products

CA Performance Management - Usage and Administration DX NetOps

Issue/Introduction

Data Aggregator heap usage rises steadily when the APM agent is enabled in versions 21.2.2 through 21.2.7.

Environment

Release: 21.2.2 through 21.2.7

Component: IM Data Aggregator

Cause

Defect: DE523480

Resolution

In DX NetOps Performance Management 21.2.8, the autoprobe setting described in the workaround below is applied automatically:

Symptom: After upgrading to 21.2.2, Data Aggregator memory can grow over time when DX NetOps Performance Management is integrated with DX Application Performance Management (that is, an APM agent is configured). Weak references are held for a long time, which can lead to a large garbage collection and cause the Data Aggregator to pause.
Resolution: With this fix, the APM agent config for the data aggregator now disables the autoprobe. It still collects memory and CPU statistics for the data aggregator.
(21.2.8, DE523480, 32905170, 32957333, 32908672, 32979750)

In DX NetOps Performance Management 21.2.9, autoprobe is re-enabled, but socket tracing is disabled:

Symptom: After upgrading to 21.2.2 and enabling the APM agent, Data Aggregator memory can grow over time. Weak references are held for a long time, which can lead to a large garbage collection and cause the Data Aggregator to pause. The underlying cause is a bug in the APM agent's socket tracing.
Resolution: With this fix, the NetOps Portal and Data Aggregator APM agent configuration files have been updated to disable socket tracing until an updated APM agent is available.
(21.2.9, DE523480, 32905170, 32957333, 32908672, 32979750)
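
The version ranges above can be summarized in a short sketch. This is an illustration only, not part of the product: the function names and the returned strings are hypothetical, and treating releases later than 21.2.9 the same as 21.2.9 is an assumption that this article does not confirm.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted release string such as '21.2.5' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def heap_fix_status(version: str) -> str:
    """Map a Performance Management release to the action this article describes."""
    v = parse_version(version)
    if v < (21, 2, 2):
        # The article only describes the issue from 21.2.2 onward.
        return "not covered by this article"
    if v <= (21, 2, 7):
        return "apply the manual workaround (disable autoprobe)"
    if v == (21, 2, 8):
        return "fixed: autoprobe is disabled automatically"
    # 21.2.9 per the article; later releases are an assumption.
    return "fixed: socket tracing is disabled"
```

For example, `heap_fix_status("21.2.5")` reports that the manual workaround applies, while `heap_fix_status("21.2.8")` reports that autoprobe is already disabled by the release.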



===================================

For versions 21.2.2 through 21.2.7, the following workaround can be used.

It limits the APM agent to collecting memory and CPU information, which prevents the heap growth:


1) Edit the file:

/opt/IMDataAggregator/wily/core/config/IntroscopeAgent.DA.profile

Your path may vary if you did not install in the default location.

2) Change

introscope.autoprobe.enable=true

To:

introscope.autoprobe.enable=false

Note that if you have Fault Tolerant Data Aggregators, make this change on both Data Aggregators.

3) Restart the Data Aggregator:

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/performance-management/21-2/administrating/restart-ca-performance-management-component-services/restart-the-data-aggregator.html
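
Steps 1 and 2 above can also be scripted. The following is a minimal sketch, assuming the default install path from step 1 and a profile that contains an uncommented `introscope.autoprobe.enable` line. The function names and the `.bak` backup file are hypothetical choices for illustration and are not part of the product. The Data Aggregator restart in step 3 is still required afterwards.

```python
import re
from pathlib import Path

# Default location from step 1; adjust if the Data Aggregator is installed elsewhere.
PROFILE = Path("/opt/IMDataAggregator/wily/core/config/IntroscopeAgent.DA.profile")

# Matches an uncommented 'introscope.autoprobe.enable=<value>' line.
# Lines starting with '#' are left untouched.
_AUTOPROBE = re.compile(
    r"^([ \t]*introscope\.autoprobe\.enable[ \t]*=[ \t]*)\S*", re.MULTILINE
)


def disable_autoprobe(profile_text: str) -> str:
    """Return the profile text with introscope.autoprobe.enable set to false."""
    new_text, count = _AUTOPROBE.subn(r"\g<1>false", profile_text)
    if count == 0:
        raise ValueError("introscope.autoprobe.enable not found in profile")
    return new_text


def apply_to_file(path: Path = PROFILE) -> None:
    """Back up the profile next to the original, then rewrite it in place."""
    original = path.read_text()
    backup = path.with_name(path.name + ".bak")  # hypothetical backup name
    backup.write_text(original)
    path.write_text(disable_autoprobe(original))
```

Calling `apply_to_file()` as a user with write access to the profile makes the change. With Fault Tolerant Data Aggregators, run it on both hosts before restarting.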

Additional Information

After the fixes above you may still see a small increase in heap usage over time, followed by a drop.

This is because the APM agent in the Data Aggregator still holds weak references for some objects. Java reclaims this memory with a short garbage collection, but it may take many days before it does so.

Previously, after about 2 days the Data Aggregator would reach over 90% heap usage, and Java would run a very expensive garbage collection that caused a long application pause.