
Frequent data interruptions on the CDV dashboard


Article ID: 258242



Products

DX Application Performance Management

Issue/Introduction

The APM application shows frequent data interruptions on the CDV dashboard.

 

Environment

Release : 10.7.0

Resolution

The memory settings in EMService.conf were not correct. They were still at the defaults of 512 and 1024, so the Enterprise Managers were resource-starved.

 

Additional Information

What was done:
-Reviewed the issue. The graphs were not displaying as expected when compared with a working graph at a different scale.
-Checked the Status Monitor, which showed clamping. There were many slow collector messages, which were overwhelming the environment.
-Checked the perflog, which showed low heap. OOM heap dumps were also being generated.
-The cause was that the heap values of 12388 MB on the MOM and 8912 MB on the collector had been set in the lax file instead of EMService.conf. This is covered in
https://knowledge.broadcom.com/external/article/93176/apm-introscope-enterprise-manager-troub.html
-Once the values were set in EMService.conf and the cluster was restarted, the metrics appeared like those in the other graph and the collectors were no longer clamping.
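To confirm whether a given EMService.conf is still running with the default heap, a small check like the following may help. This is a minimal sketch, not a Broadcom tool. It assumes the heap is controlled by the Java Service Wrapper properties `wrapper.java.initmemory` and `wrapper.java.maxmemory` (values in MB), which the article does not state; verify the property names against your own file. The helper names are hypothetical.

```python
def read_wrapper_heap(conf_text):
    """Return (init_mb, max_mb) parsed from wrapper-style conf text.

    Assumption: heap is set via wrapper.java.initmemory / wrapper.java.maxmemory.
    A value is None when the property is absent or not a plain integer.
    """
    values = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not key=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()

    def as_int(key):
        v = values.get(key)
        return int(v) if v is not None and v.isdigit() else None

    return as_int("wrapper.java.initmemory"), as_int("wrapper.java.maxmemory")


def heap_looks_default(conf_text, default_init=512, default_max=1024):
    """True when both heap values still match the defaults cited in this article."""
    return read_wrapper_heap(conf_text) == (default_init, default_max)
```

Running `heap_looks_default(open("EMService.conf").read())` on each MOM and collector would flag any node where the larger heap was applied only in the lax file.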

However, more needs to be done:

1) Disable unneeded socket metrics
https://knowledge.broadcom.com/external/article/194043/agent-clamping-new-metrics-will-not-be.html
2) Disable CEM
https://knowledge.broadcom.com/external/article/253983/need-hotfix-84-to-perform-108-upgrade.html

First, disable TESS in the properties file:
introscope.enterprisemanager.tess.enabled=false
Make this change on all EMs and restart them afterwards.
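Because the same change has to be made on every EM, it can be convenient to script it. The sketch below is one hedged way to do that: it sets a key in properties-style text, replacing an existing uncommented entry or appending the key if it is missing. The function name `set_property` is hypothetical, and the sketch does not handle line continuations or `:` separators, which Java properties files also allow.

```python
def set_property(text, key, value):
    """Return properties text with key set to value.

    Replaces every uncommented 'key=...' line; appends the key if none exists.
    Commented-out entries (lines starting with '#') are left untouched.
    """
    lines = text.splitlines()
    found = False
    for i, raw in enumerate(lines):
        stripped = raw.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

A typical use would be to read the EM properties file, pass its contents through `set_property(contents, "introscope.enterprisemanager.tess.enabled", "false")`, write the result back after taking a backup, and then restart that EM.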

3) Investigate raising two clamps in apm-events-thresholds-config.xml.

</description>
<clamp id="introscope.enterprisemanager.agent.metrics.limit">
<description>
Most properties here put limits on # of metrics.
The last limits metric data. The metric clamping properties support hot config.
Per Agent limit. Takes into account live and historical metrics. 
....<clamp id="introscope.enterprisemanager.agent.error.limit">
<description>
Limits # of Error Events Per Interval
</description>
<threshold value="10"/>
apm-events-thresholds-config.xml
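Before raising a clamp, it helps to see what the current thresholds are. The following is a minimal sketch that lists each clamp id and its threshold value. It assumes the file uses `<clamp id="...">` elements with a `<threshold value="..."/>` child, as in the excerpt above, and that the elements carry no XML namespace. The `<clamps>` root element in the sample is a placeholder, not the real file's root, and `list_clamps` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def list_clamps(xml_text):
    """Map each clamp id to its threshold value (as a string), or None if absent."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter() walks the whole tree, so the nesting depth of <clamp> does not matter.
    for clamp in root.iter("clamp"):
        threshold = clamp.find("threshold")
        result[clamp.get("id")] = None if threshold is None else threshold.get("value")
    return result


# Sample input modeled on the excerpt above; the root element name is assumed.
SAMPLE = """<clamps>
  <clamp id="introscope.enterprisemanager.agent.error.limit">
    <description>Limits # of Error Events Per Interval</description>
    <threshold value="10"/>
  </clamp>
</clamps>"""

for clamp_id, value in list_clamps(SAMPLE).items():
    print(clamp_id, value)
```

Comparing this output across the MOM and the collectors shows whether the clamp values are consistent before any of them is changed.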

4) Upgrade to APM 10.7 HF 84 (open a support case to obtain it). PostgreSQL must be upgraded first.
You have 5 months to upgrade: https://knowledge.broadcom.com/external/article/132327/life-cycle-of-1070-sp3.html