We are seeing frequent data interruptions on the CDV dashboard in our APM application.
Release : 10.7.0
The customer's memory settings in EMService.conf were incorrect. They were left at the defaults of 512 and 1024, so the Enterprise Managers were resource-starved.
What was done:
-Reviewed the issue. The graphs were not displaying as expected (compared to a working one at a different scale).
-Looked at the Status Monitor, which showed clamping. There were many slow collector messages, which were wiping out the environment.
-Looked at the perflog, which showed low heap and OOM heap dumps being generated.
-The issue was that the heap sizes (12388MB on the MOM and 8912 on the collector) were set in the lax file instead of EMService.conf. This is covered in "APM 10.x - Troubleshooting and Best Practices".
-Once the values were set in EMService.conf and the cluster was restarted, the metrics appeared like the other graph and the collectors were no longer clamping.
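The heap check described above can be sketched as a small script. This is a minimal illustration, not a supported tool. It assumes EMService.conf follows the usual Java Service Wrapper key=value layout, with heap given either as `wrapper.java.initmemory` / `wrapper.java.maxmemory` (in MB) or as `-Xms` / `-Xmx` arguments under `wrapper.java.additional.N`. Which form a given installation uses is an assumption to verify against the actual file. The function names are hypothetical.

```python
import re

def heap_settings(conf_text):
    """Extract initial/max heap (in MB) from wrapper-style key=value text.

    Assumption: heap is configured either through wrapper.java.initmemory /
    wrapper.java.maxmemory or through -Xms/-Xmx values on
    wrapper.java.additional.N lines. Verify against the real EMService.conf.
    """
    found = {"initial_mb": None, "max_mb": None}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "wrapper.java.initmemory" and value.isdigit():
            found["initial_mb"] = int(value)
        elif key == "wrapper.java.maxmemory" and value.isdigit():
            found["max_mb"] = int(value)
        elif key.startswith("wrapper.java.additional."):
            m = re.fullmatch(r"-Xm([sx])(\d+)([mMgG])", value)
            if m:
                mb = int(m.group(2)) * (1024 if m.group(3) in "gG" else 1)
                found["initial_mb" if m.group(1) == "s" else "max_mb"] = mb
    return found

def is_starved(settings, expected_max_mb):
    """True when no max heap is configured or it is below the intended size."""
    return settings["max_mb"] is None or settings["max_mb"] < expected_max_mb

if __name__ == "__main__":
    # Defaults reported in this case (512 and 1024) versus the intended MOM heap.
    sample = "wrapper.java.initmemory=512\nwrapper.java.maxmemory=1024\n"
    print(heap_settings(sample), is_starved(heap_settings(sample), 12388))
```

Running a check like this on the MOM and on each collector after the change is a quick way to confirm that the intended values are in EMService.conf rather than only in the lax file.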
But there is a lot more that needs to be done:
1) Disable unneeded socket metrics
Agent Clamping - New metrics will not be accepted
2) Disable CEM
Need Hotfix 84 to perform 10.8 upgrade
The first step is to disable TESS in the properties file.
Do this on all EMs and restart them after the change.
3) Investigate raising two clamps in apm-events-thresholds-config.xml.
<clamp id="introscope.enterprisemanager.agent.metrics.limit">
Most properties here put limits on the number of metrics.
The last limits metric data. The metric clamping properties support hot config.
Per-agent limit. Takes into account live and historical metrics.
<clamp id="introscope.enterprisemanager.agent.error.limit">
Limits the number of error events per interval.
<threshold value="10"/>
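Before raising either clamp, it helps to record the current values. The sketch below lists clamp thresholds from XML shaped like the fragments above, that is, a `clamp` element with an `id` attribute and a child `threshold` element with a `value` attribute. The wrapping root element in the sample, the function names, and the metrics-limit value are placeholders for illustration only. Only the error-limit value of 10 comes from the fragment above. The real file may contain additional elements or a namespace, which the helper tolerates.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: root element name and the metrics-limit value are
# placeholders. Only the error-limit value (10) appears in the source text.
SAMPLE = """<clamps>
  <clamp id="introscope.enterprisemanager.agent.metrics.limit">
    <threshold value="12345"/>
  </clamp>
  <clamp id="introscope.enterprisemanager.agent.error.limit">
    <threshold value="10"/>
  </clamp>
</clamps>"""

def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def clamp_thresholds(xml_text):
    """Return {clamp id: threshold value} for every clamp with a threshold."""
    root = ET.fromstring(xml_text)
    result = {}
    for element in root.iter():
        if local_name(element.tag) != "clamp":
            continue
        for child in element:
            if local_name(child.tag) == "threshold":
                result[element.get("id")] = int(child.get("value"))
    return result

if __name__ == "__main__":
    for clamp_id, value in clamp_thresholds(SAMPLE).items():
        print(f"{clamp_id} = {value}")
```

To use it against a real installation, read apm-events-thresholds-config.xml into a string and pass it to `clamp_thresholds`. Keeping the output alongside the case notes makes it easy to compare values before and after the change.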