We are seeing frequent data interruptions on the CDV dashboard in our APM application.
Release : 10.7.0
The customer's memory settings in EMService.conf were not correct. They were still at the defaults of 512 and 1024, so the environment was resource-starved.
What was done:
-Reviewed the issue. The graphs were not displaying as expected (compared with a working graph at a different scale).
-Looked at Status Monitor, which showed clamping. There were many slow collector messages, which were wiping out the environment.
-Looked at the perflog log, which showed low heap and OOM heap dumps.
-The issue was that the heap values (12388MB on the MOM and 8912 on the collector) were set in the lax file instead of EMService.conf. This is covered in
APM 10.x - Troubleshooting and Best Practices
-Once the values were set and the cluster was restarted, the metrics appeared like the other graph and the collectors were no longer clamping.
But there is a lot more that needs to be done:
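The heap check from the steps above can be sketched as a small script. This is only a minimal sketch, assuming EMService.conf uses the standard Java Service Wrapper keys `wrapper.java.initmemory` and `wrapper.java.maxmemory` (values in MB). The key names, the function names and the sample text are assumptions for illustration and should be verified against the actual file.

```python
def read_wrapper_heap(conf_text):
    """Return (initmemory, maxmemory) in MB from EMService.conf-style text.

    Assumes the standard Java Service Wrapper keys; a missing or
    non-numeric key yields None for that position.
    """
    values = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines that are not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()

    def as_int(key):
        v = values.get(key)
        return int(v) if v is not None and v.isdigit() else None

    return as_int("wrapper.java.initmemory"), as_int("wrapper.java.maxmemory")


def looks_default(heap):
    # 512 and 1024 are the default values reported in this case.
    return heap == (512, 1024)


# Hypothetical excerpt, not copied from a real EMService.conf.
sample = """\
# Initial and maximum Java heap size (in MB)
wrapper.java.initmemory=512
wrapper.java.maxmemory=1024
"""

heap = read_wrapper_heap(sample)
print(heap, "still at defaults" if looks_default(heap) else "customized")
```

Running a check like this against each EM in the cluster would show whether the intended heap values actually reached EMService.conf or were only set in the lax file.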
1) Disable unneeded socket metrics
Agent Clamping - New metrics will not be accepted
2) Disable CEM
Need Hotfix 84 to perform 10.8 upgrade
The first step would be to disable TESS in the properties file.
introscope.enterprisemanager.tess.enabled=false
Do this on all EMs and restart them after the change.
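When the change has to be repeated on several EMs, the edit can be scripted. The following is a minimal sketch, assuming the properties file uses plain `key=value` lines with `#` comments. The function name and the sample text are hypothetical, and only the property name comes from the step above.

```python
TESS_KEY = "introscope.enterprisemanager.tess.enabled"


def disable_tess(properties_text):
    """Return the properties text with the TESS flag forced to false.

    An existing active assignment is rewritten; if none is found,
    the setting is appended at the end.
    """
    out = []
    found = False
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Match only an active (uncommented) assignment of the key.
        is_active = not stripped.startswith("#")
        if is_active and stripped.split("=", 1)[0].strip() == TESS_KEY:
            out.append(f"{TESS_KEY}=false")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{TESS_KEY}=false")
    return "\n".join(out) + "\n"


# Hypothetical file content for illustration only.
sample = "some.other.property=1\nintroscope.enterprisemanager.tess.enabled=true\n"
print(disable_tess(sample))
```

The sketch only transforms text. Backing up the file, writing the result, and restarting each EM are left as manual steps.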
3) Investigate raising two clamps in apm-events-thresholds-config.xml.
</description>
<clamp id="introscope.enterprisemanager.agent.metrics.limit">
<description>
Most properties here put limits on # of metrics.
The last limits metric data. The metric clamping properties support hot config.
Per Agent limit. Takes into account live and historical metrics.
....<clamp id="introscope.enterprisemanager.agent.error.limit">
<description>
Limits # of Error Events Per Interval
</description>
<threshold value="10"/>
apm-events-thresholds-config.xml
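Before raising either clamp, it can help to list the current threshold values. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, assuming each `clamp` element contains a `threshold` child with a `value` attribute and that the file uses no XML namespace. The `<clamps>` wrapper element, the closing tags and the `value="1"` on the metrics clamp are placeholders for illustration, not values taken from a real file.

```python
import xml.etree.ElementTree as ET


def read_clamps(xml_text):
    """Return {clamp id: threshold value} for every clamp with a threshold."""
    root = ET.fromstring(xml_text)
    clamps = {}
    for clamp in root.iter("clamp"):
        threshold = clamp.find("threshold")
        if threshold is not None:
            clamps[clamp.get("id")] = int(threshold.get("value"))
    return clamps


# Hypothetical, well-formed sample. The root element and the
# metrics-limit value are placeholders; only the error limit of 10
# appears in the excerpt above.
sample = """\
<clamps>
  <clamp id="introscope.enterprisemanager.agent.metrics.limit">
    <description>Per Agent limit. Takes into account live and historical metrics.</description>
    <threshold value="1"/>
  </clamp>
  <clamp id="introscope.enterprisemanager.agent.error.limit">
    <description>Limits # of Error Events Per Interval</description>
    <threshold value="10"/>
  </clamp>
</clamps>
"""

for clamp_id, value in read_clamps(sample).items():
    print(clamp_id, value)
```

Comparing this output before and after an edit confirms which of the two clamps changed.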