I have been troubleshooting this for a while. At first I thought it was a WILYZOS issue, but it doesn't seem to be.
That is because WILYZOS does send metrics to the Enterprise Manager once it has been restarted.
The CICS regions in our environment are restarted daily at 03:30 am. Once they are back up, there doesn't seem to be a connection between SYSVIEW and WILYZOS, or between SYSVIEW and the Enterprise Manager, until WILYZOS is restarted.
As part of the CICS regions restart, SYSVIEW is also warm started. As a workaround, I have automation restart WILYZOS once the CICS regions and SYSVIEW are back up.
Additionally, I have checked the syslog and compared the startup time of the CICS region with that of the SYSVIEW address space.
There seems to be a discrepancy I can't explain: the SYSVIEW address space was last started on August 1, 2020 at 22.32.27, but the CICS region was restarted today at 04:00:31.
04.01.20 S0204558 +GSVC720I CA SYSVIEW for CICS request started
04.01.20 S0204558 +GSVC714I INITPARM parms: GSVI=GSVI,USERID=*,START=*,SSID=*
04.01.20 S0204558 +GSVC713I Starting transaction GSVI
04.01.20 S0204558 +GSVC720I CA SYSVIEW for CICS request complete
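When comparing startup times across address spaces, it can help to split syslog lines like the ones above into their fields. The following is a minimal sketch, assuming each line has the layout shown here (time as hh.mm.ss, job ID, an optional `+` prefix, message ID, message text). The regular expression and the names `SyslogLine` and `parse_line` are illustrative assumptions, not part of SYSVIEW or WILYZOS.

```python
import re
from typing import NamedTuple, Optional


class SyslogLine(NamedTuple):
    time: str    # hh.mm.ss as printed in the syslog
    job_id: str  # e.g. S0204558
    msg_id: str  # e.g. GSVC720I
    text: str    # remainder of the message


# Assumed layout: time, job ID, optional '+', message ID, free text.
LINE_RE = re.compile(
    r"^(\d{2}\.\d{2}\.\d{2})\s+(\S+)\s+\+?([A-Z]{3,}\d+[A-Z])\s+(.*)$"
)


def parse_line(line: str) -> Optional[SyslogLine]:
    """Return the parsed fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return SyslogLine(*match.groups()) if match else None


# The four lines quoted in this post.
sample = """\
04.01.20 S0204558 +GSVC720I CA SYSVIEW for CICS request started
04.01.20 S0204558 +GSVC714I INITPARM parms: GSVI=GSVI,USERID=*,START=*,SSID=*
04.01.20 S0204558 +GSVC713I Starting transaction GSVI
04.01.20 S0204558 +GSVC720I CA SYSVIEW for CICS request complete
"""

parsed = [p for p in map(parse_line, sample.splitlines()) if p is not None]
for entry in parsed:
    print(entry.time, entry.job_id, entry.msg_id, entry.text, sep=" | ")
```

With the fields separated, the timestamps for a given job ID can be lined up against the region restart time.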
What I would like to know is: Is it normal behaviour to have to restart WILYZOS after SYSVIEW is restarted, or am I possibly missing a configuration setting in SYSVIEW/WILYZOS?
Release : 10.7 z/OS
Component : APM Agents
In this situation, CEAPM was producing duplicate log messages. Two LPARs were found to be logging to the same log file, which caused confusion during troubleshooting. CEAPM was not designed to run multiple LPARs off a single installation.
The actual root cause was that too many metrics were coming through.
A reasonable clamp limit depends on what the user wishes to monitor. Please work with your Enterprise Manager team to decide which metrics they consider important and which they do not.
The short-term fix is to raise the clamp limit to 140,000 (double what you have set now), and then work out a longer-term solution by deciding which metrics to disable.
For example, one MQ user may want response times only and not care about other metrics. Those other metrics can be turned off in the pbd files on the agent side, so that user can lower their clamp limit.
Another user may want their MQ environment to produce a large number of metrics, so their clamp limit would be higher.
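As a rough illustration of the sizing reasoning above, the sketch below compares a total metric count against a clamp limit and lists the largest metric groups as candidates to disable. The 70,000 and 140,000 limits follow from the advice above (140,000 is double the current setting). The per-group counts and the function name `clamp_report` are hypothetical and only there to make the arithmetic concrete. Real counts would have to come from your own Enterprise Manager.

```python
from typing import Dict, List, Tuple


def clamp_report(
    counts: Dict[str, int], clamp_limit: int
) -> Tuple[int, int, List[Tuple[str, int]]]:
    """Return (total metrics, amount over the clamp, groups sorted largest first)."""
    total = sum(counts.values())
    over_by = max(0, total - clamp_limit)
    largest_first = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return total, over_by, largest_first


# Hypothetical metric counts per group, for illustration only.
counts = {
    "MQ queues": 60_000,
    "MQ channels": 25_000,
    "CICS transactions": 15_000,
}

CURRENT_LIMIT = 70_000             # implied by "double what you have set now"
RAISED_LIMIT = 2 * CURRENT_LIMIT   # 140,000, the short-term value

for limit in (CURRENT_LIMIT, RAISED_LIMIT):
    total, over_by, largest_first = clamp_report(counts, limit)
    status = f"over by {over_by}" if over_by else "within limit"
    print(f"clamp {limit}: {total} metrics, {status}")

# Largest groups are the first candidates to review with the EM team.
print(largest_first)
```

Raising the limit only buys headroom. Deciding which of the larger groups are actually needed is the longer-term step.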
Link to our sizing and performance guide.
Link to Enterprise Manager Workload Clamping sizing information.