Heap usage on Data Collectors (DCs) climbs above 30% and stays there, causing data gaps and false alarms that generate SNOW tickets. The only quick fix is to restart dcmd and ActiveMQ on the affected DCs, after which heap usage drops back down. We need to determine why the DCs get stuck above 30%.
The environment is running DX NetOps Performance Management release 22.2.8.
The issue appears to occur on different DCs at different times.
The System Health Dashboards for Data Collectors show excessive heap usage: the DCs are using more heap memory than they should. As the problem grows, the number of "dropped poll request" messages in the karaf.log file on the affected DC increases. Once those messages begin appearing, the data gaps spread until no data is seen from the DC. So far, only stopping and restarting the dcmd service has cleared the issue.
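To see when the "dropped poll request" messages start and how quickly they grow on an affected DC, the karaf.log entries can be counted per hour. The following is a minimal sketch, not a supported tool. It assumes each log line begins with an ISO-like timestamp (for example `2023-09-07T10:15:30,123`), which may not match your log layout, and the function names are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: each log line starts with an ISO-like timestamp such as
# "2023-09-07T10:15:30,123" or "2023-09-07 10:15:30,123".
# Adjust the pattern if your karaf.log uses a different layout.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})[T ](\d{2}):\d{2}:\d{2}")

# Message text quoted in the symptom description; matched case-insensitively.
MESSAGE = "dropped poll request"


def count_dropped_polls(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of matching lines."""
    counts = Counter()
    for line in lines:
        if MESSAGE not in line.lower():
            continue
        match = TIMESTAMP_RE.match(line)
        # Lines without a recognizable timestamp are grouped under "unknown".
        bucket = f"{match.group(1)} {match.group(2)}" if match else "unknown"
        counts[bucket] += 1
    return counts


def count_dropped_polls_in_file(path):
    """Count matching lines per hour in the log file at the given path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_dropped_polls(handle)


if __name__ == "__main__":
    # Usage: pass the path to a copy of karaf.log as the first argument.
    for bucket, count in sorted(count_dropped_polls_in_file(sys.argv[1]).items()):
        print(f"{bucket}:00  {count}")
```

Run against a copy of karaf.log from an affected DC, the hourly counts can be lined up with the heap usage graph on the System Health Dashboard to check whether the messages begin after heap usage crosses 30%.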
All supported DX NetOps Performance Management releases
The specific versions in which this issue was introduced are still pending as of September 07, 2023.
To be determined. Pending input from engineering while the issue is under investigation via engineering defect DE577202.
Pending details from engineering regarding a cause. Current as of September 07, 2023.