
Majority of Performance Management device polls suddenly time out


Article ID: 252539


Updated On:


CA Performance Management - Usage and Administration DX NetOps


We believe the Data Collector crashed or something else happened, causing Spectrum to be flooded with "device polling statistics threshold violation alarms". 

When we checked the "calculated metrics per second" graph in the DX NetOps Portal, we did indeed see a serious drop.
At this point, we do not think it is an issue on the customer's network, because polling returned to normal only after the dcmd/activemq service was stopped and started.



Release : Any supported release


The logs indicate that SNMP polls are sent and then time out:

2022-10-12T12:20:00,002 | WARN  |  300000-thread-1 | ThrottleCounterManager           | or.common.ThrottleCounterManager  279 | 39 - - 21.2.12.RELEASE-457 |  | Polls not sent due to TIMEOUT: /10.x.x.x=3760
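
To see which devices are affected most, the timeout counts in these log lines can be totalled per device. The following is a minimal sketch, not a supported tool: the function name timeout_counts is hypothetical, and it assumes every matching line ends with "/<address>=<count>" as in the sample above. Pass it the relevant Data Collector log file(s), or pipe log lines into it.

```shell
# Hypothetical helper: total the "Polls not sent due to TIMEOUT" counts per device.
# Assumption: each matching log line ends with "/<address>=<count>", as in the sample.
# Usage: timeout_counts <log file>...   (reads stdin when no file is given)
timeout_counts() {
    # Keep only the address and the count from matching lines.
    sed -n 's|.*TIMEOUT: /\([^=]*\)=\([0-9][0-9]*\).*|\1 \2|p' "$@" |
        # Sum the counts for each address.
        awk '{ total[$1] += $2 } END { for (ip in total) print total[ip], ip }' |
        # Highest total first.
        sort -rn
}
```

Each output line is a total followed by the device address, so the first line points to a good candidate for a packet capture.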

The next step would be to collect some traffic with tcpdump to investigate what is actually happening on the wire, in particular whether the SNMP requests leave the Data Collector at all.

If you see this again, choose one affected device to focus on and collect traffic for at least a couple of poll cycles for further analysis:

tcpdump -envi any -w /tmp/device_poll_timing_out.pcap host <IP>
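
Once the capture exists, a first check is to compare how many SNMP requests went to the device with how many responses came back. The sketch below is only a rough aid under stated assumptions: the function name snmp_counts is hypothetical, the device is assumed to answer SNMP on the default UDP port 161, and tcpdump is assumed to print one line per packet when reading the file without -v. Because the capture was taken on "any" interface, the same packet may be counted more than once, so treat the numbers as indicative.

```shell
# Hypothetical helper: count SNMP requests to a device and responses from it.
# Assumptions: SNMP uses the default UDP port 161; one output line per packet.
# Usage: snmp_counts <pcap file> <device IP>
snmp_counts() {
    pcap=$1
    ip=$2
    # Packets addressed to the device's SNMP port (requests leaving the Data Collector).
    sent=$(tcpdump -nr "$pcap" "udp and dst host $ip and dst port 161" 2>/dev/null | wc -l)
    # Packets coming back from the device's SNMP port (responses).
    recv=$(tcpdump -nr "$pcap" "udp and src host $ip and src port 161" 2>/dev/null | wc -l)
    # Arithmetic expansion strips any padding that wc adds.
    echo "requests=$((sent)) responses=$((recv))"
}
```

For example, snmp_counts /tmp/device_poll_timing_out.pcap <IP> prints a single line with both counts. Zero requests would suggest that the polls never leave the Data Collector, while requests without responses point toward the device or the network path.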