We are seeing Management Agent Lost alarms in Spectrum.
How do we troubleshoot the root cause?
Release: Any
Component: SpectroSERVER
Management Agent Lost alarms are asserted when Spectrum loses SNMP contact with a model.
By default, when Spectrum sends an SNMP request to a model, it waits 3 seconds for a response. If no response is received within 3 seconds, Spectrum resends the SNMP request. By default, Spectrum resends the request up to 3 times, waiting 3 seconds between retries.
If no response is received to any of the retries, Spectrum pings the model. If a ping response is received, Spectrum asserts the Management Agent Lost alarm, which indicates that the device is still responding to ping but not to SNMP.
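The default polling behavior described above can be sketched in a few lines of Python. This is only an illustration of the sequence, not Spectrum's actual implementation: the function names, the `snmp_get` and `ping` callables, and the returned labels are all hypothetical. The outcome when the ping also fails is not covered in this article, so the sketch returns a placeholder label for that case.

```python
from typing import Callable

SNMP_TIMEOUT_SECONDS = 3  # default wait per request, as described above
SNMP_RETRIES = 3          # default number of resends, as described above


def worst_case_wait(timeout: float = SNMP_TIMEOUT_SECONDS,
                    retries: int = SNMP_RETRIES) -> float:
    """Seconds spent waiting if the initial request and every retry time out."""
    return timeout * (1 + retries)


def poll_model(snmp_get: Callable[[float], bool],
               ping: Callable[[], bool],
               timeout: float = SNMP_TIMEOUT_SECONDS,
               retries: int = SNMP_RETRIES) -> str:
    """Walk through the SNMP-then-ping sequence for one model.

    snmp_get(timeout) is a hypothetical callable that returns True when an
    SNMP response arrives within `timeout` seconds; ping() is a hypothetical
    callable that returns True when the device answers a ping.
    """
    # One initial request plus the configured number of retries.
    for _ in range(1 + retries):
        if snmp_get(timeout):
            return "SNMP_OK"
    # Every SNMP attempt timed out: fall back to a ping.
    if ping():
        return "MANAGEMENT_AGENT_LOST"
    # Placeholder label: this case is outside the scope of this article.
    return "NO_SNMP_OR_PING_RESPONSE"
```

With the defaults, `worst_case_wait()` gives 12 seconds (one initial request plus three retries, 3 seconds each) before Spectrum falls back to the ping.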
Some common root causes:
- Spectrum is not using the correct SNMP Community String for this model
- There is an access list on this model that prevents Spectrum from reaching it over SNMP
- There is network latency: the device is responding to SNMP, but the response takes longer than 3 seconds to reach Spectrum
- The device is taking longer than 3 seconds to respond to the SNMP request
- The device is too busy to respond to SNMP
The BEST tool for troubleshooting a Management Agent Lost alarm is a sniffer (Wireshark on Windows, tcpdump on Linux). Use it to capture the SNMP requests leaving the SpectroSERVER for the device and to see whether, and how, the device responds.
Then, based on the analysis of the sniffer trace, take the appropriate corrective action.
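As a starting point for the capture on a Linux SpectroSERVER, the following Python sketch builds a tcpdump command line limited to SNMP traffic for one device. It rests on a few assumptions: the device is polled on the standard SNMP port, UDP 161 (adjust `snmp_port` if the model uses a different one), and the helper name, the output file name, and the example IP address are hypothetical. The command is only built here, not executed; running tcpdump normally requires root privileges.

```python
import shlex
from typing import List


def build_tcpdump_command(device_ip: str,
                          pcap_path: str = "mgmt_agent_lost.pcap",
                          interface: str = "any",
                          snmp_port: int = 161) -> List[str]:
    """Return a tcpdump argument list that captures SNMP traffic for one device.

    -i  : interface to capture on ("any" captures on all interfaces on Linux)
    -nn : do not resolve host names or port names
    -w  : write raw packets to a file that Wireshark can open later
    """
    return [
        "tcpdump", "-i", interface, "-nn", "-w", pcap_path,
        # Capture filter: only UDP traffic on the SNMP port to or from the device.
        "host", device_ip, "and", "udp", "port", str(snmp_port),
    ]


# Example with a documentation-range address (hypothetical device):
command_line = shlex.join(build_tcpdump_command("192.0.2.10"))
```

In the resulting trace, compare each SNMP request with its response. Requests with no response at all are consistent with a wrong community string, an access list, or a device too busy to answer, while responses that arrive more than 3 seconds after the request point to network latency or a slow agent.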