Management Agent Lost alarms are being generated on devices even though the devices are actually up.
Release: 10.3.x, 10.4.x
Component: SPCAEM
There are two possible causes: either the agent is unable to process the request because the request id is too high, or the SNMP agent on the device is overloaded and is dropping SNMP packets.
You can set the maximum request id used by Spectrum for sending SNMP requests. Edit the $SPECROOT/SS/.vnmrc file and add the following:
max_snmp_request_id=16000000
We have found this to be a reasonable value for most installations, but you may need to adjust it. Keep in mind that if your SpectroSERVER generates a lot of SNMP requests, setting this value too low may cause SpectroSERVER performance problems.
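To confirm that the setting is actually present after editing the file, a small check such as the following may help. This is only a sketch: the helper name read_vnmrc_setting is hypothetical, it assumes the file consists of simple key=value lines like the one shown above, and treating lines that start with # as comments is an assumption rather than documented behavior. The second setting in the sample text is a made-up placeholder.

```python
def read_vnmrc_setting(vnmrc_text, key):
    """Return the value of `key` from key=value lines, or None if it is absent."""
    value = None
    for line in vnmrc_text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' carry no settings.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, raw = line.partition("=")
        if name.strip() == key:
            value = raw.strip()  # in this sketch the last occurrence wins
    return value


# Stand-in for the contents of $SPECROOT/SS/.vnmrc (placeholder_setting is made up).
sample_vnmrc = """\
placeholder_setting=1
max_snmp_request_id=16000000
"""

configured = read_vnmrc_setting(sample_vnmrc, "max_snmp_request_id")
print(configured)
```

In practice you would read the text from $SPECROOT/SS/.vnmrc instead of using the inline sample.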
To determine whether this issue applies to you, capture a sniffer trace and review it in Wireshark. Look at the request id of each SNMP request and check whether the device is failing to respond to some of them. Here is a sample of what you might see:
Notice the agent does not respond to the SNMP requests with a high request-id but it does respond to the SNMP request with a low request-id:
Notice here, the device is actually sending traps, but is not responding to the SNMP requests with a high request-id:
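If the trace is large, the comparison of requests and responses can be done programmatically. The sketch below assumes you have already exported the decoded SNMP PDUs from the capture into simple records. The Pdu record, its field names, and the sample request-id values are all hypothetical and are not part of Spectrum or Wireshark.

```python
from collections import namedtuple

# Hypothetical record for one decoded SNMP PDU taken from a capture export.
Pdu = namedtuple("Pdu", ["kind", "request_id"])


def unanswered_request_ids(pdus):
    """Return request-ids that were sent but never answered, in ascending order."""
    requested = {p.request_id for p in pdus if p.kind == "request"}
    answered = {p.request_id for p in pdus if p.kind == "response"}
    return sorted(requested - answered)


# Made-up sample: low request-ids are answered, high request-ids are not.
sample = [
    Pdu("request", 1200), Pdu("response", 1200),
    Pdu("request", 1834201777),
    Pdu("request", 1834201778),
    Pdu("request", 1201), Pdu("response", 1201),
]

missing = unanswered_request_ids(sample)
print(missing)
```

If the unanswered request-ids are consistently the high ones while low ones are answered, that pattern points toward the request id limit rather than general packet loss.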
This may also be due to the SNMP agent's queue being set too low. Try increasing the agent's SNMP queue. We have seen SNMP packets dropped when an agent's queue is set lower than 25. To check the SNMP counters and queue usage on the device, run:
show snmp
Example output:
584 SNMP packets output
0 Too big errors (Maximum packet size 1480)
2 No such name errors
0 Bad values errors
0 General errors
258 Response PDUs
326 Trap PDUs
SNMP logging: enabled
Logging to <IP_Address>, 0/25, 309 sent, 17 dropped.
In this example, the 0/25 in the last line means that no SNMP traps are currently queued for transmission, and that the queue can hold up to 25 messages at once. To verify whether your queue is too small, watch the number of dropped SNMP traps and see whether it grows rapidly over time. In this case, the Trap PDUs line tells you that the router has tried to send 326 traps. Of these, the last line tells you that it has successfully sent 309 and dropped 17. This is a drop rate of roughly five percent (17 of 326), which suggests that the queue depth should probably be increased.
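The drop-rate arithmetic above can be sketched as a small script, which is convenient when you collect the output repeatedly to see whether the dropped count grows. The function name trap_drop_rate is hypothetical, and the regular expression assumes the "N sent, M dropped" wording shown in the example output; other devices or software versions may format this line differently.

```python
import re


def trap_drop_rate(show_snmp_output):
    """Return (sent, dropped, drop_rate) from the 'sent/dropped' counters."""
    match = re.search(r"(\d+)\s+sent,\s+(\d+)\s+dropped", show_snmp_output)
    if match is None:
        raise ValueError("no 'sent/dropped' counters found in the output")
    sent, dropped = int(match.group(1)), int(match.group(2))
    total = sent + dropped
    # Guard against division by zero when no traps have been attempted yet.
    rate = dropped / total if total else 0.0
    return sent, dropped, rate


# The relevant lines from the example output above.
sample_output = """\
326 Trap PDUs
SNMP logging: enabled
Logging to <IP_Address>, 0/25, 309 sent, 17 dropped.
"""

sent, dropped, rate = trap_drop_rate(sample_output)
print(f"{dropped} of {sent + dropped} traps dropped ({rate:.1%})")
# prints "17 of 326 traps dropped (5.2%)"
```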