False "DEVICE HAS STOPPED RESPONDING TO POLLS" seen on Cisco ASR 9K device models in Spectrum



Article ID: 270721



Products

DX NetOps CA Spectrum

Issue/Introduction

On the Cisco ASR 9K platform, the SNMP server service can occasionally crash. It restarts automatically after a crash, so this should not be an issue as long as the SNMP poller retries the request.
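
To illustrate why retries should mask a short agent restart, here is a minimal Python sketch. It is not Spectrum's poller: `poll_with_retries` and `make_flaky_agent` are hypothetical names, and the simulated agent simply drops a fixed number of requests before it answers again.

```python
from typing import Callable


def poll_with_retries(send_request: Callable[[], str], retries: int = 3) -> bool:
    """Return True if any attempt succeeds; False only after every attempt times out."""
    for _ in range(1 + retries):  # one initial attempt plus the configured retries
        try:
            send_request()
            return True
        except TimeoutError:
            continue  # agent may still be restarting; try again
    return False


def make_flaky_agent(failures_before_recovery: int) -> Callable[[], str]:
    """Simulate (hypothetically) an SNMP agent that drops the first N requests while it restarts."""
    state = {"remaining": failures_before_recovery}

    def send_request() -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TimeoutError("no response")
        return "response"

    return send_request


# The agent misses two requests, then recovers: the poll still succeeds.
recovered = poll_with_retries(make_flaky_agent(2), retries=3)
# The agent misses more requests than there are attempts: the poll fails.
still_down = poll_with_retries(make_flaky_agent(5), retries=3)
```

In this sketch a poll fails only when the outage outlasts every attempt, which is the behavior the paragraph above expects from a poller that retries.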

Because of these crashes, however, we see a large number of false-positive "DEVICE HAS STOPPED RESPONDING TO POLLS" alarms.

Previously we also saw a large number of "CHASSIS DOWN" and "BLADE STATUS UNKNOWN" alarms, but they disappeared after I disabled EnableEntityModuleModeling on these devices.

In every case, polling the device manually from Spectrum succeeds and the event clears. If we do not poll it manually, the event normally clears within one or two automatic polls.

I have tried to mitigate this by setting a high timeout value and a polling interval of 600. I am aware, however, that in addition to the scheduled polling cycle, Spectrum also polls the device in response to user clicks in OneClick and when it updates other information collected from the device.

We currently work around this with a 4-minute delay filter on all alarms from ASR 9K nodes, but this is an operational risk and we would like to eliminate these false positives entirely.
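
The delay-filter workaround can be sketched as a simple hold-down timer. This is an illustrative Python model only, not how Spectrum implements its filters; `DelayFilter`, its methods, and the device names are hypothetical, and the 240-second value is just the 4-minute delay mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

HOLD_SECONDS = 240  # the 4-minute delay, expressed in seconds


@dataclass
class DelayFilter:
    """Hold each alarm back and forward it only if it is still active after the hold period."""
    hold_seconds: float = HOLD_SECONDS
    pending: Dict[str, float] = field(default_factory=dict)  # device -> time the alarm was raised

    def raise_alarm(self, device: str, now: float) -> None:
        # Keep the earliest raise time if the alarm is already pending.
        self.pending.setdefault(device, now)

    def clear_alarm(self, device: str) -> None:
        # A clear during the hold period silently drops the alarm.
        self.pending.pop(device, None)

    def due(self, now: float) -> List[str]:
        """Return (and stop tracking) devices whose alarm has outlasted the hold period."""
        ready = sorted(d for d, t in self.pending.items() if now - t >= self.hold_seconds)
        for device in ready:
            del self.pending[device]
        return ready


f = DelayFilter()
f.raise_alarm("asr9k-1", now=0)   # false positive: clears on the next poll
f.raise_alarm("asr9k-2", now=0)   # real outage: never clears
f.clear_alarm("asr9k-1")
early = f.due(now=100)            # nothing is forwarded yet
late = f.due(now=240)             # only the real outage is forwarded
```

The sketch also shows the operational risk: the real outage on the second device is not reported until the full hold period has elapsed.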


Environment

Release: DX NetOps Spectrum 21.2.10
Component: Spectrum Alarms

Resolution

This issue is addressed in DX NetOps Spectrum 21.2.10 with the 21.02.10.D209 patch.

Additional Information

This issue is resolved in DX NetOps 22.2.0 and above.