Why does the Spectrum Report Manager (SRM) availability report show device downtime (unavailability) that does not match the poll interval in SPECTRUM?

Article ID: 51805

Products

Spectrum Network Observability

Issue/Introduction

Why does the Spectrum Report Manager (SRM) availability report show device downtime (unavailability) that does not match the poll interval in SPECTRUM?

For example, an availability report shows a device as unavailable for 30 seconds, but the poll interval on the device is 300 seconds (5 minutes).

How does SPECTRUM determine if a device is available or not for availability reports?

Environment

DX NetOps Spectrum all currently supported releases

Resolution

SPECTRUM determines whether a device is down based on its ability to maintain communication with the device. SNMP is the primary means of communication, with ICMP as the backup means of checking whether a device is available. If SPECTRUM does not receive responses from a device, it reports the device as down. When contact is regained, the device is reported as back up.

Device connectivity or availability is determined by polling the device using the SNMP (for example, snmpget) and ICMP (for example, ping) protocols.

The polling method will use a combination of the SNMP Communication Attributes. You can tune SPECTRUM's overall SNMP communications by changing the values of the attributes in the Attribute Editor's SNMP Communication folder. The following attributes define how SPECTRUM communicates with a device:

Community String
Lets the SpectroSERVER communicate with devices on your network.

DCM Timeout (ms)
The number of milliseconds the polling agent will wait for a response from the device before timing out.

DCM Retry Count
Specifies the number of times the SpectroSERVER retries to establish device communication after the DCM timeout value expires.

Polling Interval (sec)
The number of seconds between polls SPECTRUM makes to devices.

Note: Increasing this number results in less SNMP-related traffic on your network and a smaller load on the SpectroSERVER. Decreasing this number for mission critical devices and interfaces lets you see updated information about these devices in OneClick more often. This can improve your ability to see potential issues on the network before they affect network performance. A decreased Polling Interval will result in more SNMP network traffic generated by SPECTRUM.

In its basic form, the sequence is as follows. If the first poll is unsuccessful, the DCM values are used for retries. If all of the SNMP retries fail, a ping is attempted. If the ping succeeds, the resulting event and alarm indicate that the management agent is lost. If the ping fails, the resulting event and alarm indicate that the device has lost contact.
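The basic sequence above can be sketched in a few lines of Python. This is only an illustration of the decision order described here, not SpectroSERVER code: the function name, the callables `snmp_poll` and `ping`, the returned strings, and the reading of the retry count as "one initial attempt plus N retries" are all assumptions made for the example.

```python
from typing import Callable


def classify_poll(snmp_poll: Callable[[], bool],
                  ping: Callable[[], bool],
                  retry_count: int) -> str:
    """Return 'up', 'management agent lost', or 'device contact lost'.

    snmp_poll and ping are hypothetical callables that return True when
    the device answers within the timeout.
    """
    # Assumed semantics: one initial SNMP attempt plus retry_count retries.
    for _ in range(1 + retry_count):
        if snmp_poll():
            return "up"
    # Every SNMP attempt failed, so fall back to ICMP.
    if ping():
        return "management agent lost"
    return "device contact lost"


# SNMP never answers but ping does: the agent is lost, the device is reachable.
print(classify_poll(lambda: False, lambda: True, retry_count=3))
```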

Spectrum fault isolation can change the picture somewhat depending on the topology mapping, port fault correlation and live pipes. Once a device is verified as down, Spectrum will proactively poll adjacent devices without waiting for the Poll Interval. So on average, an outage of 5 connected network devices should be detected in roughly (Poll Interval / 5). The reverse is true as well once the fault is resolved and the devices come back up.
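As a rough worked example of the estimate above, using the 300-second poll interval from the issue description: the helper below simply applies the (Poll Interval / number of connected devices) rule of thumb. The function name is hypothetical, and the result is an approximation, not a guaranteed detection time.

```python
def rough_detection_seconds(poll_interval_s: float, connected_devices: int) -> float:
    """Apply the rule of thumb: poll interval divided by connected devices."""
    if connected_devices < 1:
        raise ValueError("connected_devices must be at least 1")
    return poll_interval_s / connected_devices


# Five connected devices, each polled every 300 seconds.
print(rough_detection_seconds(300, 5))  # 60.0
```

Under this estimate, an outage can be recorded well before a full poll interval has elapsed on any single device.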

If another tool detects outages based only on the poll interval, its results will differ, because that tool can only report outages in multiples of the device's poll interval.

Spectrum Report Manager (SRM) uses the events generated by the SpectroSERVER to determine availability and outages that are displayed in the reports. 
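Because outages are derived from event timestamps rather than from poll counts, an outage can be shorter than the poll interval. The sketch below pairs "down" and "up" events to compute outage durations. The event tuples and the function name are hypothetical and do not reflect the actual SRM schema or event codes.

```python
from datetime import datetime


def outage_seconds(events):
    """Compute outage durations from (ISO timestamp, state) pairs.

    events must be sorted by time; state is 'down' or 'up'.
    """
    outages = []
    down_since = None
    for stamp, state in events:
        when = datetime.fromisoformat(stamp)
        if state == "down" and down_since is None:
            down_since = when  # outage starts at the first 'down' event
        elif state == "up" and down_since is not None:
            outages.append((when - down_since).total_seconds())
            down_since = None  # outage closed by the matching 'up' event
    return outages


# Contact lost and regained 30 seconds apart, regardless of a 300-second poll interval.
events = [("2024-01-01T10:00:00", "down"), ("2024-01-01T10:00:30", "up")]
print(outage_seconds(events))  # [30.0]
```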

Additional Information

This is discussed in further detail in the following section of the documentation:

TechDocs : DX NetOps Spectrum 24.3 : Modeling and Managing Your IT Infrastructure Administrator