How to configure Spectrum so that 'Unresolved Faults' generate on a device rather than the Fault Isolation Application Model
Article ID: 187467
Products
CA Spectrum
Issue/Introduction
When all devices in a Fault Domain become unreachable, the SpectroSERVER generates an Unresolved Fault alarm because it cannot pinpoint the exact cause of the outage.
A 'Fault Domain' is a group of connected devices. Ideally, every device would be modeled in the topology with every connection shown. In practice, circumstances beyond our control (for example, devices connected through MPLS or through another provider's network that we do not have access to discover) leave us with pockets of connected devices.
example: These 4 devices form a fault domain
When every device in this group goes down and becomes unreachable, the SpectroSERVER cannot pinpoint an exact cause.
The fault could be Sim35942 itself or an unmodeled device in the cloud. When this happens, all of the devices are suppressed and a Critical 'Unresolved Fault' alarm is generated that lists the devices in the Fault Domain.
By default, the alarm is created on the Fault Isolation Application Model.
example: All devices suppressed
The Critical 'Unresolved Fault' alarm is generated on the Fault Isolation Application Model.
This can be changed so that the Critical 'Unresolved Fault' alarm is created on one of the devices in the Fault Domain.
The device chosen is the one with the highest Criticality attribute value. You can manually select a preferred device by raising its Criticality value so that it is higher than that of the other devices in the Fault Domain. If the values are all the same, the SpectroSERVER chooses the device with the lowest model handle value.
Scenario:
When Sim35925 is polled and does not respond, the Fault Isolation logic checks its connected neighbors to see whether any of them are up. In this case (below), Sim35942 is up and responding, so Sim35925 is identified as the cause and a Critical 'Device Has Stopped Responding' alarm is generated on it. The remaining two devices behind Sim35925 are suppressed because they are unreachable while Sim35925 is down.
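The difference between this scenario and the Unresolved Fault case can be illustrated with a short Python sketch. It is a simplified model of the decision described above, not the SpectroSERVER's actual Fault Isolation algorithm. The Device class, the classify_outage function, and the names DeviceA and DeviceB are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a modeled device and its last poll result."""
    name: str
    responding: bool


def classify_outage(neighbors):
    """Decide which alarm applies to a device that has stopped responding.

    Simplified assumption: if at least one connected neighbor still responds,
    the polled device itself is treated as the cause. If no neighbor responds,
    the cause cannot be pinpointed and the fault is unresolved.
    """
    if any(n.responding for n in neighbors):
        return "Device Has Stopped Responding"
    return "Unresolved Fault"


# Scenario above: Sim35925 is down, but its neighbor Sim35942 still responds.
scenario = classify_outage([
    Device("Sim35942", responding=True),
    Device("DeviceA", responding=False),  # hypothetical device behind Sim35925
    Device("DeviceB", responding=False),  # hypothetical device behind Sim35925
])

# Whole Fault Domain unreachable: no neighbor responds.
whole_domain_down = classify_outage([
    Device("Sim35942", responding=False),
    Device("DeviceA", responding=False),
    Device("DeviceB", responding=False),
])
```

In the first call the alarm lands on the polled device. In the second call no neighbor responds, which corresponds to the Unresolved Fault case this article covers.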
Environment
Release : 22.2, 23.3
Component : Spectrum Core / SpectroSERVER
Resolution
The Critical 'Unresolved Fault' alarm can be configured to generate on a device within the Fault Domain. This is often favorable as it provides a visual indication when looking at the group of devices.
To change this behavior, locate the 'Fault Isolation' view on the VNM model. Find the 'Unresolved Fault Alarm Disposition' setting and change it from 'Fault Isolation Model' to 'Device In Fault Domain'. On the next outage, the SpectroSERVER will generate the alarm on one of the device models in the Fault Domain.
example: Unresolved Fault created on a device in the Fault Domain
Note: The other 3 devices remain in a suppressed state, which is the correct behavior.
Controlling which device receives the Unresolved Fault alarm:
The SpectroSERVER looks for the device whose Criticality attribute (0x1290c) contains the highest value. If several devices share the highest value, the SpectroSERVER chooses the one with the lowest model handle.
In this case, we can see Sim18324 has the highest Criticality value
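The selection rule can be expressed as a small Python sketch: highest Criticality first, then lowest model handle as the tie-breaker. This only illustrates the rule as stated in this article and is not Spectrum code. The DeviceModel class, the pick_alarm_target function, and every model handle and Criticality value shown are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceModel:
    """Hypothetical view of a device model in a Fault Domain."""
    name: str
    model_handle: int  # illustrative handle values, not from a real landscape
    criticality: int   # value of the Criticality attribute (0x1290c)


def pick_alarm_target(fault_domain):
    """Return the device that would receive the Unresolved Fault alarm.

    Sort key: highest criticality wins; ties go to the lowest model handle.
    """
    return min(fault_domain, key=lambda d: (-d.criticality, d.model_handle))


# Example values are assumptions chosen so that Sim18324 has the highest Criticality.
domain = [
    DeviceModel("Sim35942", model_handle=0x1000A1, criticality=1),
    DeviceModel("Sim35925", model_handle=0x1000A2, criticality=1),
    DeviceModel("Sim18324", model_handle=0x1000A3, criticality=5),
]
target = pick_alarm_target(domain)

# If every device has the same Criticality, the lowest model handle decides.
tied = [DeviceModel(d.name, d.model_handle, criticality=1) for d in domain]
tie_target = pick_alarm_target(tied)
```

With these example values the alarm goes to Sim18324. In the tied case it goes to Sim35942, because that device has the lowest of the illustrative model handles.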