During a major outage, an expected Critical alarm was not asserted on a model. Spectrum Fault Isolation did not assert or suppress alarms as expected.
In the following screenshot, the circled router model was down and should have alarmed Critical with the "DEVICE HAS STOPPED RESPONDING TO POLLS" alarm. However, Spectrum did not assert a Critical alarm. The model stayed green even though Spectrum could not contact it.
The 0x00010d35 event associated with the "DEVICE HAS STOPPED RESPONDING TO POLLS" alarm was modified as follows:
0x00010d35 R CA.EventPair, 0x10d30, "0xfff0000e -:-", 600 R CA.EventCombo, "0xfff00030 -:-", 300, "0x10d30 -:-"
Out of the box, the 0x00010d35 event is defined in the $SPECROOT/SS/CsVendor/Cabletron/EventDisp file as follows:
0x00010d35 E 75 A 3, 0x00010009,N
Modifying out-of-the-box events may have undesirable results. The underlying Fault Isolation, Impact Analysis, and Fault Suppression code looks for specific events with specific conditions and specific probable cause codes in order to function properly, so changing any one of these from its default can prevent alarms from being asserted or suppressed correctly.
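One way to spot this kind of change before an outage is to compare the event definitions currently in use against a saved copy of the defaults. The following Python sketch illustrates the idea only and is not a Spectrum utility. It assumes each EventDisp entry sits on a single line that starts with the hexadecimal event code, and that lines starting with # are comments; the function names parse_event_disp and modified_events are hypothetical. The two sample entries are the ones quoted above.

```python
import re

# Assumption: an entry is one line of the form "<hex event code> <definition>".
EVENT_LINE = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s+(.*\S)\s*$")


def parse_event_disp(text):
    """Map each event code (as an int) to its whitespace-normalized definition."""
    events = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # assumption: '#' starts a comment line
        match = EVENT_LINE.match(line)
        if match:
            code = int(match.group(1), 16)
            events[code] = " ".join(match.group(2).split())
    return events


def modified_events(default_text, current_text):
    """Return {code: (default_definition, current_definition)} for entries that differ."""
    default = parse_event_disp(default_text)
    current = parse_event_disp(current_text)
    return {
        code: (default[code], current[code])
        for code in default
        if code in current and default[code] != current[code]
    }


# Entries quoted in this article: the default and the modified definition.
DEFAULT_ENTRY = '0x00010d35 E 75 A 3, 0x00010009,N'
MODIFIED_ENTRY = (
    '0x00010d35 R CA.EventPair, 0x10d30, "0xfff0000e -:-", 600 '
    'R CA.EventCombo, "0xfff00030 -:-", 300, "0x10d30 -:-"'
)

for code, (was, now) in modified_events(DEFAULT_ENTRY, MODIFIED_ENTRY).items():
    print(f"0x{code:08x} differs from default\n  default: {was}\n  current: {now}")
```

In practice the two inputs would be read from the default EventDisp file and from wherever the customized definition is stored; those locations vary by installation and are not assumed here.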
In the above scenario, after changing the 0x00010d35 event back to the out of the box definition and replicating the outage, Spectrum alarmed as expected: