The following alarm is seen in CA Spectrum:
Minor | Jan 6, 2015 3:20:10 PM EST | SpectroSERVER | Device Router of type Rtr_Cisco has triggered an interface reconfiguration every poll cycle for the past half hour. | System | 0x10050 |
Upon reviewing the events that led up to the above alarm, the following event was logged every poll cycle:
Jan 6, 2015 3:19:31 PM EST | SpectroSERVER | Device Router of type Rtr_Cisco has completed an interface reconfiguration. Interface Reconfiguration Trigger: Interface-Stack-Change-Reconfiguration Interface Reconfiguration Status: Success-Reconfiguration-Complete | System | 0x1001d |
Release: Spectrum 10.3.2
From Spectrum 10.3.2 onward, a "Self-Health" feature is included in the VNM subview menu. This subview provides an option that lets Spectrum automatically adjust the value of If_IsAutoCnfgActive if "Excessive Reconfigurations" are detected on an interface. The option is "handle Excessive Interface Reconfigurations", and it defaults to "no". When it is set to "yes" and Spectrum detects excessive reconfigurations on a device, Spectrum automatically sets If_IsAutoCnfgActive to "no". There is also an "age-out" value parameter, which adjusts the time at which Spectrum evaluates the if_stack and if_table values of the device. If Spectrum determines that the "flapping" has abated and the age-out time has expired, it sets If_IsAutoCnfgActive back to "yes".
*******************
As mentioned above, prior to Spectrum 10.3.2 you have to manage the value of If_IsAutoCnfgActive manually.
However, you can apply a manual solution that combines Event Rules and Event Procedures. The rules detect the trigger and count how many times the event was logged for that trigger. If the count exceeds a specific number within a specific time period, a procedure sets the appropriate attribute on the model to disable further reconfigurations. In essence, this is a manual configuration of the built-in Self-Health feature provided in 10.3.2 and above.
1. The first step is to attach an Event Condition Rule to the 0x1001d event to detect what triggered the model reconfiguration. In the 0x1001d event, the trigger is an enumeration on event variable 1:
{d "%w- %d %m-, %Y - %T"} - Device {m} of type {t} has completed an interface reconfiguration. (event [{e}])
Interface Reconfiguration Trigger: {T InterfaceReconfigurationTriggers 1}
Interface Reconfiguration Status: {T CsInterfaceReconfigurationErrors 2}
2. The InterfaceReconfigurationTriggers file located in the $SPECROOT/SG-Support/CsEvFormat/EventTables directory shows the integer value "3" is enumerated to "Interface-Stack-Change-Reconfiguration":
3 - Interface-Stack-Change-Reconfiguration
4 - Interface-Table-Change-Reconfiguration
5 - Interface-Count-Change-Reconfiguration
Using the above information, we can attach an Event Condition Rule to the 0x1001d event to check the value of event variable 1 to determine the trigger and generate new events for the above listed triggers:
0x1001d E 50 R CA.EventRateWindow, 6, 1860, "0x00010050 -:-","0x10000 " R CA.EventCondition, "({v 1} == {I 3})" , "0xfff00000 -:-","({v 1} == {I 4})" , "0xfff00001 -:-","({v 1} == {I 5})" , "0xfff00002 -:-"
In the above Event Condition Rule attached to the 0x1001d event:
If event variable 1 equals 3 (Interface-Stack-Change-Reconfiguration) generate event 0xfff00000*
If event variable 1 equals 4 (Interface-Table-Change-Reconfiguration) generate event 0xfff00001*
If event variable 1 equals 5 (Interface-Count-Change-Reconfiguration) generate event 0xfff00002*
*NOTE: THESE EVENT IDs MAY NOT BE THE SAME FOR YOUR IMPLEMENTATION.
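The routing performed by the Event Condition clauses can be pictured with a short Python sketch. This is only an illustration of the logic, not Spectrum syntax; the function name `route_reconfiguration_event` is hypothetical, and the event IDs are the example IDs used in this article.

```python
# Illustrative only: mirrors the three CA.EventCondition clauses above.
# The event IDs are this article's examples and may differ in your implementation.
TRIGGER_TO_EVENT = {
    3: 0xfff00000,  # Interface-Stack-Change-Reconfiguration
    4: 0xfff00001,  # Interface-Table-Change-Reconfiguration
    5: 0xfff00002,  # Interface-Count-Change-Reconfiguration
}


def route_reconfiguration_event(trigger):
    """Return the event ID generated for a trigger value, or None if no clause matches."""
    return TRIGGER_TO_EVENT.get(trigger)


if __name__ == "__main__":
    for value in (3, 4, 5, 1):
        event = route_reconfiguration_event(value)
        print(value, hex(event) if event is not None else "no event generated")
```

Any trigger value other than 3, 4 or 5 generates no new event in this sketch.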
The next step is to add Event Rate Counter rules to the 0xfff00000, 0xfff00001 and 0xfff00002 events to count the number of times each event was generated in a specified amount of time. If the threshold is met, the rule generates a new event that has an Event Procedure attached to set the appropriate attribute and disable reconfiguration for that trigger.
0xfff00000 E 0 R CA.EventRateCounter, 3, 1200, "0xfff00003 -:-"
0xfff00001 E 0 R CA.EventRateCounter, 3, 1200, "0xfff00004 -:-"
0xfff00002 E 0 R CA.EventRateCounter, 3, 1200, "0xfff00005 -:-"
In the above Event Rate Counter rules, 3 events within 1200 seconds will trigger the rule. These values can be modified to suit your needs; remember that the default poll interval is 5 minutes.
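To help choose a threshold and time window, the counting can be approximated in Python. This is a minimal sketch that assumes a simple sliding window measured back from the newest event; the exact windowing semantics of CA.EventRateCounter may differ, and the `RateCounter` class is hypothetical.

```python
from collections import deque


class RateCounter:
    """Approximates 'threshold events within window_seconds' (sliding window assumed)."""

    def __init__(self, threshold=3, window_seconds=1200):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one event time (in seconds); return True if the threshold is met."""
        self._timestamps.append(timestamp)
        # Discard events that are older than the window relative to the newest event.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


if __name__ == "__main__":
    counter = RateCounter(threshold=3, window_seconds=1200)
    # One reconfiguration event per 5-minute (300 second) poll cycle.
    for t in (0, 300, 600):
        print(t, counter.record(t))
```

Under these assumptions, a device that reconfigures on every 5-minute poll meets the threshold on its third consecutive poll, 600 seconds after the first event.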
The following Event Procedures then set the appropriate attribute to disable model reconfiguration, according to the original trigger detected by the Event Condition rule attached to the 0x1001d event.
0xfff00003 E 0 A 3,0xfff00000 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x130bc }, \
{ B FALSE })"
0xfff00004 E 0 A 3,0xfff00001 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x11f7f }, \
{ B FALSE })"
0xfff00005 E 0 A 3,0xfff00002 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x11dd4 }, \
{ B FALSE })"
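The effect of the three Event Procedures can be summarized in Python rather than Spectrum syntax. The event and attribute IDs are copied from the procedures above; the `model` dictionary and the `apply_procedure` function are hypothetical stand-ins for a Spectrum model and the WriteAttribute call, not real Spectrum APIs.

```python
# Event ID -> attribute ID that the matching Event Procedure sets to FALSE.
# IDs are taken from the example procedures in this article.
PROCEDURE_ATTRIBUTE = {
    0xfff00003: 0x130bc,  # raised after repeated 0xfff00000 (stack change trigger)
    0xfff00004: 0x11f7f,  # raised after repeated 0xfff00001 (table change trigger)
    0xfff00005: 0x11dd4,  # raised after repeated 0xfff00002 (count change trigger)
}


def apply_procedure(model, event_id):
    """Hypothetical stand-in for WriteAttribute(CURRENT_MODEL, attribute, FALSE)."""
    attribute = PROCEDURE_ATTRIBUTE.get(event_id)
    if attribute is not None:
        model[attribute] = False
    return model


if __name__ == "__main__":
    model = {0x130bc: True, 0x11f7f: True, 0x11dd4: True}
    apply_procedure(model, 0xfff00003)
    print({hex(attr): value for attr, value in model.items()})
```

Each procedure changes only the attribute tied to its own trigger, so the other reconfiguration triggers stay enabled on the model.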