How can I have CA Spectrum automatically disable device reconfiguration after excessive reconfigurations are detected?

Article ID: 182895

Products

CA Spectrum, CA eHealth

Issue/Introduction

The following alarm is seen in CA Spectrum:

Minor Jan 6, 2015 3:20:10 PM EST SpectroSERVER Device Router of type Rtr_Cisco has triggered an interface reconfiguration every poll cycle for the past half hour. System 0x10050


Upon reviewing the events that led up to the above alarm, the following event was logged every poll cycle:

Jan 6, 2015 3:19:31 PM EST SpectroSERVER Device Router of type Rtr_Cisco has completed an interface reconfiguration.
Interface Reconfiguration Trigger: Interface-Stack-Change-Reconfiguration
Interface Reconfiguration Status: Success-Reconfiguration-Complete
System 0x1001d

Cause

By default, Spectrum attempts to manage changes in interface configuration automatically by evaluating the if_stack and if_table MIBs on the device. If changes to these MIB values are detected, Spectrum reconfigures the interface model and adjusts (makes or breaks) device connections as needed. This feature is active when the interface attribute "If_IsAutoCnfgActive" (0x11dd4) is set to "Yes".

If the MIB information changes constantly over a period of time (for example, because of a flapping interface), the if_stack and if_table values also change constantly, and Spectrum is forced to reconfigure the interface model continually. This drains CPU and memory resources, so alarm 0x10050 is raised to warn you of such an occurrence on your interfaces.

Prior to Spectrum 10.3.2, if you noticed excessive reconfiguration on an interface model, as indicated by a high count of 0x10050 alarms, you had to manually set If_IsAutoCnfgActive to "No" on the device to stop all interface reconfiguration. While this may stop the "Excessive Reconfiguration Detected" alarm, the drawback is that Spectrum can no longer adjust automatically to interface connection changes. If any devices are physically moved on or off the port, you need to intervene manually to reconfigure the interface or rediscover connections on the device.

Environment

Release: Spectrum 10.3.2

Resolution

From Spectrum 10.3.2 onward, a few "Self-Health" features are included in the VNM subview menu. This subview provides an option that lets Spectrum automatically adjust the value of If_IsAutoCnfgActive when "Excessive Reconfigurations" are detected on an interface. The option is "handle Excessive Interface Reconfigurations" and defaults to "No". When it is set to "Yes" and Spectrum detects excessive reconfigurations on a device, Spectrum automatically sets If_IsAutoCnfgActive to "No". There is also an "age-out" value parameter, which adjusts the time at which Spectrum evaluates the if_stack and if_table values of the device. If Spectrum determines that the flapping has abated and the age-out time has expired, it sets If_IsAutoCnfgActive back to "Yes".




*******************

As mentioned above, prior to Spectrum 10.3.2 you had to manage the value of If_IsAutoCnfgActive manually.

However, you can apply a manual solution that combines Event Rules and Event Procedures to detect the trigger, count the number of times the event was logged for that trigger, and, if the count exceeds a specific threshold within a specific time period, set the appropriate attribute on the model to disable further reconfigurations. In essence, this is a manual configuration of the built-in Self-Health feature provided in 10.3.2 and above.

1. The first step is to attach an Event Condition Rule to the 0x1001d event to detect what triggered the model reconfiguration. In the 0x1001d event, the trigger is an enumeration on event variable 1:

{d "%w- %d %m-, %Y - %T"} - Device {m} of type {t} has completed an interface reconfiguration. (event [{e}])

Interface Reconfiguration Trigger: {T InterfaceReconfigurationTriggers 1}

Interface Reconfiguration Status: {T CsInterfaceReconfigurationErrors 2}


2. The InterfaceReconfigurationTriggers file located in the $SPECROOT/SG-Support/CsEvFormat/EventTables directory shows the integer value "3" is enumerated to "Interface-Stack-Change-Reconfiguration":

3 - Interface-Stack-Change-Reconfiguration

4 - Interface-Table-Change-Reconfiguration

5 - Interface-Count-Change-Reconfiguration

 

Using the above information, we can attach an Event Condition Rule to the 0x1001d event to check the value of event variable 1 to determine the trigger and generate new events for the above listed triggers:


0x1001d E 50 R CA.EventRateWindow, 6, 1860, "0x00010050 -:-","0x10000 " R CA.EventCondition, "({v 1} == {I 3})" , "0xfff00000  -:-","({v 1} == {I 4})" , "0xfff00001  -:-","({v 1} == {I 5})" , "0xfff00002  -:-"


In the above Event Condition Rule attached to the 0x1001d event:

If event variable 1 equals 3 (Interface-Stack-Change-Reconfiguration) generate event 0xfff00000*

If event variable 1 equals 4 (Interface-Table-Change-Reconfiguration) generate event 0xfff00001*

If event variable 1 equals 5 (Interface-Count-Change-Reconfiguration) generate event 0xfff00002*


*NOTE: These event IDs may not be the same for your implementation.
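The branching performed by the Event Condition rule can be pictured with a small sketch. This is only an illustrative Python model, not Spectrum code: the function name `event_for_trigger` and the dictionary are hypothetical, while the trigger codes and example event IDs are taken from the rule above.

```python
# Hypothetical model of the CA.EventCondition rule attached to event 0x1001d.
# Trigger codes come from the InterfaceReconfigurationTriggers table; the
# 0xfff0000x event IDs are the article's examples and may differ locally.
TRIGGER_TO_EVENT = {
    3: 0xFFF00000,  # Interface-Stack-Change-Reconfiguration
    4: 0xFFF00001,  # Interface-Table-Change-Reconfiguration
    5: 0xFFF00002,  # Interface-Count-Change-Reconfiguration
}


def event_for_trigger(event_var_1):
    """Return the event ID to generate, or None when no condition matches."""
    return TRIGGER_TO_EVENT.get(event_var_1)


print(hex(event_for_trigger(3)))  # 0xfff00000
```

Any trigger value other than 3, 4 or 5 matches no condition in this model, so no new event is generated for it.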

The next step is to add Event Rate Counter rules to the 0xfff00000, 0xfff00001 and 0xfff00002 events to count the number of times each event was generated in a specified amount of time. If the threshold is met, new events are generated that have Event Procedures attached to set the appropriate attribute and disable reconfiguration according to the trigger.

0xfff00000 E 0 R CA.EventRateCounter, 3, 1200, "0xfff00003  -:-"

0xfff00001 E 0 R CA.EventRateCounter, 3, 1200, "0xfff00004  -:-"

0xfff00002 E 0 R CA.EventRateCounter, 3, 1200, "0xfff00005  -:-"


In the above Event Rate Counter rules, 3 events within 1200 seconds trigger the rule. These values can be modified to suit your needs, keeping in mind that the default poll interval is 5 minutes.

The following Event Procedures then set the appropriate attribute to disable model reconfiguration according to the original trigger from the Event Condition rule attached to the 0x1001d event.
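To see how the threshold and window interact with the poll interval, here is a minimal Python sketch of a sliding-window counter. It assumes the rule fires once the number of events inside the window reaches the threshold; the class `RateCounter` is hypothetical and the exact internal behavior of CA.EventRateCounter (for example, whether it resets after firing) may differ.

```python
from collections import deque


class RateCounter:
    """Hypothetical sliding-window model: N events within W seconds."""

    def __init__(self, threshold=3, window_seconds=1200):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def record(self, now):
        """Record one event at time `now` (seconds); True if threshold is met."""
        self.timestamps.append(now)
        # Discard events that have fallen out of the window.
        while now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.threshold


counter = RateCounter()
# One reconfiguration per 300-second (5-minute) poll cycle.
results = [counter.record(t) for t in (0, 300, 600)]
print(results)  # [False, False, True]
```

Under this assumption, a device that reconfigures on every default poll cycle reaches the threshold on the third consecutive poll, 600 seconds after the first event and well inside the 1200-second window.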

*The following will log the event, execute the procedure and also alarm:

0xfff00003 E 0 A 3,0xfff00000 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x130bc }, \
{ B FALSE })"

0xfff00004 E 0 A 3,0xfff00001 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x11f7f }, \
{ B FALSE })"

0xfff00005 E 0 A 3,0xfff00002 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x11dd4 }, \
{ B FALSE })"

*The following will log the event and execute the procedure, but there will be no alarm:

0xfff00003 E 0 P 3,0xfff00000 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x130bc }, \
{ B FALSE })"

0xfff00004 E 0 P 3,0xfff00001 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x11f7f }, \
{ B FALSE })"

0xfff00005 E 0 P 3,0xfff00002 P "WriteAttribute( \
{ C CURRENT_MODEL }, \
{ H 0x11dd4 }, \
{ B FALSE })"
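When adapting these entries to local event IDs, it can help to keep the whole chain in one place. The Python table below is just a hypothetical cross-check aid built from the example values in this article; `CHAIN` and `attribute_disabled_for` are illustrative names, not part of Spectrum.

```python
# Example chain from this article:
# trigger code -> (condition event, threshold event, attribute set to FALSE).
# All event IDs are the article's examples and may differ in your environment.
CHAIN = {
    3: (0xFFF00000, 0xFFF00003, 0x130BC),  # Interface-Stack-Change
    4: (0xFFF00001, 0xFFF00004, 0x11F7F),  # Interface-Table-Change
    5: (0xFFF00002, 0xFFF00005, 0x11DD4),  # Interface-Count-Change
}


def attribute_disabled_for(trigger_code):
    """Return the attribute ID written with FALSE for a trigger, or None."""
    entry = CHAIN.get(trigger_code)
    return entry[2] if entry else None


print(hex(attribute_disabled_for(5)))  # 0x11dd4
```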

 

Additional Information

Interface Reconfigurations: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=132935

Spectrum best practices for hardware changes: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=130045

Self-Health monitoring: https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/spectrum/10-4/administrating/spectroserver-performance-administration/self-health-monitoring.html
