Smarts IP: What does IsEveryNeighborUnresponsive mean?

Article ID: 304071

Updated On:

Products

VMware Smart Assurance

Issue/Introduction

Symptoms:

Smarts IP: What does IsEveryNeighborUnresponsive mean?


Smarts IP: How does Smarts determine the value of IsEveryNeighborUnresponsive?


Smarts IP Availability Manager does not provide a description for the IsEveryNeighborUnresponsive attribute

Environment

VMware Smart Assurance - SMARTS

Resolution

IsEveryNeighborUnresponsive is a Boolean attribute (TRUE or FALSE) that plays an important role in the Root Cause Analysis (RCA) of UnitaryComputerSystem::Down and Partition::Down events.

How this works:

 ----------          ----------          ----------
| Router A |--------| Router B |--------| Switch A |
 ----------          ----------          ----------

  • Router B has two neighbors: Router A and Switch A.
  • Both Router A and Switch A are responsive to ICMP and SNMP polling (both are active).
  • IsEveryNeighborUnresponsive = FALSE for Router B (both of its neighbors are alive).
  • If Router A and Switch A both become unresponsive, then IsEveryNeighborUnresponsive = TRUE for Router B.
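The neighbor check described in the bullets above can be sketched in Python. This is only an illustration of the idea, not the Smarts implementation: the `NEIGHBORS` map and the function name are hypothetical, and returning FALSE for a device with no neighbors is an assumption, since the article does not cover that case.

```python
# Hypothetical adjacency map for the example topology (device names from the article).
NEIGHBORS = {
    "Router A": ["Router B"],
    "Router B": ["Router A", "Switch A"],
    "Switch A": ["Router B"],
}


def is_every_neighbor_unresponsive(device, neighbors, unresponsive):
    """Return True when the device has neighbors and none of them responds to polling.

    device       -- name of the device being evaluated
    neighbors    -- dict mapping a device name to a list of adjacent device names
    unresponsive -- set of device names that currently fail ICMP and SNMP polling
    """
    adjacent = neighbors.get(device, [])
    # Assumption: a device with no known neighbors yields False.
    return bool(adjacent) and all(n in unresponsive for n in adjacent)


# Both neighbors of Router B respond to polling.
print(is_every_neighbor_unresponsive("Router B", NEIGHBORS, set()))  # False

# Router A and Switch A both stop responding.
print(is_every_neighbor_unresponsive("Router B", NEIGHBORS, {"Router A", "Switch A"}))  # True
```

If only one of the two neighbors is unresponsive, the result stays False, because the attribute requires every neighbor to be unresponsive.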

Two scenarios:

  1. Router Down events:
     

    If Router A, Router B, and Switch A are all part of the same partition, and Router B is unresponsive due to some failure while both of its neighbors remain responsive, then
    IsEveryNeighborUnresponsive = FALSE for Router B (its neighbors, Router A and Switch A, are alive).

    The following logic uses this attribute to compute the a priori probability of a Down event:

    computed attribute float AprioriProbability_Down
            = (IsUnresponsive && (HasNoPartition || !IsEveryNeighborUnresponsive)) ? 0.001 : 0.0;

    -->   = (TRUE && (FALSE || !FALSE))
    -->   = (TRUE && (FALSE || TRUE))
    -->   = (TRUE && TRUE) ==> TRUE, so AprioriProbability_Down = 0.001 (Down event is triggered)

  2. Partition Down events:

     
    If Router A, Router B, and Switch A are part of Partition Z and all devices in the partition go down (i.e., Router A, Router B, and Switch A are all unresponsive), then
    IsEveryNeighborUnresponsive = TRUE for all devices. In this case, the individual Down events are suppressed and a Partition::Down event is triggered.

    The following logic uses this attribute to compute the a priori probability of a Down event:

    computed attribute float AprioriProbability_Down
            = (IsUnresponsive && (HasNoPartition || !IsEveryNeighborUnresponsive)) ? 0.001 : 0.0;

    -->   = (TRUE && (FALSE || !TRUE))
    -->   = (TRUE && (FALSE || FALSE))
    -->   = (TRUE && FALSE) ==> FALSE, so AprioriProbability_Down = 0.0 (Down event is suppressed)
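To compare the two scenarios side by side, the AprioriProbability_Down expression quoted above can be transcribed into Python. This is a minimal sketch for checking the Boolean logic by hand; the function name and its parameters are hypothetical stand-ins for the model attributes and are not part of any Smarts API.

```python
def apriori_probability_down(is_unresponsive, has_no_partition,
                             is_every_neighbor_unresponsive):
    """Python transcription of the AprioriProbability_Down expression.

    Returns 0.001 when a Down event should be considered for the device,
    and 0.0 when the individual Down event is suppressed.
    """
    if is_unresponsive and (has_no_partition or not is_every_neighbor_unresponsive):
        return 0.001
    return 0.0


# Scenario 1: Router B is unresponsive, belongs to a partition,
# and its neighbors still respond.
scenario_1 = apriori_probability_down(True, False, False)

# Scenario 2: every device in the partition is unresponsive.
scenario_2 = apriori_probability_down(True, False, True)

print(scenario_1, scenario_2)  # 0.001 0.0
```

A non-zero result corresponds to the "Trigger Down event" case, and 0.0 corresponds to the "Suppress Down event" case in which the Partition::Down event is raised instead.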