Smarts IP: SMARTS not showing router down alarm although device is down; Smarts only shows Unresponsive alarm, not Down alarm; Smarts IP not correlating Down events correctly

Article ID: 331687

Products

VMware Smart Assurance

Issue/Introduction

Symptoms:


Smarts IP does not correctly correlate a router Unresponsive alert into a router Down alert.
Smarts IP does not propagate the MightBeDown symptom correctly between peer devices, so a Down event is not generated for the device.

A device is down, but Smarts IP shows it only as Unresponsive, not as Down.

Environment

VMware Smart Assurance - SMARTS

Cause

The Smarts IP Manager can fail to correlate a device to a Down state when the DefaultMaximumNetworkSizeForCorrelation setting of the Smarts IP domain is too large. Smarts uses this value when propagating MightBeDown symptoms between peer systems in a network: the larger the value, the more MightBeDown symptoms must be exchanged before a Down alert is generated. In a small network topology, a large value therefore means that more devices are needed to determine that a device is down. When a device goes down, its MightBeDown symptom is computed from its neighbor devices and their IP network relationship to the down device.

Resolution

DefaultMaximumNetworkSizeForCorrelation is a parameter of the IPNetwork class. It is used to prevent propagation of the MightBeDown symptom between peer relay devices that are connected through the IPNetwork. These are devices that are not connected through a network connection or a cable, or that are not part of the same partition. The parameter is shown in the screenshot below.




The parameter can also be found in tpmgr-param.conf, located at <BASEDIR>/conf/discovery/tpmgr-param.conf. Search for it in the conf file, as shown below:
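A minimal sketch of the file search described above, using standard grep. The helper name find_param and the BASEDIR environment variable are assumptions made for illustration, not part of the Smarts tooling.

```shell
# Hypothetical helper (not shipped with Smarts): print every line of a
# configuration file that mentions the given parameter name, prefixed
# with its line number.
find_param() {
    conf_file="$1"
    param_name="$2"
    grep -n -- "$param_name" "$conf_file"
}

# Example call, assuming BASEDIR points at the Smarts IP installation:
# find_param "$BASEDIR/conf/discovery/tpmgr-param.conf" DefaultMaximumNetworkSizeForCorrelation
```

If the search prints nothing, the parameter may not be set explicitly in that file.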




The default value of DefaultMaximumNetworkSizeForCorrelation is 10. However, this value can be changed based on the network size.

For example:

If the network contains 20 devices, the recommended value for DefaultMaximumNetworkSizeForCorrelation is 2. This prevents SMARTS from picking up symptoms of neighbors that are far away from the device.

Additional Information

Setting an incorrect value may affect notifications, which may not show up on the Console. If the value of DefaultMaximumNetworkSizeForCorrelation is changed, execute the command below for the change to take effect:

./sm_tpmgr.exe -s <server_name> -b <broker_name> --load-conf=tpmgr-param.conf