DOWN alarms persist after a card has been changed and are not cleared until a new discovery is done

Article ID: 332141

Updated On:

Products

VMware Smart Assurance

Issue/Introduction

Symptoms:

  • Alarms exist for a card that is no longer there.
  • DOWN alarms are seen for a card/transceiver after it has been removed. These alarms persist even after the card/transceiver has been replaced and the next polling cycle has completed.

Environment

Smarts 10.1.X

Cause

When a card is discovered on a device, the device assigns the card a unique identity (i.e., an index).
SMARTS discovers the device and monitors the card using that index only.

If a card is pulled out, the SMARTS monitoring system still tries to fetch the card status through the index it recorded at discovery. Because removing the card also removes its identity from the MIB table, the device agent returns noSuchInstance. SMARTS treats noSuchInstance as a critical state and flags the card as DOWN.

If the removed card is replaced with a new card in the same slot, the agent on the device has two options:
1. Give the new card the same identity that the removed card had.
2. Give the new card a new identity (i.e., a new index).

  • In case 1, the SMARTS monitoring system queries the card status with the originally discovered index, the agent eventually responds with an UP status, and SMARTS then clears the alarm.
  • In case 2, the SMARTS inventory still records the card on that device with index x, but the agent on the device has given the new card a different index. SMARTS does not know about the new index and continues to poll the card with the old index only, so it continues to flag that card as DOWN.
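The two cases can be illustrated with a small simulation. This is only a sketch of the behaviour described above, not SMARTS code: the agent's MIB table is modelled as a plain dictionary, and the index values (7 and 12) and all function and variable names are hypothetical.

```python
# Hypothetical model: the agent's card table maps index -> reported status.
NO_SUCH_INSTANCE = "noSuchInstance"


def poll(agent_table, index):
    """Return the status the agent reports for an index, or noSuchInstance."""
    return agent_table.get(index, NO_SUCH_INSTANCE)


def monitored_state(agent_table, discovered_index):
    """Anything other than an 'up' reply, including noSuchInstance, counts as DOWN."""
    return "UP" if poll(agent_table, discovered_index) == "up" else "DOWN"


# Discovery records the index the agent assigned to the card (7 is made up).
discovered_index = 7
state_initial = monitored_state({7: "up"}, discovered_index)

# Card pulled out: the index disappears from the agent's table.
state_removed = monitored_state({}, discovered_index)

# Case 1: the replacement card reuses the old index, so polling recovers.
state_case1 = monitored_state({7: "up"}, discovered_index)

# Case 2: the replacement card gets a new index (12 is made up).
# Polling the old index still yields noSuchInstance.
table_case2 = {12: "up"}
state_case2 = monitored_state(table_case2, discovered_index)

# Rediscovery replaces the stored index with the one the agent now reports.
rediscovered_index = next(iter(table_case2))
state_after_rediscovery = monitored_state(table_case2, rediscovered_index)

for label, state in [
    ("initial", state_initial),
    ("card removed", state_removed),
    ("case 1 (same index)", state_case1),
    ("case 2 (new index)", state_case2),
    ("case 2 after rediscovery", state_after_rediscovery),
]:
    print(f"{label}: {state}")
```

In this model only case 2 stays DOWN after the card is replaced, and it recovers once the stored index is refreshed.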

Resolution

Rediscover the device:

When the device is rediscovered, the SMARTS discovery system creates the new card (with the updated index) and removes the old card from the topology. Removing the old card from the topology also clears the alarm.

If the rediscovery does not remove the old card, the device may need to be deleted and rediscovered.

If the discovery is done by seedfile, use the following steps to refresh the tables and remove any stale connections.

Run the following command to clear the tables in Smarts manually:
./dmctl -s <DomainName> invoke ICF_TopologyManager::ICF-TopologyManager runPreProcessors

Once this is complete, run the discovery process using the seedfile:
./sm_tpmgr -s <DomainName> --seedfile=<full path to the seedfile>
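If these two steps are run often, they could be wrapped in a small helper. The following is a minimal sketch, assuming it is run from the directory that contains dmctl and sm_tpmgr (as the ./ prefix in the commands above implies). The domain name EXAMPLE-DOMAIN, the seedfile path /tmp/seedfile.txt, and the function names are hypothetical placeholders. By default it only prints the commands and does not execute anything.

```python
import shlex
import subprocess


def build_commands(domain, seedfile):
    """Build the two commands from the steps above as argument lists."""
    clear_tables = [
        "./dmctl", "-s", domain, "invoke",
        "ICF_TopologyManager::ICF-TopologyManager", "runPreProcessors",
    ]
    discover = ["./sm_tpmgr", "-s", domain, f"--seedfile={seedfile}"]
    return [clear_tables, discover]


def run_commands(commands, execute=False):
    """Print each command; run it only when execute=True.

    check=True stops at the first failing command, so the seedfile
    discovery is not started if clearing the tables failed.
    """
    for cmd in commands:
        print(shlex.join(cmd))
        if execute:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values; replace with the real domain and seedfile path.
    run_commands(build_commands("EXAMPLE-DOMAIN", "/tmp/seedfile.txt"))
```

Passing execute=True runs the commands in order. Keeping the default as a dry run makes it easy to check the domain name and seedfile path before anything is changed.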


Additional Information

Autodiscovery can be enabled for devices, or Enable Discovery can be set to run at particular intervals, so that changes are picked up without running a manual discovery each time. However, if the path changes, you will need to delete the device and rediscover it.