Smarts IP: Getting alarms on the wrong power supply on UCS chassis

Article ID: 331772

Updated On:

Products

VMware Smart Assurance

Issue/Introduction

Symptoms:


The customer receives a power supply down alarm for the wrong power supply.
The chassis has two power supplies. After one of them is removed from the chassis, the power supply down alarm is raised on the remaining power supply, which is still installed in the chassis.

Environment

VMware Smart Assurance - SMARTS

Cause

First, check how the power supply was discovered. The power supply used in this example was discovered with the Containment-CiscoUCS-CSeries-Driver. While psu-1 was still plugged in, a walk of the device would have looked something like the following:

.1.3.6.1.4.1.9.9.719.1.15.12.1.19.6: 0
.1.3.6.1.4.1.9.9.719.1.15.56.1.2.1: sys/rack-unit-1/psu-1
.1.3.6.1.4.1.9.9.719.1.15.56.1.2.2: sys/rack-unit-1/psu-2
.1.3.6.1.4.1.9.9.719.1.15.56.1.3.1: psu-1
.1.3.6.1.4.1.9.9.719.1.15.56.1.3.2: psu-2
.1.3.6.1.4.1.9.9.719.1.15.56.1.4.1: 0


Notice that the index numbers are 1 and 2. The discovered power supply uses the "PowerSupply_Fault_CiscoUCSChassis::I-PowerSupply_Fault_CiscoUCSChassis-PWR-host-192.168.20.9/1" instrumentation, which shows that cucsEquipmentPsuOperState (.1.3.6.1.4.1.9.9.719.1.15.56.1.7) is used for fault monitoring.
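To make the relationship between the instrumentation instance and the polled OID concrete, here is a minimal Python sketch. It assumes that the trailing "/1" in the instrumentation name is the SNMP row index, which is an interpretation of this example and not documented Smarts behavior. The function name polled_oid is hypothetical.

```python
# Column OID for cucsEquipmentPsuOperState, as quoted in this article.
OPER_STATE_COLUMN = ".1.3.6.1.4.1.9.9.719.1.15.56.1.7"


def polled_oid(instrumentation_name: str) -> str:
    """Build the OID that would be polled for one instrumentation instance.

    Assumption: the text after the last "/" is the SNMP row index.
    """
    index = instrumentation_name.rsplit("/", 1)[-1]
    if not index.isdigit():
        raise ValueError(f"no numeric index in {instrumentation_name!r}")
    return f"{OPER_STATE_COLUMN}.{index}"


name = (
    "PowerSupply_Fault_CiscoUCSChassis::"
    "I-PowerSupply_Fault_CiscoUCSChassis-PWR-host-192.168.20.9/1"
)
print(polled_oid(name))  # .1.3.6.1.4.1.9.9.719.1.15.56.1.7.1
```

Under that assumption, fault monitoring follows whatever row the agent reports at that index, not a particular physical power supply.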

The following is taken from the walk of the problem host:

.1.3.6.1.4.1.9.9.719.1.15.12.1.19.6: 0
.1.3.6.1.4.1.9.9.719.1.15.56.1.2.1: sys/rack-unit-1/psu-2
.1.3.6.1.4.1.9.9.719.1.15.56.1.3.1: psu-2
.1.3.6.1.4.1.9.9.719.1.15.56.1.4.1: 0


Notice that psu-1 is no longer present and that psu-2 now occupies index 1. Because index 2 is no longer available in the MIB, the information that should have remained at index 2 has shifted to index 1.
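The same comparison can be done with a short script when two walks of the device are available. The following Python sketch is only an illustration: it assumes the walk text uses the "OID: value" layout shown in this article and that the .56.1.2 column holds the power supply path, as the walks above suggest. The helper names psu_by_index and shifted_indexes are hypothetical.

```python
# Column that carries values such as "sys/rack-unit-1/psu-1" in the walks above.
DN_COLUMN = ".1.3.6.1.4.1.9.9.719.1.15.56.1.2."


def psu_by_index(walk_text: str) -> dict[int, str]:
    """Map each SNMP row index to the power supply path reported at that index."""
    result = {}
    for line in walk_text.splitlines():
        oid, sep, value = line.strip().partition(": ")
        suffix = oid[len(DN_COLUMN):]
        if sep and oid.startswith(DN_COLUMN) and suffix.isdigit():
            result[int(suffix)] = value
    return result


def shifted_indexes(before: str, after: str) -> dict[int, tuple[str, str]]:
    """Return the indexes whose power supply differs between the two walks."""
    old, new = psu_by_index(before), psu_by_index(after)
    return {i: (old[i], new[i]) for i in new if i in old and old[i] != new[i]}


BEFORE = """\
.1.3.6.1.4.1.9.9.719.1.15.56.1.2.1: sys/rack-unit-1/psu-1
.1.3.6.1.4.1.9.9.719.1.15.56.1.2.2: sys/rack-unit-1/psu-2
"""
AFTER = """\
.1.3.6.1.4.1.9.9.719.1.15.56.1.2.1: sys/rack-unit-1/psu-2
"""

print(shifted_indexes(BEFORE, AFTER))
# {1: ('sys/rack-unit-1/psu-1', 'sys/rack-unit-1/psu-2')}
```

A non-empty result means the agent has reassigned an index to a different power supply, which matches the behavior seen on the problem host.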

Resolution

There is no workaround for this issue. Please contact the vendor to report it.