Event: DISK_STATUS Message: Powervault System Health is still occurring

Article ID: 221199

Updated On:

Products

Security Analytics

Issue/Introduction

This message means that there may still be a problem with the storage arrays or their controllers.

In the case analysed here, the web management GUI console of Storage Controller A was reachable, and it showed that Storage Controller B had failed. The same message can also be raised for hard disks and power supplies.

Disk-related messages may appear in a pink banner when accessing the web GUI console of the SA collection device.


Environment

Single redundant controller failure in a U144 storage array, which is also known as ME4 or ME4012.

Troubleshooting steps to take:

Analysed data
///disk.health.2021-07-22TKST_112703.log
Hardware : grep HARDWARE /var/log/messages
Hardware : 2021-07-22T01:00:47+09:00 ds5ae8 POWERVAULT[452793]: snlog: sn="##:##:##:##:##:##" id="DS" m="23" c="6" event="DISK_STATUS" category="HARDWARE" ip="##.##.##.##" model="R640xl" msg="Powervault System Health: A subcomponent of this component is unhealthy.; "
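
When many such events need to be reviewed, the name="value" fields in a line like the one above can be split out programmatically. The following is a minimal Python sketch and is not part of the product. The helper names parse_snlog_fields and hardware_events are hypothetical, and the sketch assumes every field has the form name="value" with no embedded double quotes, as in the sample line.

```python
import re

# Matches name="value" pairs; assumes values contain no embedded double quotes.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_snlog_fields(line):
    """Return the name="value" pairs of one log line as a dict (hypothetical helper)."""
    return dict(FIELD_RE.findall(line))


def hardware_events(lines):
    """Yield the parsed fields of every line whose category is HARDWARE."""
    for line in lines:
        fields = parse_snlog_fields(line)
        if fields.get("category") == "HARDWARE":
            yield fields


if __name__ == "__main__":
    # Sample line copied from the analysed data above.
    sample = (
        '2021-07-22T01:00:47+09:00 ds5ae8 POWERVAULT[452793]: snlog: '
        'sn="##:##:##:##:##:##" id="DS" m="23" c="6" event="DISK_STATUS" '
        'category="HARDWARE" ip="##.##.##.##" model="R640xl" '
        'msg="Powervault System Health: A subcomponent of this component is unhealthy.; "'
    )
    for event in hardware_events([sample]):
        print(event["event"], "-", event["msg"].strip())
```

To scan a whole log, pass an open file object for /var/log/messages to hardware_events instead of the one-element list.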

Cause

Single redundant controller failure with error message in /var/log/messages and a pink banner in the Security Analytics UI.

Resolution

There are several good methods to obtain the current status of a U144 or U840 storage array.  The pv_monitor.sh command reports the current status of the storage arrays.  Running pv_monitor.sh -S collects the current configuration and status of all attached storage arrays and produces a support bundle with a name similar to /home/powervault_logs_2021-09-06_07:15.zip.  Submit this bundle to technical support so that they can determine the failure and arrange for a replacement.  The bundle contains a file named store.zip, which is used to view the errors and current status.
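
Before submitting the bundle, it can be useful to confirm that it really contains store.zip. The following is a minimal sketch using Python's standard zipfile module. The helper name find_store_zip is hypothetical, and because the internal layout of the bundle is not described here, the sketch assumes store.zip may sit at any directory level inside the archive.

```python
import sys
import zipfile


def find_store_zip(bundle_path, wanted="store.zip"):
    """Return the archive member names whose file name equals `wanted`.

    Hypothetical helper: the bundle's internal layout is an assumption,
    so every directory level inside the archive is searched.
    """
    with zipfile.ZipFile(bundle_path) as bundle:
        return [
            name for name in bundle.namelist()
            if name.rsplit("/", 1)[-1] == wanted
        ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of the support bundle as the first argument.
    for member in find_store_zip(sys.argv[1]):
        print(member)
```

An empty result suggests the bundle is incomplete and that pv_monitor.sh -S may need to be run again before the file is sent to support.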

pv_monitor.sh --show disks
will show the current status of all disks in the storage array.  Use 'lsi-show' for disks internal to the sensor.

A few examples:

pv_monitor.sh --show
usage: powervault_monitor.py [-h] [-H] [-I] [-S] [-F] [-l] [-u USER]
                             [-p PASSWORD] [-i IP] [-s SHOW] [-C CONFIG_FILE]
powervault_monitor.py: error: argument -s/--show: expected one argument

pv_monitor.sh --show help
Connecting to https://##.##.##.##
Session Established with https://##.##.##.##
Gathering External Storage Configuration

Two solutions have been found so far.  The first and most important is to upgrade the firmware and then clear the pink banner.  That procedure is covered in https://knowledge.broadcom.com/external/article/238767/update-me4-storage-array-controllers-and.html

If that does not resolve the problem, reboot each Management controller in turn from the Powervault Array Web UI.

In the Powervault Manager, click the System area in the upper menu bar and select the Restart System option.  A dialog like this appears:

https://api-broadcom-ca.wolkenservicedesk.com/attachment/get_attachment_content?uniqueFileId=YbYhE2xg0DfieT+0kov4jA==

Choose the options as above and then click OK.

This has resolved the current, voltage, and fan sensor errors on several VA012 powervault storage arrays.

 

To reduce your time to closure and expedite the part replacement, please include a full shipping address, a contact name, and a phone number.  Also, let support know if you need a field tech to replace the part.

Additional Information

You may also SSH into the storage array and run show commands to see the status of the array.  A help command is also available.