Event: DISK_STATUS Message: Powervault System Health is still occurring

Article ID: 221199

Products

Security Analytics

Issue/Introduction

This message indicates that a problem may still exist with the storage arrays or their controllers.

In the case described here, the web management GUI console of Storage Controller A was reachable, and it showed that Storage Controller B had failed. The same message can also be raised for hard disks and power supplies.

Disk related messages may appear in a pink banner when accessing the web GUI console of the SA collection device.


Cause

A single controller in a redundant pair has failed. The failure is reported as an error message in /var/log/messages and as a pink banner in the Security Analytics UI.

Environment

Single redundant controller failure in a U144 storage array. The U144 is also known as ME4 or ME4012.

Troubleshooting steps to take:

Analysed data
///disk.health.2021-07-22TKST_112703.log
Hardware : grep HARDWARE /var/log/messages
Hardware : 2021-07-22T01:00:47+09:00 ds5ae8 POWERVAULT[452793]: snlog: sn="78:ac:44:34:5a:e8" id="DS" m="23" c="6" event="DISK_STATUS" category="HARDWARE" ip="10.10.100.69" model="R640xl" msg="Powervault System Health: A subcomponent of this component is unhealthy.; "
Hardware : 2021-07-22T01:00:48+09:00 ds5ae8 POWERVAULT[452814]: snlog: sn="78:ac:44:34:5a:e8" id="DS" m="23" c="6" event="DISK_STATUS" category="HARDWARE" ip="10.10.100.69" model="R640xl" msg="Powervault System Health: A subcomponent of this component is unhealthy.; "
Hardware : 2021-07-22T01:00:50+09:00 ds5ae8 POWERVAULT[452838]: snlog: sn="78:ac:44:34:5a:e8" id="DS" m="23" c="6" event="DISK_STATUS" category="HARDWARE" ip="10.10.100.69" model="R640xl" msg="A root expander experienced a communication fault. (enclosure 0, path: B, errorCode: 156 ); "
Hardware : 2021-07-22T01:00:50+09:00 ds5ae8 POWERVAULT[452839]: snlog: sn="78:ac:44:34:5a:e8" i
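When /var/log/messages contains many of these entries, it can help to reduce them to the distinct messages. The following is a minimal Python sketch, not part of the product: it assumes the snlog lines keep the key="value" layout shown in the excerpt above, and the function names parse_snlog and disk_status_messages are hypothetical helpers introduced only for illustration. Truncated lines that no longer carry an event field are skipped.

```python
import re

# Matches key="value" pairs as they appear in the snlog lines above.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_snlog(line):
    """Return the key="value" fields of one snlog line as a dict.

    Returns None when the line has no "snlog:" payload.
    """
    _, sep, payload = line.partition("snlog:")
    if not sep:
        return None
    return dict(FIELD_RE.findall(payload))


def disk_status_messages(lines):
    """Collect the distinct msg values of DISK_STATUS events, in first-seen order."""
    seen = []
    for line in lines:
        fields = parse_snlog(line)
        if fields and fields.get("event") == "DISK_STATUS":
            # Drop the trailing "; " that the log appends to each message.
            msg = fields.get("msg", "").strip(" ;")
            if msg and msg not in seen:
                seen.append(msg)
    return seen


if __name__ == "__main__":
    # Assumes the script runs on the appliance with read access to the log.
    with open("/var/log/messages", errors="replace") as log:
        for message in disk_status_messages(log):
            print(message)
```

Run against the excerpt above, this would reduce the repeated "Powervault System Health" entries to one line and list the root expander communication fault separately.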

Resolution

There are several ways to obtain the current status of a U144 or U840 storage array. The pv_monitor.sh command was created to report the current status of the storage arrays. Running pv_monitor.sh -S collects the current configuration and status of all attached storage arrays and produces a support bundle file with a name similar to /home/powervault_logs_2021-09-06_07:15.zip.

Submit this bundle to technical support so they can determine the failure and arrange for a replacement. The bundle contains a file named store.zip, which is used to see the errors and current status.
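Before submitting the bundle, you may want to confirm that it is a readable zip archive and that store.zip is present. The following is a minimal Python sketch using only the standard zipfile module; the helper names bundle_members and has_store_zip are hypothetical, and the bundle path is only the example name from this article, so substitute the file that was actually produced.

```python
import zipfile


def bundle_members(bundle):
    """Return the member names of a support bundle (path or file-like object)."""
    with zipfile.ZipFile(bundle) as zf:
        return zf.namelist()


def has_store_zip(bundle):
    """Return True if any member of the bundle has the base name store.zip."""
    return any(
        name.rsplit("/", 1)[-1] == "store.zip"
        for name in bundle_members(bundle)
    )


if __name__ == "__main__":
    # Example file name from this article; replace it with your own bundle.
    path = "/home/powervault_logs_2021-09-06_07:15.zip"
    print("store.zip present:", has_store_zip(path))
```

If the file is missing or is not a valid archive, zipfile raises an exception, which is itself a sign that the bundle should be collected again.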

pv_monitor.sh --show disks
shows the current status of all disks in the storage array. For disks internal to the sensor, use 'lsi-show' instead.

A few examples:

pv_monitor.sh --show
usage: powervault_monitor.py [-h] [-H] [-I] [-S] [-F] [-l] [-u USER]
                             [-p PASSWORD] [-i IP] [-s SHOW] [-C CONFIG_FILE]
powervault_monitor.py: error: argument -s/--show: expected one argument

pv_monitor.sh --show help
Connecting to https://10.7.18.12
Session Established with https://10.7.18.12
Gathering External Storage Configuration

 

To reduce your time to closure, please include the full shipping address, contact name, and phone number so that the part replacement can be expedited. Also, let support know if you need a field technician to replace the part.

Additional Information

You may also SSH into the storage array and run show commands to see its status. A help command is also available.