How to diagnose failed hard drives in Security Analytics


Article ID: 168254


Products

Security Analytics

Issue/Introduction

Security Analytics (Solera) appliances come in one of two main configurations: a standalone appliance, or a head unit with one or more external storage enclosures.  This article helps you determine whether you have a hard drive problem and, if so, what information to gather before contacting Blue Coat Technical Support.

Symptoms of a possible hard drive issue include the following:

1.  Predictive Failure in the audit log.
2.  Loud audible alarm coming from the appliance.
3.  Amber or red indicator light on the hard drive.

Resolution

Because Security Analytics appliances use redundant RAID configurations (RAID 1 and RAID 5), a drive can often fail without any loss of functionality.  If a drive does fail, you may receive a syslog or email alert, depending on how you have configured the communication logging settings.  If you suspect a drive error, use the following methods to confirm it:

1.  Use the Audit Log

From within the GUI, go to Settings and then Audit Log.  To show only the hardware messages, enter "category=Hardware" in the Advanced Filter section.  Then search for "Failure", "Degraded", "Failed", or "Predictive".  If you find any of these messages, support will need the full text of each message.  Below is an example:

Adapter 0; seqNum: 0x0000603e; Time: Sat Nov 22 03:18:24 2014; Event Description: Predictive failure: PD 06(e0x20/s6); Device ID: 6; Enclosure Index: 32; Slot Number: 6;
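When you are reading many such messages, it can help to pull out just the fields that identify the drive.  The following is a minimal sketch, not a supported tool: the function name parse_hw_event is hypothetical, and it assumes each event is a single line of semicolon-separated "key: value" pairs, as in the example above.

```shell
# parse_hw_event (hypothetical helper): read one audit-log hardware event
# line on stdin and print only the fields that locate the affected drive.
parse_hw_event() {
  # Put each semicolon-separated field on its own line, trim leading
  # spaces, then keep only the fields of interest.
  tr ';' '\n' \
    | sed -e 's/^ *//' \
    | grep -E '^(Time|Event Description|Enclosure Index|Slot Number):'
}

# The example message from this article.
event='Adapter 0; seqNum: 0x0000603e; Time: Sat Nov 22 03:18:24 2014; Event Description: Predictive failure: PD 06(e0x20/s6); Device ID: 6; Enclosure Index: 32; Slot Number: 6;'

printf '%s\n' "$event" | parse_hw_event
```

For the example message, this prints the Time, Event Description, Enclosure Index, and Slot Number fields, one per line.  Support will still need the full, unmodified message.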

2.  Use 'lsi-show' to determine which HDD has failed.

The 'lsi-show' command queries the RAID controllers and provides a detailed listing of all physical and logical units, regardless of the current hardware state.  A sample lsi-show output can be found in the "Additional Information" section below.  If everything is healthy, all Logical Drives show an "Optimal" status and all Physical Drives show as "Online".  If a drive has failed, it will show up in this report.

NOTE:  If you don't have root access from the CLI, you can SSH in as the 'admin' user and run 'sudo lsi-show'.  The lsi-show.txt file is also included in the CSR (generated from the Settings > System page).
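One way to review a long report is to save it to a file and list the lines that mention a failure keyword.  The following is a rough sketch under stated assumptions: the function name flag_bad_lines is hypothetical, and because the exact lsi-show report layout is not reproduced here, it matches only on the keywords named in this article.  Treat any match as a line to read in context, not as a confirmed failure.

```shell
# flag_bad_lines FILE (hypothetical helper): print, with line numbers,
# every line of FILE that mentions a failure keyword (case-insensitive).
flag_bad_lines() {
  grep -inE 'fail|degraded|predictive' "$1"
}

# Capture a report only on a system where lsi-show is actually installed.
if command -v lsi-show >/dev/null 2>&1; then
  if [ "$(id -u)" -eq 0 ]; then
    lsi-show > lsi-show.txt        # already root
  else
    sudo lsi-show > lsi-show.txt   # e.g. logged in as the 'admin' user
  fi
  flag_bad_lines lsi-show.txt || echo "No failure keywords found in lsi-show.txt"
fi
```

Keep the complete lsi-show.txt file regardless of what the filter prints, since support needs the full output.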

Have the lsi-show output and any pertinent audit log messages ready before calling Blue Coat Technical Support to submit an RMA for a replacement drive.  If you have Dell hardware, we will also need the Dell Service Tag, which can be found on the physical appliance itself or by running the following command from the CLI as the root user:   dmidecode -s system-serial-number

If the failed drive is in one of the externally attached Dell storage enclosures, we will need the Service Tag from that storage enclosure.  This Service Tag is not stored in the software and must be obtained from the physical enclosure itself.
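The Service Tag details can be gathered into a short summary to keep with the lsi-show output.  The following is a minimal sketch: the function name collect_support_info and the file name support-info.txt are hypothetical, and the only appliance command it relies on is the dmidecode invocation given above, which needs root to return a value.

```shell
# collect_support_info (hypothetical helper): print the head-unit Service
# Tag, plus a reminder about the enclosure tag, for a support request.
collect_support_info() {
  echo "Collected: $(date)"
  if command -v dmidecode >/dev/null 2>&1; then
    # Returns the Dell Service Tag on Dell hardware when run as root.
    echo "Service Tag (head unit): $(dmidecode -s system-serial-number 2>/dev/null)"
  else
    echo "Service Tag (head unit): dmidecode not found, read it from the chassis"
  fi
  # The enclosure tag is not stored in software, so it cannot be queried.
  echo "Service Tag (storage enclosure): read it from the physical enclosure"
}

collect_support_info
```

To save the summary, redirect the output, for example: collect_support_info > support-info.txt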