How to diagnose failed hard drives in Security Analytics

Article ID: 168254

Products

Security Analytics

Issue/Introduction

The Security Analytics (Solera) appliances come in one of two main configurations: a standalone appliance, or a head unit with one or more external storage enclosures.  This article helps you identify whether you have a hard drive problem and, if so, what information to gather before contacting Technical Support.

Symptoms that may indicate a hard drive issue:

  • An email or syslog event indicating a hardware failure.
  • A pink banner displayed in the UI indicating a disk failure.
  • A predictive failure in the audit log.
  • A loud audible alarm coming from the appliance.
  • An amber or red indicator light on the hard drive.

Resolution

Because the Security Analytics appliances use redundant RAID configurations (RAID 1 and RAID 5), a drive can often fail without any loss of functionality.  If a drive does fail, you may get a syslog or email alert, depending on how you have configured the communication logging settings.  If you suspect a drive error, use the following methods to confirm:

1.  Use the Audit Log

In the GUI, go to Settings and then Audit Log.  To show only the hardware messages, enter "category=Hardware Events" in the Advanced Filter section.  Then search for "Failure", "Degraded", "Failed" or "Predictive".  If you find any of these messages, support will need the full text of the error.  Below is an example:

Adapter 0; seqNum: 0x0000603e; Time: Sat Nov 22 03:18:24 2014; Event Description: Predictive failure: PD 06(e0x20/s6); Device ID: 6; Enclosure Index: 32; Slot Number: 6;
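
When there are many such messages, the adapter, enclosure and slot can be pulled out with standard text tools so they can be matched against the E:S column of the lsi-show output.  The following is a minimal sketch, assuming the messages have been saved as text in the semicolon-separated layout shown above; the function name parse_hw_event is illustrative and is not part of the product.

```shell
# parse_hw_event (hypothetical helper): read audit log hardware messages on
# stdin and print the adapter, enclosure:slot and event description of each.
# Assumes the "Key: value;" layout of the example message above.
parse_hw_event() {
    awk -F'; *' '
    /Event Description:/ {
        adapter = encl = slot = desc = ""
        for (i = 1; i <= NF; i++) {
            # sub() strips the label and returns 1 only when the label matched
            if (sub(/^ *Adapter /, "", $i))               adapter = $i
            else if (sub(/^Enclosure Index: /, "", $i))   encl = $i
            else if (sub(/^Slot Number: /, "", $i))       slot = $i
            else if (sub(/^Event Description: /, "", $i)) desc = $i
        }
        printf "adapter=a%s es=%s:%s event=%s\n", adapter, encl, slot, desc
    }'
}

# Usage: parse_hw_event < saved-audit-messages.txt
```

The full, unmodified message should still be sent to support; the summary is only a convenience for locating the drive.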

2.  Use 'lsi-show' to determine which HDD has failed.

The 'lsi-show' command queries the RAID controllers and provides a detailed listing of all physical and logical units regardless of the current hardware state.  A sample lsi-show output can be found in the "Additional Information" section below.  If everything is healthy, all Logical Drives will show an "Optimal" status and all Physical Drives will show as "Online".  If a drive has failed, it will show up in this report.  Be aware that lsi-show does not provide hard drive information for large storage arrays such as the Dell MD3860, the NetApp E5600 or the Dell Seagate storage arrays.

NOTE:  If you don't have root access from the CLI, you can SSH in as the 'admin' user and run 'sudo lsi-show'.  The lsi-show.txt file is also part of the CSR (generated by going to Settings > System > Download CSR).  Generating the CSR may take 5-15 minutes, and the file may be from 100MB to 1GB+ in size.

  • Drives marked Failed or "Unconfigured(bad)" will need to be replaced.
  • Drives marked Unconfigured(good) have not been allocated to a virtual disk.  They could be spun up (spinning at 7,200 RPM) or spun down to save power.
  • Drives marked "Unconfigured(good), Spun Up    Foreign" have become disconnected and will need to be imported.  Once imported, the virtual disk will show as Optimal or Degraded.
  • Virtual disks, or Volumes, marked Offline have more than one failed disk and will need to be rebuilt.
  • Virtual disks marked Degraded are still running but are no longer redundant.  A disk has failed and should be replaced as soon as possible.
  • Virtual disks marked Optimal are redundant and will continue to function if one disk fails.
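
The status rules above can be applied mechanically to a saved lsi-show.txt file.  The following is a rough sketch, assuming the column layout of the sample output in this article (physical drive rows carry an Enclosure:Slot value in the second column, virtual disk rows carry a size and unit).  The function name lsi_triage and the action labels are illustrative only.

```shell
# lsi_triage (hypothetical helper): read lsi-show output on stdin and print
# one line for every drive or virtual disk that matches the rules above.
lsi_triage() {
    awk '
    # Physical drive rows: second column is Enclosure:Slot, e.g. 32:6
    $2 ~ /^[0-9]+:[0-9]+$/ {
        if ($3 == "Failed" || $0 ~ /Unconfigured\(bad\)/)
            print "REPLACE: drive " $2
        else if ($0 ~ /Unconfigured\(good\)/ && $0 ~ / Foreign /)
            print "IMPORT: drive " $2
        else if ($0 ~ /Unconfigured\(good\)/)
            print "UNUSED: drive " $2
        next
    }
    # Virtual disk rows: Id, size, unit, status
    $1 ~ /^[0-9]+$/ && $3 ~ /^[KMGT]B$/ {
        if ($4 == "Offline")
            print "REBUILD: virtual disk " $1 " (" $2 " " $3 ")"
        else if ($4 == "Degraded")
            print "DEGRADED: virtual disk " $1 " (" $2 " " $3 ")"
    }'
}

# Usage: sudo lsi-show | lsi_triage     or     lsi_triage < lsi-show.txt
```

Healthy rows (Online drives, Optimal virtual disks) produce no output, so an empty result suggests nothing in the report needs attention.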

The lsi-show output includes the following -

  • Controller number in the Controller Information paragraph, under the label Id.
    • Id a0 is adapter 0, which manages the drives internal to the sensor.  Adapters a1 and a2 manage the drives installed in JBOD enclosures external to the sensor.
  • Controller type under Product Name.
    • Controllers with the label PERC are Dell controllers.  All others are BlueCoat controllers in S500 servers.  There are typically two controllers/adapters for external drives and only one for internal drives.
  • Enclosure Information includes the enclosure Id and the list of drives in that enclosure.
    • In the example below, the enclosure Id for adapter a0 is 32.  Each adapter may have from 0 to 3 enclosures attached.
  • Logical Drive Information includes the details for each logical/virtual disk.
    • They are numbered sequentially and include size, status and other information.  A status of Optimal is the only healthy status.
  • Physical Drive Information includes the enclosure Id, drive slot, drive status, size, and other manufacturer information.
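
Because the report repeats these paragraphs once per controller, a per-controller count gives a quick overview before reading the detail.  The following is a minimal sketch, assuming controller rows start with an Id such as a0 and that the row layouts match the sample in this article.  The function name lsi_summary is illustrative.  Note that drives marked Unconfigured(good) are counted as "not Online" even though they are not faulty.

```shell
# lsi_summary (hypothetical helper): read lsi-show output on stdin and print
# one summary line per controller.
lsi_summary() {
    awk '
    function report() {
        if (ctrl != "")
            printf "%s: %d virtual disk(s), %d not Optimal; %d drive(s), %d not Online\n",
                   ctrl, ld, ldbad, pd, pdbad
    }
    # Controller row, e.g. "a1   76N00DW ... PERC H830 Adapter"
    $1 ~ /^a[0-9]+$/ { report(); ctrl = $1; ld = ldbad = pd = pdbad = 0; next }
    # Physical drive row: second column is Enclosure:Slot
    $2 ~ /^[0-9]+:[0-9]+$/ { pd++; if ($3 !~ /^Online/) pdbad++; next }
    # Virtual disk row: Id, size, unit, status
    $1 ~ /^[0-9]+$/ && $3 ~ /^[KMGT]B$/ { ld++; if ($4 != "Optimal") ldbad++ }
    END { report() }'
}

# Usage: sudo lsi-show | lsi_summary     or     lsi_summary < lsi-show.txt
```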

For example -

Controller Information
Id   Serial No        Firmware             Encl LD   PD   Product Name
a0   76102I4          4.270.00-8168        1    2    26   PERC H730P Mini

Enclosure Information
Id   Drives
32   0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25 

Logical Drive Information
Id   Size         Status       Stripe   Raid Level   Cache
0    5.456 TB     Optimal      64 KB    Primary-5    WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
  by id: 22,23,24,25
  by es: 32:22,32:23,32:24,32:25
1    34.557 TB    Optimal      64 KB    Primary-5    WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
  by id: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
  by es: 32:0,32:1,32:2,32:3,32:4,32:5,32:6,32:7,32:8,32:9,32:10,32:11,32:12,32:13,32:14,32:15,32:16,32:17,32:18,32:19

Physical Drive Information
Id   E:S      Status                         FC       Size           Inquiry
0    32:0     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BL9A
1    32:1     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYFM
2    32:2     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460B4AZ
3    32:3     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W4609PV7
4    32:4     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYL4
5    32:5     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY9N
6    32:6     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYL5
7    32:7     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY3E
8    32:8     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460JEAL
9    32:9     Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460E43S
10   32:10    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY89
11   32:11    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W4608FTB
12   32:12    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY9Z
13   32:13    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYC1
14   32:14    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460B3FB
15   32:15    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY2Z
16   32:16    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W4609Q7S
17   32:17    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY7R
18   32:18    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYAN
19   32:19    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYGV
20   32:20    Unconfigured(good), Spun Up    None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BY95
21   32:21    Unconfigured(good), Spun Up    None     1.819 TB       SEAGATE ST2000NX0453    NSF1W46094DW
22   32:22    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W4608TJ8
23   32:23    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYL1
24   32:24    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYBV
25   32:25    Online, Spun Up                None     1.819 TB       SEAGATE ST2000NX0453    NSF1W460BYD5

Controller Information
Id   Serial No        Firmware             Encl LD   PD   Product Name
a1   76N00DW          4.270.00-8168        2    2    24   PERC H830 Adapter

Enclosure Information
Id   Drives
0    1   4   5   2   7   9   8   11  10  12  3   6  
13   17  18  16  15  23  21  20  19  24  25  14  22 

Logical Drive Information
Id   Size         Status       Stripe   Raid Level   Cache
0    40.019 TB    Optimal      64 KB    Primary-5    WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
  by id: 1,4,5,2,7,9,8,11,10,12,3,6
  by es: 0:1,0:2,0:3,0:4,0:5,0:6,0:7,0:8,0:9,0:10,0:11,0:12
1    40.019 TB    Optimal      64 KB    Primary-5    WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
  by id: 17,18,16,15,23,21,20,19,24,25,14,22
  by es: 13:1,13:2,13:3,13:4,13:5,13:6,13:7,13:8,13:9,13:10,13:11,13:12

Physical Drive Information
Id   E:S      Status                         FC       Size           Inquiry
1    0:1      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ6Z8Y
4    0:2      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYKEDY
5    0:3      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7WJY
2    0:4      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7U3Y
7    0:5      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWHAEY
9    0:6      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7X2Y
8    0:7      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYU0UY
11   0:8      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWGY4Y
10   0:9      Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7T0Y
12   0:10     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWVUGY
3    0:11     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ720Y
6    0:12     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GXA6TY
17   13:1     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWUBUY
18   13:2     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7T2Y
16   13:3     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYS27Y
15   13:4     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWE6GY
23   13:5     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GX5M9Y
21   13:6     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GVSG6Y
20   13:7     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GW1UKY
19   13:8     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYTWPY
24   13:9     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GVUG5Y
25   13:10    Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GVSZRY
14   13:11    Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GVUZJY
22   13:12    Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYS1KY

Controller Information
Id   Serial No        Firmware             Encl LD   PD   Product Name
a2   76N005T          4.270.00-8168        2    2    24   PERC H830 Adapter

Enclosure Information
Id   Drives
0    1   2   4   5   7   8   9   10  11  12  3   6  
13   15  14  17  18  23  22  16  19  24  20  21  25 

Logical Drive Information
Id   Size         Status       Stripe   Raid Level   Cache
0    40.019 TB    Offline      64 KB    Primary-5    WriteThrough, ReadAheadNone, Direct, Write Cache OK if Bad BBU
  by id: 1,2,4,5,7,8,9,10,11,12,3,6
  by es: 0:1,0:2,0:3,0:4,0:5,0:6,0:7,0:8,0:9,0:10,0:11,0:12
1    40.019 TB    Optimal      64 KB    Primary-5    WriteThrough, ReadAheadNone, Direct, Write Cache OK if Bad BBU
  by id: 15,14,17,18,23,22,16,19,24,20,21,25
  by es: 13:1,13:2,13:3,13:4,13:5,13:6,13:7,13:8,13:9,13:10,13:11,13:12

Physical Drive Information
Id   E:S      Status                         FC       Size           Inquiry
1    0:1      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GW7VGY
2    0:2      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYS2DY
4    0:3      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ786Y
5    0:4      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7T5Y
7    0:5      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GXHPLY
8    0:6      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYS2UY
9    0:7      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7T3Y
10   0:8      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ78NY
11   0:9      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWHJHY
12   0:10     Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWHBTY
3    0:11     Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYGNZY
6    0:12     Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GY4HNY
15   13:1     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWP35Y
14   13:2     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWGYNY
17   13:3     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYMETY
18   13:4     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ8RXY
23   13:5     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GX8MLY
22   13:6     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWH3VY
16   13:7     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWHE2Y
19   13:8     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ76VY
24   13:9     Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GYTYKY
20   13:10    Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GZ7ADY
21   13:11    Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GXVRSY
25   13:12    Online, Spun Up                None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GWH8JY
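
In the sample above, the "by es:" line under each logical drive appears to list its member drives as Enclosure:Slot pairs, matching the E:S column of the physical drive table.  Under that assumption, the following sketch finds which virtual disk a given drive belongs to.  The function name lsi_vd_for_drive is illustrative, and the controller Id is required because the same E:S value can occur on more than one controller.

```shell
# lsi_vd_for_drive (hypothetical helper): given a controller Id and an
# Enclosure:Slot pair, read lsi-show output on stdin and report the virtual
# disk whose "by es:" list contains that drive.
lsi_vd_for_drive() {
    awk -v want_ctrl="$1" -v want_es="$2" '
    # Remember the controller and virtual disk currently being read
    $1 ~ /^a[0-9]+$/ { ctrl = $1; next }
    $1 ~ /^[0-9]+$/ && $3 ~ /^[KMGT]B$/ { vd = $1; status = $4; next }
    # Member list of the current virtual disk, e.g. "by es: 0:1,0:2"
    ctrl == want_ctrl && $1 == "by" && $2 == "es:" {
        n = split($3, member, ",")
        for (i = 1; i <= n; i++)
            if (member[i] == want_es)
                print ctrl " virtual disk " vd " is " status " and contains drive " want_es
    }'
}

# Usage: lsi_vd_for_drive a2 0:1 < lsi-show.txt
```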

 

A few examples of drive and virtual disk statuses:

The drive is not being used and is not spinning -

37   35:2     Unconfigured(good), Spun down  None     3.638 TB       HGST    HUS726T4TAL5205 D40HV6GA2U2R

 

The drive is not being used and is spinning -

20   32:20    Unconfigured(good), Spun Up    None     1.819 TB       SEAGATE ST2000NX0453    NSF1S460P2FE

The virtual disk has a failed component -

0    931.0 GB     Degraded     64 KB    Primary-1    WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU

The virtual disk has failed and is not functional, most likely because two of its drives have failed -

2    4.545 TB     Offline      64 KB    Primary-5    WriteBack, ReadAdaptive, Direct, Write Cache OK if Bad BBU

This drive is still recognized but has failed -

1    0:1      Failed                         None     3.638 TB       HGST    HUS726040AL5215 D7J0N8GW7VGY

The drive has been detached and re-attached, so it is flagged as Foreign.  All drives on this controller will need to be imported by controller number, for example: megacli -cfgforeign -import -a1

74   77:0     Unconfigured(good), Spun Up    Foreign  3.638 TB       TOSHIBA MG04SCA40EN     DS0737N0A1E6FVNC
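
To see which controllers have Foreign drives, the lsi-show output can be scanned and the matching import command printed for review.  The following is a minimal sketch that only prints the command in the form shown above and does not run it.  The function name lsi_foreign_import_hints is illustrative, and the parsing assumes the row layout of the sample output.

```shell
# lsi_foreign_import_hints (hypothetical helper): read lsi-show output on
# stdin and print one import command per controller that has Foreign drives.
# Nothing is executed; review the commands before running any of them.
lsi_foreign_import_hints() {
    awk '
    # Controller row, e.g. "a1   76N00DW ..."
    $1 ~ /^a[0-9]+$/ { ctrl = $1; next }
    # Physical drive row flagged Foreign, reported once per controller
    $2 ~ /^[0-9]+:[0-9]+$/ && $0 ~ / Foreign / && ctrl != "" && !(ctrl in seen) {
        seen[ctrl] = 1
        print "megacli -cfgforeign -import -" ctrl
    }'
}

# Usage: sudo lsi-show | lsi_foreign_import_hints
```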

 

Have the lsi-show output and any pertinent audit log messages ready before calling Technical Support to request an RMA for a replacement drive.  If you have Dell hardware, we will also need the Dell Service Tag, which can be found on the physical appliance itself or by running the following command from the CLI as the root user:   dmidecode -s system-serial-number

If the failed drive is in one of the Dell externally attached storage devices, we will need the Service Tag from the storage enclosure.  The serial numbers for all enclosures can be obtained with:

megacli -encinfo -aall | egrep "adapter|Device ID|Physical|Vendor Specific"
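
The commands mentioned in this article can be run together so that everything is saved to files before the case is opened.  The following is a rough sketch, assuming it is run as root on the appliance.  The function name collect_support_info, the default directory and the file names are illustrative, and grep -E is used as the equivalent of egrep.

```shell
# collect_support_info (hypothetical helper): save the output of the
# commands named in this article into one directory.
collect_support_info() {
    outdir=${1:-./sa-support-info}
    mkdir -p "$outdir" || return 1

    # Each command is skipped if it is not installed on this host.
    if command -v lsi-show >/dev/null 2>&1; then
        lsi-show > "$outdir/lsi-show.txt" 2>&1 ||
            echo "warning: lsi-show reported an error" >&2
    fi
    if command -v dmidecode >/dev/null 2>&1; then
        dmidecode -s system-serial-number > "$outdir/service-tag.txt" 2>&1 ||
            echo "warning: dmidecode failed (root access is required)" >&2
    fi
    if command -v megacli >/dev/null 2>&1; then
        megacli -encinfo -aall 2>&1 |
            grep -E "adapter|Device ID|Physical|Vendor Specific" \
            > "$outdir/enclosures.txt" ||
            echo "warning: no enclosure details captured" >&2
    fi
    echo "Saved output under $outdir"
}

# Usage: collect_support_info /tmp/sa-case
```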




When opening a new case, be sure to include the shipping address with a contact name and phone number.  Also, let us know whether you will replace the drive yourselves or want a field technician to do the work.