Identifying the HDD Errors
This document defines a standardized Return Maintenance Authorization (RMA) process that customers can follow when a hard drive issue or error warrants an RMA for Symantec Edge SWG (ProxySG) and Symantec Advanced Secure Gateway (ASG).
There are two types of error messages displayed by SGOS for a bad Hard Disk Drive (HDD):
1. status: bad
2. status: bad (failing)
- This error indicates that a drive has failed and experienced a SMART trip. It can occur if a drive responded successfully to the initialization queries but failed the first read request. The incomplete initial read request causes SGOS to mark the drive as bad and triggers a SMART trip on the drive.
This can be observed in the Management Console, Sysinfo, and Event Log.
Sysinfo Example
Hardware Information
URL_Path /Diagnostics/Hardware/Info
Hardware Information
Number of physical CPUs: 1
Number of cores: 2
CPU frequency: 2600 MHz
Storage: 4 drives
Disk in slot 1: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97YSW status: present
Disk in slot 2: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97Y6L status: present
Disk in slot 3: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97Y6P status: present
Disk in slot 4: 0 GB SEAGATE ST3500620SS, rev:0003 serial:9QMBWVSF status: bad
Disk in slot 5: empty
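When many appliances need checking, the disk lines of a saved Sysinfo can be scanned with a short script. The following is a minimal sketch only: the line layout is inferred from the example above, and the names parse_disks and disks_needing_attention are hypothetical helpers, not part of SGOS or any Symantec tool.

```python
import re

# Layout of a "Disk in slot N" line, inferred from the Sysinfo example above.
# This pattern is an assumption, not a documented SGOS format.
DISK_LINE = re.compile(
    r"Disk in slot (?P<slot>\d+):\s+"
    r"(?:(?P<empty>empty)"
    r"|(?P<size_gb>\d+) GB (?P<model>.+?)\s*, rev:(?P<rev>\S+)"
    r"\s+serial:(?P<serial>\S+)\s+status:\s*(?P<status>.+?))\s*$"
)


def parse_disks(sysinfo_text):
    """Return one dict per 'Disk in slot' line found in the text."""
    disks = []
    for line in sysinfo_text.splitlines():
        match = DISK_LINE.search(line)
        if match is None:
            continue
        if match.group("empty"):
            disks.append({"slot": int(match.group("slot")), "status": "empty"})
            continue
        disks.append({
            "slot": int(match.group("slot")),
            "size_gb": int(match.group("size_gb")),
            "model": match.group("model"),
            "serial": match.group("serial"),
            "status": match.group("status"),
        })
    return disks


def disks_needing_attention(disks):
    """Select disks reported as 'bad' or carrying the '(failing)' suffix."""
    return [d for d in disks
            if d["status"].startswith("bad") or "(failing)" in d["status"]]


SAMPLE = """\
Disk in slot 3: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97Y6P status: present
Disk in slot 4: 0 GB SEAGATE ST3500620SS, rev:0003 serial:9QMBWVSF status: bad
Disk in slot 5: empty
"""

for disk in disks_needing_attention(parse_disks(SAMPLE)):
    print(disk["slot"], disk["serial"], disk["status"])
```

The slot number and serial number printed for each flagged disk are the details shown on the same line of the Sysinfo, which makes it easier to match the report to the physical drive.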
• Event Log Example
2013-12-20 19:11:23-00:00UTC "Health Monitor (CRITICAL): Disk 4 Status is 'error'" 9 6F0102:1 notifier.cpp:112
Starting with SGOS 6.x, if SMART is triggered on an HDD, the operating system reports the following warning: status: present (failing)
This warning represents a predicted failure and indicates imminent drive failure. Depending on which attribute caused SMART to trip, the drive may continue to function normally, or it may fail within a short period of time. Unfortunately, there is no way to accurately predict how long the drive will last before it fails completely.
In this case, please contact Symantec and initiate the RMA process for a replacement HDD.
NOTE: The warning status: offline (failing) means the same as status: present (failing), except that the drive has been manually taken offline using the CLI command:
ProxySG# disk offline <disk number>
The following examples show what the Management Console, Sysinfo, and Event Log display when SMART has triggered and SGOS is reporting the warning "status: present (failing)".
Sysinfo Example A - Disk Status
Hardware Information
URL_Path /Diagnostics/Hardware/Info
Hardware Information
Number of physical CPUs: 1
Number of cores: 2
CPU frequency: 2600 MHz
Storage: 4 drives
Disk in slot 1: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97YSW status: present
Disk in slot 2: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97Y6L status: present
Disk in slot 3: 500 GB SEAGATE ST3500620SS, rev:0003 serial:9QM97Y6P status: present
Disk in slot 4: 0 GB SEAGATE ST3500620SS, rev:0003 serial:9QMBWVSF status: present (failing)
Disk in slot 5: empty
Sysinfo Example B - SCSI Hardware Errors
Sysinfo Example C - SCSI Software Errors
• Event Log Example
2013-07-26 23:30:58-00:00UTC "Health Monitor (WARNING): Disk 4 Status is 'present (failing)'"
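The Health Monitor disk entries can also be pulled out of an exported Event Log. The sketch below assumes the line layout of the two Event Log examples in this document; the pattern and the function name disk_health_events are illustrative assumptions, not a documented log format or API.

```python
import re

# Pattern inferred from the two Event Log examples in this document.
# It is an assumption; other SGOS releases may format entries differently.
HEALTH_EVENT = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})\S*\s+.*?"
    r"Health Monitor \((?P<severity>[A-Z]+)\): "
    r"Disk (?P<disk>\d+) Status is '(?P<status>[^']*)'"
)


def disk_health_events(log_text):
    """Return the Health Monitor disk status events found in the log text."""
    events = []
    for line in log_text.splitlines():
        match = HEALTH_EVENT.search(line)
        if match is None:
            continue
        events.append({
            "date": match.group("date"),
            "time": match.group("time"),
            "severity": match.group("severity"),
            "disk": int(match.group("disk")),
            "status": match.group("status"),
        })
    return events


SAMPLE_LOG = (
    "2013-12-20 19:11:23-00:00UTC \"Health Monitor (CRITICAL): "
    "Disk 4 Status is 'error'\" 9 6F0102:1 notifier.cpp:112\n"
    "2013-07-26 23:30:58-00:00UTC \"Health Monitor (WARNING): "
    "Disk 4 Status is 'present (failing)'\"\n"
)

for event in disk_health_events(SAMPLE_LOG):
    print(event["date"], event["severity"], event["disk"], event["status"])
```

Because the pattern does not depend on the exact quote character before "Health Monitor", it tolerates logs that were copied through tools which alter quotation marks.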
NOTE: The messages that SGOS reports regarding SMART triggers are not persistent across reboots.
Consider the case where SMART was triggered because the average temperature of the HDD was too high, and SGOS reported the warning "status: present (failing)". This problem could be corrected by checking the surrounding environment, confirming that there is adequate ventilation, and verifying that all of the Proxy's fans are fully functioning. After a reboot of the Proxy, when SGOS polls the HDD, it would no longer receive the SMART trigger to report, because the temperature would now be within an acceptable range. Please be aware that, at this time, it is not possible to get additional details from within SGOS about which specific attribute or attributes caused the SMART trip.
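Before Symantec can RMA a hard drive, we require certain checks to be carried out. These checks are required to make sure that it is the drive that is faulty and not the slot/backplane of the proxy itself. In the example below, the drives in slots 7 and 8 were swapped, and the "bad" status followed the drive with serial 1250A096FRC5 from slot 8 to slot 7.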
========================== Before Swap ==================================
Symantec Systems, Inc., ProxySG Appliance Version Information
Version: SGOS 6.5.4.4
Release id: 143614
UI Version: 6.5.4.4 (143614)
__________________________________________________________________________
Model: 9000-20B
Storage: 8 drives
Disk in slot 1: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A05IFM16 status: present
Disk in slot 2: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A091FRC5 status: present
Disk in slot 3: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A084FRC5 status: present
Disk in slot 4: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A02GFRC5 status: present
Disk in slot 5: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A04XFM16 status: present
Disk in slot 6: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1270A010FM16 status: present
Disk in slot 7: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A048FRC5 status: present
Disk in slot 8: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A096FRC5 status: bad
========================== After Swap ==================================
Symantec, Inc., ProxySG Appliance System Information
Version: SGOS 6.5.4.4
Release id: 143614
UI Version: 6.5.4.4 (143614)
__________________________________________________________________________
Model: 9000-20B
Storage: 8 drives
Disk in slot 1: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A05IFM16 status: present
Disk in slot 2: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A091FRC5 status: present
Disk in slot 3: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A084FRC5 status: present
Disk in slot 4: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A02GFRC5 status: present
Disk in slot 5: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A04XFM16 status: present
Disk in slot 6: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1270A010FM16 status: present
Disk in slot 7: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A096FRC5 status: bad
Disk in slot 8: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A048FRC5 status: offline
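The before and after listings can be compared by serial number to see whether a fault moved with the drive. This is a rough sketch under stated assumptions: the line layout is taken from the listings above, the function names are hypothetical, and the wording of the verdicts is an interpretation for illustration rather than an official Symantec diagnostic rule.

```python
import re

# Slot, serial, and status of one disk line; layout inferred from the listings above.
DISK = re.compile(r"Disk in slot (\d+):.*?serial:(\S+)\s+status:\s*(.+?)\s*$")


def disks_by_serial(text):
    """Map each drive serial number to its (slot, status) pair."""
    result = {}
    for line in text.splitlines():
        match = DISK.search(line)
        if match:
            slot, serial, status = match.groups()
            result[serial] = (int(slot), status)
    return result


def is_faulty(status):
    """Treat 'bad' and any '(failing)' status as a fault (an assumption)."""
    return status.startswith("bad") or "(failing)" in status


def swap_verdicts(before_text, after_text):
    """For each drive that was faulty before the swap, report what happened."""
    before = disks_by_serial(before_text)
    after = disks_by_serial(after_text)
    verdicts = {}
    for serial, (old_slot, old_status) in before.items():
        if not is_faulty(old_status) or serial not in after:
            continue
        new_slot, new_status = after[serial]
        if new_slot == old_slot:
            verdicts[serial] = "drive was not moved"
        elif is_faulty(new_status):
            verdicts[serial] = "fault followed the drive"
        else:
            # The drive looks healthy elsewhere; check who sits in its old slot.
            occupants = [st for sl, st in after.values() if sl == old_slot]
            if occupants and is_faulty(occupants[0]):
                verdicts[serial] = "fault stayed with the slot"
            else:
                verdicts[serial] = "inconclusive"
    return verdicts


BEFORE = """\
Disk in slot 7: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A048FRC5 status: present
Disk in slot 8: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A096FRC5 status: bad
"""
AFTER = """\
Disk in slot 7: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1250A096FRC5 status: bad
Disk in slot 8: 1000 GB TOSHIBA MK1001TRKB , rev:0106 serial:1260A048FRC5 status: offline
"""

print(swap_verdicts(BEFORE, AFTER))
```

Keying the comparison on the serial number rather than the slot number is what makes the swap visible, since the slot numbers are identical in both listings.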
For RMAs that require the return of defective hardware, you will be contacted if the hardware is not returned in a timely manner. Customers may also contact Broadcom for return instructions at [email protected]