Failed hard drive replacement on NetApp E5660 storage array

Article ID: 252245

Products

Security Analytics

Issue/Introduction

This article describes how to replace a failed hard drive in a NetApp E5660 storage array. Diagnosing the problem is covered in other articles.

Instructions for finding the errors are in the Additional Information section.

Cause

A drive has failed and a replacement drive has arrived to be swapped in.

Resolution

Identifying and removing the drive:

  • Put on antistatic protection.
  • Unpack the drive.
    1. Set the new drive on a flat, static-free surface near the tray.
    2. Save all packing materials in case you need to return the drive.
  • Remove the bezel from the front of the tray.
  • Drawer 1 is the top drawer and Drawer 5 is the bottom.
  • Locate the failed drive by checking the Drive Drawer Service Action Required LEDs on the front of the tray.

    LED color   Result
    Amber       Fault is detected.
    Blue        You can safely remove the drive.

    Attention: Possible damage to drives. After the blue Drive Drawer Service Action Allowed LED comes on, wait approximately 30 seconds before you open the drive drawer. Waiting 30 seconds allows the drive to spin down, which prevents possible damage to the hardware. To prevent possible damage to the other spinning drives in the drive drawer, open the drive drawer slowly.
  • Release the levers on each side of the drive drawer by pulling both towards the center.
    Note: If you are replacing drives in separate drawers, you can extend only one drawer at a time.
    [Figure: callout 1 = release levers]
  • Carefully pull the extended drive drawer levers to slide the drive drawer out to its full extension, without removing it from the tray.
  • Remove the failed drive from the drive drawer.
    1. Locate the drive release lever that secures the drive handle in place. Disengage the drive release lever by carefully pulling it back to release the drive handle.
    2. Raise the drive handle to vertical.
    3. Lift the drive from the drive drawer by using the drive handle.

    4. Put the drive on a flat, static-free surface.
      Note: If the fault is with the drive drawer and not the drive, the drive can be re-used.
      Attention: If you accidentally remove an incorrect drive, wait at least 30 seconds, and then reinstall it. For the recovery procedure, refer to the storage management software.
  • If you are removing multiple drives, label each drive with the tray number, the drive drawer number, and the slot number.

Installing the new drive:

  • Wait 30 seconds for the storage management software to recognize that the drive has been removed.
  • Raise the drive handle on the new drive to the vertical position.
  • Align the two raised buttons on each side of the drive with the matching gaps in the drive channel on the drive drawer.

  • Lower the drive straight down, and then rotate the drive handle down until the drive snaps into place under the drive release lever.
  • Push the drive drawer all the way into the drive tray, and close the levers on each side of the drive drawer.
    Attention: Possible equipment malfunction. Make sure that you push both drive drawer levers to each side so that the drive drawer is completely closed. The drive drawer must be completely closed to allow proper airflow and prevent overheating.
    Note: Depending on your configuration, the controller might automatically reconstruct data to the new drive. If the controller-drive tray uses hot spares, the controller might need to perform a complete reconstruction on the hot spare before the controller copies the data to the replaced drive. This reconstruction process increases the time that is required to complete this procedure.
  • Look at the Drive Drawer Service Action Required LED. Based on the LED status, perform one of these actions:

    LED status   Result
    On           The drive might not be installed correctly, or the new drive might be defective. Remove the drive, wait 60 seconds, and reinstall the drive. If it still fails, replace it with another new drive. Go to the next step when resolved.
    Off          Go to the next step.

    Note: If the problem is not resolved, contact technical support.
  • Install the bezel on the front of the tray.
  • Verify the status of the new drive by running the following commands as root:
    1. Determine the array name with: SMcli -d
    2. Determine the health status of the array, all drives, and all volumes with: SMcli -n Your_Array_Name_Here -c 'show storagearray healthstatus; show alldrives summary; show allvolumes;'
  • Bring the new drive online by using the storage management software:
    1. In the Array Management Window, select the affected volume group, and then select the menu option Replace Drives.
    2. Select the replaced drive that corresponds to the slot location or select an appropriate replacement drive.
    3. Click the Replace Drive button.
      When the drive reconstruction completes, the volume group is in an Optimal state.
  • Check the status of all of the trays in the storage array.
  • If any component has a "Needs Attention" status, click the Recovery Guru toolbar button in the Array Management Window, and complete the recovery procedure. If the problem is not resolved, contact technical support.
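
The status check in the steps above can be wrapped in a small shell helper so the array name is typed only once. This is a minimal sketch, not part of the product: the function names health_script and check_array and the SMCLI override variable are hypothetical, and the array name must still come from SMcli -d.

```shell
#!/bin/sh
# Sketch only: the function names and the SMCLI variable are illustrative.
# Take the real array name from the output of `SMcli -d`.

# Print the script-command string used for the health check in this article.
health_script() {
    printf '%s' 'show storagearray healthstatus; show alldrives summary; show allvolumes;'
}

# Run the health check against one array.
# SMCLI may be set to another command name, for example a stub for a dry run.
check_array() {
    array_name="$1"
    if [ -z "$array_name" ]; then
        echo "usage: check_array ARRAY_NAME" >&2
        return 2
    fi
    "${SMCLI:-SMcli}" -n "$array_name" -c "$(health_script)"
}
```

After sourcing the file as root, a call such as check_array Your_Array_Name_Here runs the same three show commands in one SMcli invocation.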

To monitor and verify the logical disk rebuild onto the new drive, first get the array name:

SMcli -d

Then check the long-running operations and the health status of the array:

SMcli -n your_array_name_here -c 'show storageArray longRunningOperations;'

SMcli -n your_array_name_here -c 'show storagearray healthstatus;'
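
Because a rebuild can take a long time, the two commands above can be repeated in a polling loop. The sketch below is one possible approach under stated assumptions: the names is_optimal, wait_for_optimal, SMCLI, and OPTIMAL_PATTERN are hypothetical, the default attempt count and interval are arbitrary, and the default match pattern assumes the health status output contains the text "health status = optimal". Confirm that pattern against the output on your own system before relying on it.

```shell
#!/bin/sh
# Sketch only. Assumption: a healthy array prints a line containing
# "health status = optimal". Override OPTIMAL_PATTERN if your output differs.

# Return 0 if the text on stdin matches the optimal pattern (case-insensitive).
is_optimal() {
    grep -qi -- "${OPTIMAL_PATTERN:-health status = optimal}"
}

# Poll the array until it reports optimal or the attempts run out.
# Arguments: array name, max attempts (default 60), seconds between attempts (default 300).
wait_for_optimal() {
    array_name="$1"
    attempts="${2:-60}"
    interval="${3:-300}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "${SMCLI:-SMcli}" -n "$array_name" -c 'show storagearray healthstatus;' | is_optimal; then
            echo "Array $array_name reports optimal."
            return 0
        fi
        # Print the operations still running so progress is visible.
        "${SMCLI:-SMcli}" -n "$array_name" -c 'show storageArray longRunningOperations;'
        i=$((i + 1))
        sleep "$interval"
    done
    echo "Array $array_name did not report optimal after $attempts checks." >&2
    return 1
}
```

For example, wait_for_optimal your_array_name_here 48 600 would check every 10 minutes for up to 8 hours and return a non-zero status if the array never reports optimal.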

Additional Information

The failure can be found in the NetApp supportdata file contained in the CSR.

Or execute the following steps:

  1. Run: netapp_csr.py /home/csr/netapp_support_bundle
  2. Copy the netapp_support_bundle produced to your desktop.
  3. Unzip the netapp_support_bundle.7z file.
  4. Open the recovery_guru_procedures.html file in your browser.
  5. Review each header for issues. If there are none, the report states that as well.
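
On a Linux or macOS workstation, the unzip and open steps above could be scripted roughly as follows. This is a sketch under assumptions: it requires the 7z command-line tool to be installed, the function names extract_bundle and find_recovery_guru are hypothetical, and the report is located with find because the directory layout inside the bundle is not described here.

```shell
#!/bin/sh
# Sketch only, for a workstation with the 7z command-line tool installed.

# Extract the bundle archive into a target directory.
extract_bundle() {
    archive="$1"
    target="$2"
    mkdir -p "$target" && 7z x "$archive" "-o$target"
}

# Print the path of the Recovery Guru report inside an extracted bundle.
find_recovery_guru() {
    find "$1" -type f -name 'recovery_guru_procedures.html' -print
}
```

For example, run extract_bundle netapp_support_bundle.7z ./bundle, then open the path printed by find_recovery_guru ./bundle in a browser.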