Volumes will not remain on preferred controller even after replacement

Article ID: 274207

Updated On:

Products

Security Analytics

Issue/Introduction

The Security Analytics system UI repeatedly shows an error for failed disks. This happens because the volumes do not stay on their preferred path. Running "SMcli -n array2 -c 'reset storageArray volumeDistribution;'" resolves the error temporarily, but within a few hours the volumes fail over again and are no longer on the preferred path.

The lithium batteries may have reached a state where they need a deep discharge and recharge cycle. If the batteries are unusable, the cache cannot be maintained in the event of a power failure. For performance reasons, the controller with the failed battery will not support I/O, and the volumes will be served through the alternate controller.

 

Environment

NetApp E5660 storage arrays

Resolution

NetApp recommends a procedure to refresh the batteries in both controllers of the storage array. The procedure should not affect data I/O and can be performed on a live system.

This is all done in SANtricity, so there is no need to log in as root. Step one is to redistribute the volumes, and then run the battery relearn commands. Collect logs before and after the procedure to verify the state of the battery.

Procedure

Start the scheduled battery learn cycle via the SANtricity CLI (command line interface):

  • Open SANtricity
  • In the Enterprise Management window, right-click the array
  • Select Execute Script
  • Paste and run this command:

set storageArray learnCycleDate daysToNextLearnCycle=0 time=HH:MM;

SMcli -n array0 -c 'set storageArray learnCycleDate daysToNextLearnCycle=0 time=HH:MM;'

(where HH:MM is the current time plus 2 minutes, in 24-hour format)

  • Select Tools at the top of the script editor window
  • Select Verify and Execute
  • Wait 2 minutes, then verify that the learn cycle has started by checking the Major Event Log. It displays the following message: Learn Cycle for battery started
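The HH:MM value can be computed rather than typed by hand. The following is a minimal sketch, assuming GNU date is available on the management host; the helper names hhmm_plus_two and learn_cycle_cmd are hypothetical, and "array0" is the example array name used in this article. It only prints the SMcli command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: prints the SMcli command instead of running it.
# Assumes GNU date (for the -d option). "array0" is the example array name.

# hhmm_plus_two [EPOCH]: print the time two minutes after EPOCH
# (default: now) as HH:MM in 24-hour format.
hhmm_plus_two() {
    base=${1:-$(date +%s)}
    date -d "@$((base + 120))" +%H:%M
}

# learn_cycle_cmd ARRAY HH:MM: print the SMcli invocation from this article
# with the array name and start time filled in.
learn_cycle_cmd() {
    printf "SMcli -n %s -c 'set storageArray learnCycleDate daysToNextLearnCycle=0 time=%s;'\n" "$1" "$2"
}

learn_cycle_cmd array0 "$(hhmm_plus_two)"
```

The time is taken from the host that runs the script, so this assumes its clock agrees with the storage array's clock.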

Once this is done, the controllers need to be rebooted:

  1. Steps
    1. Select Hardware.

    2. If the graphic shows the drives, click Show back of shelf.

      The graphic changes to show the controllers instead of the drives.

    3. Click the controller that you want to reset.  The controller’s context menu appears.

    4. Select Reset, and confirm that you want to perform the operation.

    5. SMcli -n array0 -c 'reset controller [a];'
    6. Redistribute the volumes right away, working through the controller that was not reset:
      1. In the Enterprise Management window, right-click the array
      2. Select Execute Script
      3. Type or copy and paste this command: reset storageArray volumeDistribution;

-> SMcli -n array0 -c 'reset storageArray volumeDistribution;'

Once completed, wait around 10 minutes, then reboot controller "B" by repeating steps 1-5 above, changing [a] to [b] in the command in step 5.
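The order of the command-line steps above can be summarized in a short dry-run sketch. It assumes the SMcli form used in this article; the helper names smcli_line and reboot_sequence are hypothetical. The script prints the commands in sequence and does not execute anything against the array.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands in the order described above.
# It does not run them. "array0" is the example array name; use your own.

# smcli_line ARRAY SCRIPT: print one SMcli invocation in the article's form.
smcli_line() {
    printf "SMcli -n %s -c '%s'\n" "$1" "$2"
}

# reboot_sequence ARRAY: print the controller reset sequence.
reboot_sequence() {
    smcli_line "$1" 'reset controller [a];'
    smcli_line "$1" 'reset storageArray volumeDistribution;'
    echo 'sleep 600  # wait around 10 minutes'
    smcli_line "$1" 'reset controller [b];'
}

reboot_sequence array0
```

Printing the sequence first makes it possible to check the array name and the order before each command is pasted and run by hand.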

Volumes should remain on the preferred controller. Collect new logs and attach them to the case to allow support to investigate and verify.

 

Additional Information

Instructions from NetApp