ESXi Host in a vSAN cluster takes a long time entering maintenance mode


Article ID: 391522


Updated On:

Products

VMware vSAN

Issue/Introduction

This article explains why an ESXi host in a vSAN cluster can take a long time to enter Maintenance Mode (default mode: Ensure Accessibility).

  • An ESXi host placed into Maintenance Mode may show "in progress" and take a long time to finish the operation.
  • vSAN Health Service may show that all objects are healthy and report no other issues in the vSAN cluster.

Environment

VMware ESXi (All Versions)

Cause

  • Some of the VMs are configured with a VM Storage Policy that is set to "No data redundancy".

  • When objects with an FTT=0 policy reside on the host, the time the host takes to enter Maintenance Mode in a vSAN cluster can be significantly longer, depending on the size of those objects. Larger objects need more time to migrate to other hosts in the cluster before the Maintenance Mode operation can complete.

  • To identify the VM(s) configured with "No data redundancy":

    • The VM Storage Policy of a VM can be found in: Hosts and Clusters view > select the VM > Configure > Policies > VM Storage Policy

    • The attributes of the respective VM Storage Policy can be found in: vSphere Client > Monitoring menu > VM Storage Policies > select the policy > select Rules in the bottom-right pane

 

 

      • If a VM is configured with a Storage Policy that has the rule "No data redundancy" (also referred to as FTT=0, i.e. Failures to tolerate = 0), the VM is running on a single copy of its data.

      • The vSAN objects of the VM(s) configured with the above policy will have a single component created on the vSAN datastore, shown as RAID 0.

      • To check the VM(s) with FTT=0: Hosts and Clusters view > select the VM > Monitor > Physical disk placement



      • By default, Maintenance Mode is set to "Ensure Accessibility", which migrates whatever data is needed to keep every vSAN object accessible. An FTT=0 object has its only copy on the host, so that copy must be fully migrated to other hosts within the cluster before the Maintenance Mode operation is marked as completed.
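
To get a rough sense of how much data a host has to move before it can enter Maintenance Mode, the reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `VmRecord` fields, the sample inventory, and the throughput figure are all hypothetical and are not produced by any vSAN tool. The `data_host` value stands for the host shown under Physical disk placement for the FTT=0 object, and the real migration rate depends on the cluster and its load.

```python
from dataclasses import dataclass


@dataclass
class VmRecord:
    """One manually collected inventory row (hypothetical structure)."""
    name: str
    data_host: str   # host holding the object's data, per Physical disk placement
    ftt: int         # failures to tolerate from the VM Storage Policy
    used_gb: float   # space used by the VM's objects


def ftt0_vms_on_host(inventory, host):
    """Return the FTT=0 records whose only copy of data sits on `host`."""
    return [vm for vm in inventory if vm.data_host == host and vm.ftt == 0]


def estimate_evacuation_minutes(records, throughput_mb_per_s):
    """Rough copy time at an assumed steady throughput (not a vSAN figure)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = sum(vm.used_gb for vm in records) * 1024
    return total_mb / throughput_mb_per_s / 60


# Hypothetical sample data for illustration only.
inventory = [
    VmRecord("app01", "esx-a", 1, 200.0),
    VmRecord("scratch01", "esx-a", 0, 600.0),
    VmRecord("scratch02", "esx-b", 0, 50.0),
]

at_risk = ftt0_vms_on_host(inventory, "esx-a")
minutes = estimate_evacuation_minutes(at_risk, throughput_mb_per_s=256)
print([vm.name for vm in at_risk], round(minutes))
```

With this sample data only `scratch01` must be fully migrated off `esx-a`, and at the assumed 256 MB/s the copy would take about 40 minutes. The estimate scales linearly with object size, which is why large FTT=0 objects dominate the wait.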

Resolution

  • The VM Storage Policy with "No data redundancy" (FTT=0) should be changed to "Failures to tolerate: 1 failure" (FTT=1) or "2 failures" (FTT=2), depending on the end user's choice and the number of fault domains in the cluster.

    • The policy change should not be applied to all the VMs in the cluster at once, as this will trigger a cluster-wide resync. Apply it in batches and monitor the resync closely.

  • Alternatively, the VM can be migrated off the vSAN datastore, or deleted if it is no longer needed.
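
The batching advice above can be sketched as a small Python loop. This is a minimal sketch, not a vSAN or vCenter API: `apply_policy` and `wait_for_resync` are hypothetical callbacks that stand in for whatever method is used to reassign the storage policy and to confirm that the resync has finished.

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def apply_in_batches(
    vm_names: Sequence[str],
    size: int,
    apply_policy: Callable[[str], None],   # hypothetical: reassigns the policy
    wait_for_resync: Callable[[], None],   # hypothetical: blocks until resync ends
) -> List[List[str]]:
    """Change the policy one batch at a time, waiting for resync in between."""
    done = []
    for group in batches(vm_names, size):
        for name in group:
            apply_policy(name)
        wait_for_resync()  # do not start the next batch while data is resyncing
        done.append(group)
    return done


# Dry run with stand-in callbacks that only record what would happen.
log = []
result = apply_in_batches(
    ["vm1", "vm2", "vm3", "vm4", "vm5"],
    size=2,
    apply_policy=lambda name: log.append(f"apply {name}"),
    wait_for_resync=lambda: log.append("wait"),
)
print(result)
```

Keeping the batch size small limits how much data is resynchronizing at any one time; the right size depends on the cluster's capacity and workload.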

Note: A detailed overview of VM Storage Policies is provided in the article vSAN Policies (refer to the section Storage Policy - Availability).