vSAN Cluster Restart fails at "Re-enable HA on this cluster" step with message "Wait for HA enable failed"

Article ID: 430168


Products

VMware vSAN

Issue/Introduction

During the vSAN Cluster Restart workflow, the process may stall at the "Re-enable HA on this cluster" step and report "Wait for HA enable failed". In this state, the vSAN cluster cannot re-form, and vSAN objects remain inaccessible even though all hosts are powered on and vCenter Server is available.

Symptoms:

  • The vSAN restart workflow hangs or fails during the HA enablement step.

  • vSAN cluster membership is incomplete.

  • vSphere HA reports hosts as Isolated or in an Election state.

  • fdm.log contains entries such as "Host isolated is true":

    • 2026-02-19T08:05:10.076Z In(166) Fdm[######]: [Originator@6876 sub=Policy opID=clusterManager.cpp:983-#####] Host isolated is true
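To confirm the last symptom on a host, the FDM log can be searched for the isolation message. The following is a minimal sketch, assuming the log is at /var/log/fdm.log on the ESXi host; the function name count_isolation_events is hypothetical and not part of any VMware tooling.

```shell
#!/bin/sh
# Hypothetical helper: count "Host isolated is true" entries in an FDM log.
# Assumption: the log path defaults to /var/log/fdm.log on the ESXi host;
# pass a different path as the first argument if the log lives elsewhere.
count_isolation_events() {
    log_file="${1:-/var/log/fdm.log}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Host isolated is true' "$log_file"
}
```

Calling count_isolation_events with no argument prints the number of matching entries in the default log. A non-zero count on several hosts is consistent with the partition described in this article, but it does not by itself identify the cause.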

Environment

VMware vSAN 8.x

Cause

The primary cause is a vSAN network partition that occurs during the cluster power-on sequence. Because vSphere HA (when used with vSAN) relies on the vSAN VMkernel network for heartbeating, any communication failure on that network prevents the HA agents from reaching consensus.

This is compounded by the vSAN Shutdown Wizard, which sets two safety flags (DOMPauseAllCCPs and IgnoreClusterMemberListUpdates) to 1 to protect data during the shutdown. If a network partition prevents the automated restart workflow from completing, these flags remain set. While they are set, vSAN does not report a "Ready" state to the HA agent, and HA requires that state before it can exit the startup phase.
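Whether the flags are still set can be checked on each host before changing anything. The sketch below assumes that esxcfg-advcfg -g reports a setting as a line of the form "Value of <option> is <number>"; the helper names advcfg_value and show_vsan_restart_flags are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read esxcfg-advcfg -g output on stdin and print the
# numeric value. Assumes a line of the form "Value of <OptionName> is <N>".
advcfg_value() {
    sed -n 's/^Value of .* is \([0-9][0-9]*\)$/\1/p'
}

# Hypothetical helper: print both flags with their current values.
# It only calls esxcfg-advcfg when invoked, so run it on an ESXi host.
show_vsan_restart_flags() {
    for flag in DOMPauseAllCCPs IgnoreClusterMemberListUpdates; do
        value=$(esxcfg-advcfg -g "/VSAN/$flag" | advcfg_value)
        echo "$flag=$value"
    done
}
```

A value of 1 for either flag after a failed restart would match the condition described above; a value of 0 suggests the flag has already been cleared on that host.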

Resolution

  • Resolve the vSAN Network Partition

    • Identify and resolve the underlying cause of the network partition. This may involve troubleshooting physical switches, virtual switches, or VMkernel interface configurations to ensure all hosts can communicate over the vSAN network.

  • Verify vSAN Cluster Membership

    • Log in to each ESXi host via SSH and check the current cluster state using the following command:

      esxcli vsan cluster get
 
    • Confirm that the Local Node State is one of the expected values (MASTER, BACKUP, or AGENT).

    • Verify that the Sub-Cluster Member Count matches the total number of hosts in the cluster.

  • Clear vSAN advanced flags

    • If the network is confirmed functional but the workflow remains stuck or objects are still inaccessible, manually reset the safety flags to their default values on all ESXi hosts in the cluster (the -d option resets an advanced setting to its default):

      esxcfg-advcfg -d /VSAN/DOMPauseAllCCPs
      esxcfg-advcfg -d /VSAN/IgnoreClusterMemberListUpdates

  • Restart vSphere HA

    • Once the network partition is resolved and the flags are cleared, the HA election should proceed. If it remains stuck:

    1. In the vSphere Client, navigate to Cluster > Configure > vSphere Availability.

    2. Turn vSphere HA OFF.

    3. Wait for the "Unconfigure HA" tasks to complete on all hosts.

    4. Turn vSphere HA back ON.
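The membership check from the "Verify vSAN Cluster Membership" step can be repeated on each host once the steps above are complete. The following is a minimal sketch that parses the output of esxcli vsan cluster get, assuming it contains a "Sub-Cluster Member Count" line; the helper names vsan_member_count and check_vsan_membership are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `esxcli vsan cluster get` output on stdin and
# print the value of the "Sub-Cluster Member Count" field.
vsan_member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ { print $2; exit }'
}

# Hypothetical helper: compare the reported member count (read from stdin)
# with the expected number of hosts, given as the first argument.
# Returns 0 when they match and 1 otherwise.
check_vsan_membership() {
    expected="$1"
    actual=$(vsan_member_count)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual of $expected hosts are cluster members"
    else
        echo "MISMATCH: ${actual:-unknown} of $expected hosts are cluster members"
        return 1
    fi
}
```

On a four-host cluster, for example, this could be run as: esxcli vsan cluster get | check_vsan_membership 4. A mismatch on any host suggests the partition is still present and the network should be rechecked before HA is reconfigured again.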

Additional Information