vSAN objects remain inaccessible or cluster partitions occur after repurposing nodes from a shut down cluster

Article ID: 430973

Products

VMware vSAN

Issue/Introduction

When ESXi hosts are moved or repurposed from a vSAN cluster that was previously shut down (via the vSAN Cluster Shutdown Workflow), the new cluster may experience the following symptoms:

  • Virtual machines appear as (inaccessible) in the vCenter inventory.

  • The vSAN health service reports Network Partition or Sub-cluster member count mismatch.

  • Hosts appear isolated in their own partitions even though physical network connectivity is valid (VMkernel pings succeed).

  • Objects show Data unavailability in the Virtual Objects view.

Environment

  • Products: VMware Cloud Foundation (VCF), VMware vSAN.

  • Versions: 7.x, 8.x.

  • Scenario: Repurposing ESXi hosts without a full re-installation of the ESXi OS.

Cause

The vSAN Cluster Shutdown Workflow and manual shutdown procedures utilize advanced ESXi configuration parameters to quiesce the Distributed Object Manager (DOM) and stabilize the unicast membership list. If a host is removed from the cluster or repurposed while these settings are active, they persist in the host's configuration.

  • /VSAN/DomPauseAllCcps: If set to 1, it prevents the host from synchronizing metadata with the Cluster Control Plane.

  • /VSAN/IgnoreClusterMemberListUpdates: If set to 1, it prevents the host from accepting new unicast agent updates, so the host keeps relying on stale membership data from its previous cluster.
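
One way to spot the stale membership data described above is to compare the addresses in the host's unicast agent list with the vSAN VMkernel addresses expected in the new cluster. The following is a minimal sketch, not a supported tool: the function name `stale_unicast_ips` and the example addresses are hypothetical, it only handles IPv4, and it assumes the `grep` on the host supports `-o` and `-E`.

```shell
#!/bin/sh
# Sketch: print unicast agent IPv4 addresses that are NOT in an expected peer list.
# Input : output of `esxcli vsan cluster unicastagent list` on stdin.
# $1    : space-separated list of expected peer IPs (hypothetical values).
stale_unicast_ips() {
    expected=" $1 "
    # Pull every dotted-quad out of the listing, regardless of column layout.
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u | while read -r ip; do
        case "$expected" in
            *" $ip "*) ;;          # known peer in the new cluster
            *) echo "$ip" ;;       # not expected: likely left over from the old cluster
        esac
    done
}
```

On a host, it could be used as `esxcli vsan cluster unicastagent list | stale_unicast_ips "192.0.2.11 192.0.2.12"`, with the quoted list replaced by the real peer addresses. Any address it prints is a candidate stale entry to review, not proof of one.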

Resolution

Before joining a repurposed node to a new vSAN cluster, run the following sanity check to confirm that the shutdown-specific overrides are cleared.

  1. Establish SSH Connectivity: Log into the repurposed ESXi host(s) as root.

  2. Audit Advanced Settings: Check the status of the DOM Pause and Membership Update flags:

    Bash
     
    esxcfg-advcfg -g /VSAN/DomPauseAllCcps
    esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates
    
  3. Revert to Default Values: If either command returns a value of 1, reset them to the default operational value of 0:

    Bash
     
    esxcfg-advcfg -s 0 /VSAN/DomPauseAllCcps
    esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
    
  4. Verify vSAN Unicast Table: List the unicast agent entries and check for stale entries remaining from the old cluster, which can prevent vCenter from populating the new list:

    Bash
     
    esxcli vsan cluster unicastagent list
    

    Note: If stale IPs are present, the host may need to be removed and re-added to the new cluster after resetting the flags.

  5. Restart Management Agents (Optional): If the host does not immediately join the new cluster's partition, restart the management agents to force a sync:

    Bash
     
    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart
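
Steps 2 and 3 can be combined into a small script when several hosts need the same check. The following is a minimal sketch under stated assumptions: the function names and the `ADVCFG` override variable are hypothetical, the script assumes `esxcfg-advcfg -g` prints a line ending in the current value (for example `Value of DomPauseAllCcps is 1`), and the second option path is spelled `IgnoreClusterMemberListUpdates`, which may need adjusting if the host reports the option under a different name.

```shell
#!/bin/sh
# Sketch: audit the two shutdown flags and reset any that still read 1.
# ADVCFG is a hypothetical override so the logic can be dry-run off-host.
ADVCFG="${ADVCFG:-esxcfg-advcfg}"

# Return the last whitespace-separated field of the line passed in $1,
# e.g. "Value of DomPauseAllCcps is 1" yields "1".
flag_value() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# Read one advanced option; set it back to 0 only if it currently reads 1.
clear_flag() {
    opt="$1"
    out="$($ADVCFG -g "$opt")" || { echo "ERROR reading $opt"; return 1; }
    val="$(flag_value "$out")"
    if [ "$val" = "1" ]; then
        $ADVCFG -s 0 "$opt" >/dev/null && echo "RESET $opt"
    else
        echo "OK $opt ($val)"
    fi
}

# Check both options named in this article.
clear_shutdown_flags() {
    clear_flag /VSAN/DomPauseAllCcps
    clear_flag /VSAN/IgnoreClusterMemberListUpdates
}
```

After loading the functions in an SSH session on the host, calling `clear_shutdown_flags` prints one line per option, either `OK` with the current value or `RESET` if the option was changed. The script only touches an option that reads 1, so running it again on a clean host makes no changes.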

Additional Information

For more information on the specific behavior of these flags during a cluster restart, see KB 393357.