Restoring a vCenter (VC) appliance from a backup without impacting the vSAN cluster

Article ID: 392235

Updated On:

Products

VMware vSAN

Issue/Introduction

When a restored vCenter (VC) holds a different record of the ESXi host UUIDs than the cluster currently uses, it can cause problems with the unicast table that vSAN nodes use to communicate with each other.

Environment

VMware vSAN (All Versions)

Cause

Between the time a vCenter (VC) appliance was backed up and the time it is restored, hosts might have been removed from and re-added to the vSAN cluster. After a host failure, an ESXi host is sometimes reimaged to bring it back online. In these situations, a VC restored from backup may hold host UUIDs that differ from those currently in the vSAN unicast table. If vCenter pushes this inaccurate unicast table to the cluster, vSAN communication can be interrupted.

Resolution

Setting IgnoreClusterMemberListUpdates to 1 on each ESXi host in the cluster keeps the VC from updating the unicast table on that host. This way, restoring the VC makes no changes to the unicast table that could disrupt vSAN communication. Once the VC restore has completed and vCenter and ESXi are back in sync, set IgnoreClusterMemberListUpdates back to 0 one host at a time. After each host, check object health and cluster connectivity in vSAN Skyline Health or from the CLI.

Set the IgnoreClusterMemberListUpdates value on each ESXi host ("authoritative" here means that the VC is allowed to update the unicast table):
    esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates  # sets VC to non-authoritative
    esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates  # sets VC to authoritative
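After changing the value, it can help to read it back before moving on. The following is a minimal sketch, not part of the original procedure: the helper names parse_advcfg_value and advcfg_value_is are hypothetical, OPTION_PATH is a placeholder for the same option path passed to -s above, and the sketch assumes that esxcfg-advcfg -g prints a line of the form "Value of <name> is <n>".

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, and the
# "Value of <name> is <n>" output layout is an assumption.

# Read `esxcfg-advcfg -g <option path>` output on stdin and print the integer value.
parse_advcfg_value() {
    sed -n 's/^Value of .* is \([0-9][0-9]*\)$/\1/p'
}

# Succeed only when the value read from stdin equals the expected value ($1).
advcfg_value_is() {
    [ "$(parse_advcfg_value)" = "$1" ]
}

# Example on an ESXi host (OPTION_PATH is the same path passed to -s above):
#   esxcfg-advcfg -g "$OPTION_PATH" | advcfg_value_is 1 && echo "VC is non-authoritative"
```

If the helper prints nothing, the output did not match the assumed layout, and the value should be confirmed by reading the raw command output instead.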

Before proceeding to the next host, check the vSAN cluster health in the VC vSAN Skyline Health or from the CLI as shown below:
    esxcli vsan debug object health summary get  # determine the overall object health
    esxcli vsan cluster get  # verify that the sub-cluster member count matches the expected number of nodes
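The two checks above can be scripted so that a mismatch is harder to overlook. This is a hedged sketch rather than a supported tool: the helper names are hypothetical, and the parsing assumes that esxcli vsan cluster get prints a "Sub-Cluster Member Count: <n>" line and that the object health summary prints one "<state> <count>" row per health state.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and the output layouts are assumed.

# Print the member count found in `esxcli vsan cluster get` output (stdin).
member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ { print $2 + 0; exit }'
}

# Succeed only when the reported count equals the expected node count ($1).
members_match() {
    expected="$1"
    actual="$(member_count)"
    [ -n "$actual" ] && [ "$actual" -eq "$expected" ]
}

# Print every state other than "healthy" that has a non-zero object count,
# reading `esxcli vsan debug object health summary get` output on stdin.
unhealthy_states() {
    awk 'NF == 2 && $2 ~ /^[0-9]+$/ && $2 > 0 && $1 != "healthy" { print $1 ": " $2 }'
}

# Example on an ESXi host in a cluster that should have 4 nodes:
#   esxcli vsan cluster get | members_match 4 || echo "member count mismatch"
#   esxcli vsan debug object health summary get | unhealthy_states
```

An empty result from unhealthy_states together with a matching member count suggests it is reasonable to continue to the next host. Any other result is a cue to stop and investigate first.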