Network partition may occur when adding a vSAN 6.5 host to a vSAN 6.6 cluster with on-disk format v3 disks

Article ID: 326613

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:
A vSAN cluster may become partitioned in the following situation:
  • vCenter Server resides on the vSAN datastore.
  • The vSAN cluster is configured as vSAN 6.6 with on-disk format v3 disks.
  • Unicast mode is enabled on the vSAN cluster.
  • An ESXi host with vSAN 6.5 installed (which doesn't support unicast mode) is added to the vSAN cluster.


Environment

VMware vSAN 6.6.x

Cause

This is expected behavior (a limitation). When hosts that do not support unicast are added to a vSAN cluster, unicast mode may change to multicast mode on some or all nodes in the cluster.
Under certain circumstances, the issue occurs as follows:
  1. ESXi hosts are added to the vSAN cluster.
  2. vCenter Server sequentially pushes the changed mode to each host.
  3. An ESXi host is temporarily partitioned because the cluster is in mixed mode (unicast + multicast).
  4. vCenter Server may get into an unknown state, or its related vSAN objects may become inaccessible, due to the cluster partition.
  5. vCenter Server can hang or become inaccessible as a result, so it cannot push the multicast state to the remaining hosts.
  6. The vSAN cluster remains partitioned in mixed mode (unicast + multicast).
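
To see which state an individual host ended up in after the sequence above, the output of esxcli vsan cluster get can be inspected on each host. The following is a minimal sketch, not a supported tool. It assumes the output contains "Sub-Cluster Member Count:" and "Unicast Mode Enabled:" lines (field names may differ between releases). The function name and its argument, the expected number of hosts, are hypothetical.

```shell
# Sketch: summarize one host's view of the cluster from the text output of
# `esxcli vsan cluster get`, read on standard input.
# Assumption: the output has "Sub-Cluster Member Count:" and
# "Unicast Mode Enabled:" lines; verify the field names on your release.
# $1 (hypothetical parameter): number of hosts expected in the cluster.
vsan_partition_check() {
    expected_hosts="$1"
    awk -v expected="$expected_hosts" '
        /Sub-Cluster Member Count:/ { members = $NF }
        /Unicast Mode Enabled:/     { unicast = $NF }
        END {
            if (members == "") { print "member count not found"; exit 2 }
            printf "members=%s unicast=%s\n", members, (unicast == "" ? "unknown" : unicast)
            if (members + 0 < expected + 0) {
                print "possible partition: host sees fewer members than expected"
                exit 1
            }
        }
    '
}
```

On a host it could be used as: esxcli vsan cluster get | vsan_partition_check 5. A return code of 1 means the host sees fewer members than expected. Hosts that report different values for unicast mode are consistent with the mixed-mode state described in steps 3 and 6.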

Resolution

You can avoid this issue by planning the following actions:
  • Migrate the vCenter Server that manages the cluster to a non-vSAN datastore so that a cluster partition does not cause it to become unavailable.
  • Update ESXi hosts to vSAN 6.6 or later (which supports unicast) before adding them to the vSAN cluster.


Workaround:

If this issue occurs, you can use the esxcli vsan cluster unicastagent remove command to remove the unicast entries that reference the node that was added to the cluster.

esxcli vsan cluster unicastagent remove -a <IP> -u <LOCAL NODE UUID>

You can check the unicast agent list with the esxcli vsan cluster unicastagent list command.

Example
This example is a 4-node vSAN cluster to which another host was added. The added host's node UUID is 5aa12a01-eeee-eeee-eeee-00109b37ee4a.

[root@esx-01:~]esxcli vsan cluster unicastagent list

NodeUuid                              IsWitness  Supports Unicast  IP Address    Port   Iface Name
--------------------------------------------------------------------------------------------------
596f1214-bbbb-bbbb-bbbb-00109b165b8e          0  true              192.168.XXX.2  12321
596f115a-cccc-cccc-cccc-00109b16566e          0  true              192.168.XXX.3  12321
596f1189-dddd-dddd-dddd-00109b165b6e          0  true              192.168.XXX.4  12321
5aa12a01-eeee-eeee-eeee-00109b37ee4a          0  false             192.168.XXX.5  12321   #### added host

[root@esx-02:~]esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address    Port   Iface Name
--------------------------------------------------------------------------------------------------
596f1189-dddd-dddd-dddd-00109b165b6e          0  true              192.168.XXX.4  12321
596f115a-cccc-cccc-cccc-00109b16566e          0  true              192.168.XXX.3  12321
591b8c87-aaaa-aaaa-aaaa-00109b165b2e          0  true              192.168.XXX.1  12321

[root@esx-03:~]esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address    Port   Iface Name
--------------------------------------------------------------------------------------------------
591b8c87-aaaa-aaaa-aaaa-00109b165b2e          0  true              192.168.XXX.1  12321
596f1214-bbbb-bbbb-bbbb-00109b165b8e          0  true              192.168.XXX.2  12321
596f1189-dddd-dddd-dddd-00109b165b6e          0  true              192.168.XXX.4  12321

[root@esx-04:~]esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address    Port   Iface Name
--------------------------------------------------------------------------------------------------
596f115a-cccc-cccc-cccc-00109b16566e          0  true              192.168.XXX.3  12321
596f1214-bbbb-bbbb-bbbb-00109b165b8e          0  true              192.168.XXX.2  12321
591b8c87-aaaa-aaaa-aaaa-00109b165b2e          0  true              192.168.XXX.1  12321


Execute the following command on each host whose unicast agent list contains the added host (esx-01 in this example):

esxcli vsan cluster unicastagent remove -a 192.168.XXX.5 -u 5aa12a01-eeee-eeee-eeee-00109b37ee4a
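
Finding the stale entry by eye is easy in a small cluster, but the lookup can also be scripted. The following is a minimal sketch under stated assumptions, not a supported tool. It assumes the list output has the column order shown in the example (NodeUuid, IsWitness, Supports Unicast, IP Address, Port). The function name is hypothetical. It only prints candidate commands and does not run them.

```shell
# Sketch: read `esxcli vsan cluster unicastagent list` output on stdin and
# print (not execute) a remove command for every entry whose
# "Supports Unicast" column is false.
# Assumption: column order is NodeUuid, IsWitness, Supports Unicast,
# IP Address, Port, as in the example output in this article.
print_unicastagent_removals() {
    awk '
        # Skip the header and separator; match only rows that start with
        # a UUID-like token and report Supports Unicast = false.
        $3 == "false" && $1 ~ /^[0-9a-f]+-/ {
            printf "esxcli vsan cluster unicastagent remove -a %s -u %s\n", $4, $1
        }
    '
}
```

On a host it could be used as: esxcli vsan cluster unicastagent list | print_unicastagent_removals. Review the printed command against the list output before running it.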


Additional Information

Impact/Risks:
If this issue occurs, some vSAN objects may become inaccessible due to loss of quorum or loss of availability of data replicas.