VM Provisioning Fails on Hosts Newly Added to a vSAN Stretched Cluster Due to Witness Traffic Separation (WTS) Configuration Mismatch

Article ID: 436949

Products

VMware vSAN

Issue/Introduction

  • In a vSAN stretched cluster, virtual machine provisioning, migration, and cloning operations fail exclusively on newly added ESXi host(s).
  • Existing hosts in the cluster operate without issue.
  • The vSAN Health Service check vSAN: Basic (Unicast) Connectivity Check raises an alert showing ping failures between the witness host and the newly added hosts.
  • The unicast table entries on the witness appliance show inconsistent traffic types between the witness appliance and the data nodes.
    esxcli vsan cluster unicastagent list
    NodeUuid                              IsWitness  Supports Unicast  IP Address      Port  Iface Name  SubClusterUuid                        Traffic Type
    ------------------------------------  ---------  ----------------  -------------  -----  ----------  ------------------------------------  ------------
    68efe746-61b6-5a49-44fa-############          0              true  192.###.###.2  12321              527c0650-2456-9a39-7843-############  vsan
    68efe6ef-4a9c-0831-5a36-############          0              true  192.###.###.3  12321              527c0650-2456-9a39-7843-############  vsan
    68fcd639-bbaf-d57b-8e32-############          0              true  192.###.###.2  12321              527c0650-2456-9a39-7843-############  vsan
    68fcd7d4-c89f-3f83-006e-############          0              true  192.###.###.3  12321              527c0650-2456-9a39-7843-############  vsan
    68fcd5de-a8c4-a59b-8048-############          0              true  192.###.###.1  12321              527c0650-2456-9a39-7843-############  vsan
    68efdc73-095b-95d5-ae19-############          0              true  192.###.###.1  12321              527c0650-2456-9a39-7843-############  vsan
    69da7388-1a0f-ad23-7ac7-############          0              true  10.###.##.69   12321              527c0650-2456-9a39-7843-############  witness
    69cd2a37-70dd-fe59-4022-############          0              true  10.###.##.39   12321              527c0650-2456-9a39-7843-############  witness
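
A quick way to spot the mismatch in output like the listing above is to tally the entries by the last column (Traffic Type). The following is a minimal sketch, not an official tool: the function name tally_traffic_types is hypothetical, and it assumes each data row starts with a node UUID and ends with the traffic type, as in the listing above.

```shell
# Hypothetical helper: count unicast entries per Traffic Type.
# Reads `esxcli vsan cluster unicastagent list` output on stdin.
tally_traffic_types() {
    awk '
        # Data rows start with a node UUID; this skips the header and the dashed separator.
        $1 ~ /^[0-9a-fA-F]+-/ { count[$NF]++ }
        END { for (t in count) printf "%s %d\n", t, count[t] }
    ' | sort
}

# Example, run on the witness appliance:
# esxcli vsan cluster unicastagent list | tally_traffic_types
```

For the listing above, this would print "vsan 6" and "witness 2". More than one traffic type in the result is the inconsistency described in this article.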

Environment

VMware vSAN (All Versions)

Cause

This issue is caused by an inconsistent Witness Traffic Separation (WTS) configuration across the hosts in the cluster. The newly added hosts are configured with WTS enabled, whereas the original hosts in the cluster do not use traffic separation. This mismatch breaks symmetric communication between the witness and the data nodes, isolating the new hosts from the witness and preventing the object creation required for VM provisioning.
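
To see which configuration a given host has, the vSAN network settings on that host can be inspected for a VMkernel adapter tagged with the witness traffic type. The sketch below is illustrative only: the function name witness_vmks is hypothetical, and it assumes that `esxcli vsan network list` prints a "VmkNic Name:" line followed later by a "Traffic Type:" line for each interface.

```shell
# Hypothetical helper: print the VMkernel adapters tagged for witness traffic.
# Reads `esxcli vsan network list` output on stdin.
witness_vmks() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }                      # drop trailing whitespace
        /VmkNic Name:/  { vmk = $2 }                  # remember the current adapter
        /Traffic Type:/ { if ($2 == "witness") print vmk }
    '
}

# Example, run on each data node:
# esxcli vsan network list | witness_vmks
```

Under these assumptions, a host that prints an adapter name has WTS configured, and a host that prints nothing does not. Comparing the result between a newly added host and an original host shows the mismatch.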

Resolution

To resolve this issue, standardize the Witness Traffic Separation (WTS) configuration across all data nodes in the vSAN stretched cluster.

  1. Determine the intended cluster architecture for witness traffic (WTS enabled across all nodes, or WTS disabled across all nodes).

  2. To disable WTS on the newly added hosts and align them with the original hosts, remove the witness traffic tag from the configured VMkernel adapter by running the following command on the affected hosts:

    esxcli vsan network ipv4 remove -i <vmk_being_used_for_witness>

    Note: As of 9.x, this can be done through the vCenter Web Client. See the 9.x documentation: Configure Network Interface for Witness Traffic

  3. Alternatively, if WTS is the desired state, configure the original cluster hosts to enable Witness Traffic Separation to match the newly added hosts.

    8.x Configure Network Interface for Witness Traffic
    9.x Configure Network Interface for Witness Traffic

  4. Once the configuration is uniform, verify that the vSAN unicast tables have automatically updated to show consistent traffic types by running the following command:

    esxcli vsan cluster unicastagent list

  5. In vSAN Health Service, confirm that the vSAN network ping tests report a healthy, passing state.

  6. Retest virtual machine creation, migration, or cloning operations on the previously affected host(s).
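
When several hosts are affected, it may help to print the step 2 command for review before running anything. The following is a minimal sketch under stated assumptions: the function name print_wts_remove_cmds is hypothetical, it only echoes the command from step 2 and does not execute it, and it assumes VMkernel adapter names have the form vmk followed by digits.

```shell
# Hypothetical helper: print (do not run) the step 2 command
# for each VMkernel adapter name passed as an argument.
print_wts_remove_cmds() {
    for vmk in "$@"; do
        case "$vmk" in
            vmk|vmk*[!0-9]*)
                # "vmk" alone, or a non-digit after the prefix
                echo "skipping invalid adapter name: $vmk" >&2 ;;
            vmk*)
                echo "esxcli vsan network ipv4 remove -i $vmk" ;;
            *)
                echo "skipping invalid adapter name: $vmk" >&2 ;;
        esac
    done
}

# Example: review the command for the adapter used for witness traffic
# print_wts_remove_cmds vmk1
```

Printing rather than executing keeps the change deliberate: confirm on each host that the adapter named is the one carrying the witness tag before running the command.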