Configuring vSAN Unicast networking from the command line



Article ID: 326427


Updated On: 02-14-2025

Products

VMware vSAN

Issue/Introduction

This article provides detailed, step-by-step guidance for manually configuring vSAN Unicast Networking using esxcli.

Important Note:
The vSAN Unicast Mode feature was introduced in vSAN 6.6 (vSphere 6.5.0d), which has since reached End of Sales and Support (EOSS).
Unicast networking replaced the legacy multicast networking to simplify the network requirements for maintaining a vSAN cluster's quorum.

Key Scenarios:

  1. Upgrade Issues:
    During vSphere upgrades, some vSAN cluster hosts may fail to receive the cluster member list updates from vCenter.

  2. Node Discrepancies:
    In cases where vCenter is unavailable during node additions or removals, the list of vSAN nodes in vCenter may not align with the actual cluster configuration.

Impact and Risks:

  • Network Partitioning:
    If the Unicast agent list is not rebuilt correctly, the vSAN cluster can split into multiple network partitions.
    Objective: Rebuild the Unicast agent list to re-establish a unified vSAN cluster.

Critical Reminder:
When configuring the Unicast agent list for a specific ESXi host:

  • Add entries for all other hosts in the cluster.
  • Do not add the IP address of the host being configured.

Including the host's own IP in the Unicast agent list can lead to severe networking issues, including the potential for a Purple Screen of Death (PSOD).

Environment

VMware vSAN 8.x
VMware vSAN 7.x
VMware vSAN 6.x

Cause

This issue can arise due to several reasons, including:

  1. Improper Node Removal:
    Adding an ESXi node that was previously part of another vSAN cluster but was not removed correctly.

    • For example, the node may still contain residual configurations such as old disk partitions or outdated unicast entries.
  2. Recovery from vCenter Outage:
    In the past, recovery actions may have been performed due to a vCenter outage, involving modifications to vSAN cluster settings to restore functionality.

  3. Post-Recovery Misconfigurations:
    After completing the recovery process, certain changes may not have been reverted to their default settings, leading to inconsistencies within the cluster.

Resolution

Follow these steps to manually rebuild the Unicast Agent List on each host:

1. Gather Required vSAN Cluster Information

  • Identify all nodes within the vSAN cluster, including their node UUIDs and vSAN VMkernel IP addresses.
  • Verify the current cluster configuration to ensure all active and valid nodes are accounted for.

2. Rebuild the vSAN Unicast Agent List

  • Use the esxcli command to configure the Unicast agent list on each host.
  • Add entries for all other vSAN cluster nodes, ensuring that:
    • Each entry corresponds to a valid, active node.
    • The host's own IP address is not included in its Unicast agent list.

3. Finalize the Configuration

  • Validate the updated Unicast agent list for accuracy.
  • Confirm cluster health and connectivity to ensure the changes have been successfully applied.

1. Identify the VMkernel port used for vSAN, its IP address, and the node UUID on each host in the cluster.
 

1.1 SSH to every ESXi host in the cluster and log in as root.

1.2 Run the following command to identify the VMkernel port used for vSAN, and copy the output for later use: 

esxcli vsan network list

[root@server name:~] esxcli vsan network list
Interface
   VmkNic Name: vmk1
   IP Protocol: IP
   Interface UUID: ########-####-####-####-############
   Agent Group Multicast Address: 224.2.3.4
   Agent Group IPv6 Multicast Address: ff19::2:3:4
   Agent Group Multicast Port: 23451
   Master Group Multicast Address: 224.1.2.3
   Master Group IPv6 Multicast Address: ff19::1:2:3
   Master Group Multicast Port: 12345
   Host Unicast Channel Bound Port: 12321
   Data-in-Transit Encryption Key Exchange Port: 0
   Multicast TTL: 5
   Traffic Type: vsan

Note: Take note of the VmkNic Name. In the above output it is "vmk1".
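As a minimal sketch, the interface name can be extracted from that output with awk instead of being copied by hand. The function name vsan_vmk_names is hypothetical, and the sketch assumes the "VmkNic Name:" line format shown in the output above.

```shell
# Hypothetical helper: print every VmkNic name found in
# `esxcli vsan network list` output supplied on stdin.
# Assumes lines of the form "   VmkNic Name: vmk1".
vsan_vmk_names() {
    awk -F': ' '/VmkNic Name:/ { print $2 }'
}

# Example on an ESXi host (not executed here):
#   esxcli vsan network list | vsan_vmk_names
```

If a host has more than one vSAN-tagged interface, the helper prints one name per line.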

1.3 Find the IP address for "vmk1" with the following command:


esxcli network ip interface ipv4 get | grep vmk1
 

[root@server name:~] esxcli network ip interface ipv4 get | grep vmk1
vmk1  ###.##.##.##  ###.##.###.###  ###.##.###.###  STATIC        ###.##.###.###    false


Note: The IP address of vmk1 on this host is: ###.##.##.##
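The grep in step 1.3 also matches names such as vmk10 or vmk11 on hosts with many VMkernel ports. The following sketch does an exact match on the interface name instead. The function name vmk_ipv4 is hypothetical, and the sketch assumes the column order shown in the output above (interface name first, IPv4 address second).

```shell
# Hypothetical helper: print the IPv4 address of one VMkernel interface.
# stdin: output of `esxcli network ip interface ipv4 get`
# $1:    interface name, for example vmk1
# Assumes column 1 is the interface name and column 2 is the IPv4 address.
vmk_ipv4() {
    awk -v nic="$1" '$1 == nic { print $2 }'
}

# Example on an ESXi host (not executed here):
#   esxcli network ip interface ipv4 get | vmk_ipv4 vmk1
```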

 

1.4 Find the node UUID of the host with the following command:


cmmds-tool whoami

[root@server name:~] cmmds-tool whoami
########-####-####-####-############

 

1.5 Equipped with host UUID and vSAN VMkernel port IP address for ALL hosts in the cluster, start building the Unicast agent lists.
 

3-node cluster example:

server name | UUID: ########-####-####-####-############ | vSAN IP: ###.##.##.##
server name | UUID: ########-####-####-####-############ | vSAN IP: ###.##.##.##
server name | UUID: ########-####-####-####-############ | vSAN IP: ###.##.##.##
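Because every later step depends on this inventory, it can help to check it for copy-and-paste mistakes before use. The sketch below is one possible check, not part of the documented procedure. It assumes the inventory has been saved as plain text with one "UUID vSAN-IP" pair per line (a hypothetical format), and the function name inventory_duplicates is also hypothetical.

```shell
# Hypothetical helper: report any UUID or vSAN IP that appears more than once.
# stdin: one "UUID vSAN-IP" pair per line (assumed inventory format).
# Prints nothing when every UUID and every IP is unique.
inventory_duplicates() {
    awk '
        NF >= 2 { uuid[$1]++; ip[$2]++ }
        END {
            for (k in uuid) if (uuid[k] > 1) print "duplicate UUID: " k
            for (k in ip)   if (ip[k] > 1)   print "duplicate IP: " k
        }'
}
```

An empty result means each host contributed exactly one distinct UUID and one distinct vSAN IP address.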



2. Building the Unicast Agent List
 

2.1 Before making changes to the Unicast agent lists, run the following command on all nodes in the cluster to temporarily ignore "Cluster Member List Updates" coming from vCenter.

esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates

[root@server name:~] esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates
Value of IgnoreClusterMemberListUpdates is 1

2.2 A host's Unicast agent list may contain incorrect entries, such as:

    • "Supports Unicast" set to False for a particular host.
    • An incorrect IP address or an incorrect Node UUID.
    • "IsWitness" set to True for a host that is NOT a Stretched Cluster Witness host.

An existing entry cannot be modified. It must be deleted and re-added.

2.2.1 Check the current unicast agent list with the following command:

esxcli vsan cluster unicastagent list

[root@server name:~] esxcli vsan cluster unicastagent list
NodeUuid                              IsWitness  Supports Unicast  IP Address    
------------------------------------  ---------  ----------------  ------------- 
########-####-####-####-############          0             false  ###.##.###.##
########-####-####-####-############          0             false  ###.##.###.##
 

Note: In the output above, the second entry has an incorrect IP address, and both entries have the "Supports Unicast" flag set as "false".
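On larger clusters the "Supports Unicast" column is easy to misread. The following sketch lists the IP address of every entry whose flag is false. The function name non_unicast_entries is hypothetical, and the sketch assumes the column layout shown above (NodeUuid, IsWitness, Supports Unicast, IP Address) with the flag printed as "false".

```shell
# Hypothetical helper: print the IP address of each unicast agent entry
# whose "Supports Unicast" value is false.
# stdin: output of `esxcli vsan cluster unicastagent list`
# Assumes data rows carry the flag in column 3 and the IP in column 4.
non_unicast_entries() {
    awk '$3 == "false" { print $4 }'
}

# Example on an ESXi host (not executed here):
#   esxcli vsan cluster unicastagent list | non_unicast_entries
```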

Note: For environments using Data-in-Transit encryption, record each host's certificate thumbprint, because it must be supplied when the entry is re-added.
To find the certificate thumbprint, SSH into the host and run the following command: openssl x509 -in /etc/vmware/ssl/rui.crt -fingerprint -sha256 -noout

[root@esxi~] openssl x509 -in /etc/vmware/ssl/rui.crt -fingerprint -sha256 -noout
sha256 Fingerprint=##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
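The value needed later is only the part after "Fingerprint=". As a small sketch, it can be isolated with sed. The function name cert_thumbprint is hypothetical, and the sketch assumes the single-line "Fingerprint=" output format shown above.

```shell
# Hypothetical helper: strip everything up to and including "Fingerprint="
# from openssl fingerprint output, leaving only the colon-separated value.
# stdin: output of `openssl x509 ... -fingerprint -sha256 -noout`
cert_thumbprint() {
    sed -n 's/^.*Fingerprint=//p'
}

# Example on an ESXi host (not executed here):
#   openssl x509 -in /etc/vmware/ssl/rui.crt -fingerprint -sha256 -noout | cert_thumbprint
```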
 

2.2.2 To fix this problem, run the following command to delete the errant entries. You can also use this to delete all the entries and rebuild the unicast agent list from scratch.
 

Syntax: esxcli vsan cluster unicastagent remove -a <Host_VSAN_IP>

[root@server name:~] esxcli vsan cluster unicastagent remove -a ###.##.###.##

Note: Run esxcli vsan cluster unicastagent list again to verify that the entry has been removed.
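When rebuilding the list from scratch, one remove command is needed per existing entry. The sketch below prints those commands for review rather than running them. The function name unicast_remove_cmds is hypothetical, and the sketch assumes the list output starts with a header line and a separator line, with the IP address in column 4 of each data row.

```shell
# Hypothetical helper: print (do not run) one remove command per entry.
# stdin: output of `esxcli vsan cluster unicastagent list`
# Assumes two header lines, then data rows with the IP address in column 4.
unicast_remove_cmds() {
    awk 'NR > 2 && NF >= 4 {
        print "esxcli vsan cluster unicastagent remove -a " $4
    }'
}

# Example on an ESXi host (not executed here):
#   esxcli vsan cluster unicastagent list | unicast_remove_cmds
```

Printing the commands first leaves room to inspect them before anything is deleted.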

2.3 Add entries to the unicast agent list.

    • **REMINDER** When building the Unicast agent list on an ESXi host, add entries for all the other hosts, but never add the IP address of the host whose list is being configured.

    • If an ESXi host has its own IP address in its Unicast agent list, vSAN can become unstable and networking problems can arise, potentially leading to a PSOD on the host.

    • Example: Using the 3-node example, each host will have 2 entries.

Data Node entry syntax:

esxcli vsan cluster unicastagent add -t node -u <Host_UUID> -U true -a <Host_VSAN_IP> -p 12321 -T <Host Cert Thumbprint> 

Example on host 1:

[root@server name1:~] esxcli vsan cluster unicastagent add -t node -u ########-####-####-####-############ -U true -a ###.##.###.## -p 12321 -T ##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##

[root@server name1:~] esxcli vsan cluster unicastagent list
NodeUuid                            IsWitness  Supports Unicast  IP Address    Port  Iface Name  Cert Thumbprint                                                                                       SubClusterUuid
------------------------------------  ---------  ----------------  --------------  -----  ----------  -----------------------------------------------------------------------------------------------  --------------
########-####-####-####-############   0                 true   ###.##.###.##  12321                   ##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##  ########-####-####-####-############

Note: After running the commands, check the Unicast agent list to confirm entries were added correctly as shown in the output above.
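Typing one add command per peer on every host is where most mistakes happen. The following sketch generates the data node commands for a single host from a plain-text inventory and skips that host's own entry. It only prints the commands. The function name unicast_add_cmds and the inventory format (one "UUID vSAN-IP [thumbprint]" line per host) are assumptions made for illustration, and Witness hosts are not handled.

```shell
# Hypothetical helper: print (do not run) the data node add commands
# for the host being configured.
# stdin: one "UUID vSAN-IP [thumbprint]" line per cluster host (assumed format)
# $1:    vSAN IP of the host being configured; its own line is skipped
unicast_add_cmds() {
    awk -v self="$1" '
        NF >= 2 && $2 != self {
            cmd = "esxcli vsan cluster unicastagent add -t node -u " $1 \
                  " -U true -a " $2 " -p 12321"
            # Append the thumbprint only when the inventory provides one.
            if (NF >= 3) cmd = cmd " -T " $3
            print cmd
        }'
}
```

For the 3-node example, running the helper with the first host's vSAN IP prints two commands, one for each of the other two hosts.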

Note: In a Stretched Cluster, the Witness host entry must have the "IsWitness" flag set to "True". Use the Witness Node entry syntax (-t witness) for that host.

Note: For environments using Data-in-Transit encryption, the Cert Thumbprint must be added with the -T switch as part of the command string.

Witness Node entry syntax: 

esxcli vsan cluster unicastagent add -t witness -u <Host_UUID> -U true -a <Host_VSAN_IP> -p 12321

2.4 Repeat from step 2.1 on the remaining vSAN hosts, making sure not to include the IP address of the host whose list is being configured.
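Since a host's own IP address in its list is the most dangerous mistake, a quick check after each host is worthwhile. The sketch below succeeds (exit status 0) when the given IP address is present in the list output. The function name own_ip_listed is hypothetical, and the sketch assumes the IP address is in column 4 of each data row.

```shell
# Hypothetical helper: exit 0 if the given IP appears in the IP Address
# column of the unicast agent list, exit 1 otherwise.
# stdin: output of `esxcli vsan cluster unicastagent list`
# $1:    vSAN IP of the local host
own_ip_listed() {
    awk -v self="$1" '$4 == self { found = 1 } END { exit !found }'
}

# Example on an ESXi host (not executed here), with a placeholder address:
#   if esxcli vsan cluster unicastagent list | own_ip_listed 192.0.2.11; then
#       echo "WARNING: this host lists its own vSAN IP"
#   fi
```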

3. Finalize the Configuration

3.1 Assuming there are no issues with the physical network, the vSAN cluster should form right away.
If the rebuild was done correctly on each host, the output of the following command shows the correct number of cluster members in the "Sub-Cluster Member Count" field:

[root@server name:~] esxcli vsan cluster get
Cluster Information
   Enabled: true
   Current Local Time: 2024-08-16T06:22:52Z
   Local Node UUID: ########-####-####-####-############
   Local Node Type: NORMAL
   Local Node State: BACKUP
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: ########-####-####-####-############
   Sub-Cluster Backup UUID: ########-####-####-####-############
   Sub-Cluster UUID: ########-####-####-####-############
   Sub-Cluster Membership Entry Revision: 37
   Sub-Cluster Member Count: 2  ←--------------
   Sub-Cluster Member UUIDs: ########-####-####-####-############, ########-####-####-####-############
   Sub-Cluster Member HostNames: server name1, server name2
   Sub-Cluster Membership UUID: ########-####-####-####-############
   Unicast Mode Enabled: true
   Maintenance Mode State: OFF
   Config Generation: ########-####-####-####-############ 7 2024-08-16T06:07:52.230
   Mode: REGULAR
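To compare the member count across hosts without reading the full output each time, the value can be extracted as sketched below. The function name member_count is hypothetical, and the sketch assumes the "Sub-Cluster Member Count:" line format shown above.

```shell
# Hypothetical helper: print the Sub-Cluster Member Count as a number.
# stdin: output of `esxcli vsan cluster get`
member_count() {
    # "+ 0" forces a numeric result and drops anything after the digits.
    awk -F': ' '/Sub-Cluster Member Count:/ { print $2 + 0 }'
}

# Example on an ESXi host (not executed here):
#   esxcli vsan cluster get | member_count
```

Every host should report the same number, equal to the total number of nodes in the cluster.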
 

3.2 After the process has completed successfully on every host, re-enable "Cluster Member List Updates" on all nodes in the cluster by running the following command:
 

esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates

[root@server name:~] esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
Value of IgnoreClusterMemberListUpdates is 0
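A host left with the value set to 1 keeps ignoring member list updates from vCenter, so it is worth confirming the setting on each node. The sketch below reads the current value from the "Value of ... is N" line. The function name ignore_updates_value is hypothetical, and the sketch assumes that esxcfg-advcfg -g prints the same line format as the -s output shown above.

```shell
# Hypothetical helper: print the current IgnoreClusterMemberListUpdates value.
# stdin: output of `esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates`
# Assumes a line of the form "Value of IgnoreClusterMemberListUpdates is 0".
ignore_updates_value() {
    awk '/IgnoreClusterMemberListUpdates/ { print $NF }'
}

# Example on an ESXi host (not executed here):
#   esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates | ignore_updates_value
```

A result of 0 on every node indicates that updates from vCenter are being accepted again.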