1. The network-partitioned host reports a Sub-Cluster Member Count of 1.
[root@######~] esxcli vsan cluster get
Cluster Information
   Enabled: true
   Current Local Time: 2025-02-07T00:13:45Z
   Local Node UUID: ########-####-####-####-########
   Local Node Type: NORMAL
   Local Node State: MASTER
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: ########-####-####-####-########
   Sub-Cluster Backup UUID:
   Sub-Cluster UUID: ########-####-####-####-########
   Sub-Cluster Membership Entry Revision: 2
   Sub-Cluster Member Count: 1
   Sub-Cluster Member UUIDs: ########-####-####-####-########
   Sub-Cluster Member HostNames: Hostname###
   Sub-Cluster Membership UUID: ########-####-####-####-########
   Unicast Mode Enabled: true
   Maintenance Mode State: OFF
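The member-count symptom can also be checked from a script. The following is a minimal sketch, not an official tool: it reads `esxcli vsan cluster get` output on standard input and compares the reported Sub-Cluster Member Count with the number of hosts the operator expects. The function names `vsan_member_count` and `vsan_is_partitioned` are hypothetical, and the expected host count is an input you supply.

```shell
# Print the "Sub-Cluster Member Count" value from `esxcli vsan cluster get`
# output supplied on stdin. Non-digit characters (spaces, CR) are stripped.
vsan_member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Succeed (exit status 0) when the reported member count is lower than the
# expected host count.
# $1 = expected number of hosts in the cluster (supplied by the operator).
vsan_is_partitioned() {
    expected="$1"
    count="$(vsan_member_count)"
    [ -n "$count" ] && [ "$count" -lt "$expected" ]
}
```

Example use on a host in a 3-node cluster: `esxcli vsan cluster get | vsan_is_partitioned 3 && echo "host sees fewer members than expected"`.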
[root@######~] esxcli vsan debug object health summary get
Health Status                                              Number Of Objects
---------------------------------------------------------  -----------------
remoteAccessible                                           0
inaccessible                                               0
reduced-availability-with-no-rebuild                       583
reduced-availability-with-no-rebuild-delay-timer           0
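To track the number of affected objects over time, the summary can be reduced to a single figure. This is a rough sketch that assumes the two-column layout `esxcli` normally prints (state name, then count, on one line) and that state names contain no spaces. The function name `vsan_unhealthy_objects` is hypothetical.

```shell
# Sum the object counts for every state except "healthy" in
# `esxcli vsan debug object health summary get` output read from stdin.
# Header and separator rows are skipped because their last field is not a number.
vsan_unhealthy_objects() {
    awk 'NF == 2 && $2 ~ /^[0-9]+$/ && $1 != "healthy" { total += $2 }
         END { print total + 0 }'
}
```

Example use: `esxcli vsan debug object health summary get | vsan_unhealthy_objects`.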
[root@######~] esxcli vsan health cluster list
Health Test Name                                    Status
--------------------------------------------------  ------
Overall health findings                             red (Network misconfiguration)
Network                                             red
vSAN cluster partition                              red
Cluster                                             yellow
[root@######:~] vmkping -I vmk1 **.**.**.38 -s 1472
PING **.**.**.38 (1**.**.**.38): 1472 data bytes
1480 bytes from **.**.**.38: icmp_seq=0 ttl=64 time=0.507 ms
1480 bytes from **.**.**.38: icmp_seq=1 ttl=64 time=0.371 ms
1480 bytes from **.**.**.38: icmp_seq=2 ttl=64 time=0.561 ms
--- **.**.**.38 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.371/0.480/0.561 ms
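The `-s 1472` value used above corresponds to a 1500-byte MTU: the ICMP payload is the MTU minus 28 bytes (a 20-byte IPv4 header plus an 8-byte ICMP header). The sketch below computes the payload size for a given MTU; the function name `vmkping_payload_size` is hypothetical. Adding `-d` to `vmkping` sets the don't-fragment bit, so the ping fails if the path cannot carry the full frame.

```shell
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
# $1 = MTU configured on the vmkernel interface and the physical path.
vmkping_payload_size() {
    mtu="$1"
    echo $(( mtu - 28 ))
}
```

Example use for a jumbo-frame vSAN network: `vmkping -I vmk1 -d -s "$(vmkping_payload_size 9000)" <peer vSAN IP>`.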
VMware vSAN 7.0.x
VMware vSAN 8.0.x
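To check the status of the VMNIC used for vSAN traffic, along with its current driver and firmware version, run the following command. The Driver, Firmware Version, and Version fields are reported under Driver Info by esxcli network nic get -n <vmnic interface>.
esxcli network nic stats get -n <vmnic interface>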
Driver: bnxtnet
Firmware Version: 226.0.145.0 /pkg 226.1.107000
Version: 231.0.153.0
1. Validate the current device driver version; it should be compatible with the firmware version you are upgrading to.
2. Upgrade the VMNIC firmware to the latest version that the hardware vendor lists as compatible, and validate the driver and firmware combination in the Broadcom Compatibility Guide.
According to the Broadcom Compatibility Guide, for the current device driver (bnxtnet) version 231.0.153.0, the supported firmware is 231.1.162001.
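The comparison above can be scripted as a quick pre-check. This is a sketch under two assumptions: the NIC reports its firmware in the `<version> /pkg <package version>` form shown in this article, and the figure in the compatibility guide is compared against the `/pkg` value (which is what the formats here suggest). The function names `nic_firmware_pkg` and `nic_firmware_matches` are hypothetical, and the supported version must be looked up manually.

```shell
# Print the firmware package version (the token after "/pkg") from the
# "Firmware Version:" line of NIC driver info read on stdin.
nic_firmware_pkg() {
    sed -n 's/.*Firmware Version:.*\/pkg *\([0-9.]*\).*/\1/p' | head -n 1
}

# Succeed (exit status 0) when the installed package version equals the
# supported one.
# $1 = supported firmware version taken from the compatibility guide.
nic_firmware_matches() {
    supported="$1"
    installed="$(nic_firmware_pkg)"
    [ -n "$installed" ] && [ "$installed" = "$supported" ]
}
```

Example use: `esxcli network nic get -n <vmnic interface> | nic_firmware_matches 231.1.162001 || echo "firmware does not match the supported version"`.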
1. Other hosts may see all the cluster members until the impacted host is placed in maintenance mode.
2. Rebooting the host, or placing the host into maintenance mode and exiting it, may not help.
3. In environments where a standby vmnic is configured, it is recommended to verify the failover settings by bringing the active vmnic down and confirming that the standby vmnic takes over. esxtop with the n option can be used to check network traffic statistics. This helps isolate network-related issues.
4. Once you start upgrading a vSAN cluster, complete the upgrade as soon as possible, preferably within a week. Mixed ESXi versions in the same cluster, especially across major releases, are not a supported configuration and can cause performance problems and cluster instability, because hosts running different code are communicating within the same cluster. Mixed versions are supported ONLY during an upgrade, which is typically expected to complete within 24-48 hours for clusters with fewer than 32 hosts and within 48-72 hours for large clusters of 32-64 hosts.
5. You may proceed with the ESXi upgrade on the other nodes, because partition issues are expected behavior in a 3-node vSAN cluster. However, it is better to make the device driver and firmware versions compatible before the upgrade, so that the partition issue can be isolated if further investigation is required.
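To confirm whether a cluster is still running mixed ESXi versions during an upgrade, a simple inventory check can help. The sketch below assumes a hypothetical input format of one `<hostname> <esxi-version>` pair per line, collected by the operator (for example from `esxcli system version get` on each host). The function names are illustrative only.

```shell
# Read "<hostname> <esxi-version>" pairs on stdin (a hypothetical inventory
# format) and print the number of distinct versions found.
distinct_esxi_versions() {
    awk 'NF >= 2 { seen[$2] = 1 } END { n = 0; for (v in seen) n++; print n }'
}

# Succeed (exit status 0) when the inventory contains more than one ESXi version.
cluster_has_mixed_versions() {
    [ "$(distinct_esxi_versions)" -gt 1 ]
}
```

Example use: `cluster_has_mixed_versions < inventory.txt && echo "upgrade still in progress: mixed ESXi versions"`, where `inventory.txt` is a file you maintain.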
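To list the PCI vendor, device, sub-vendor, and sub-device IDs of each vmnic for the compatibility guide lookup:
vmkchdev -l | grep -i vmnic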