Error "vSphere HA detected a possible host failure of this host" observed after ESXi or NIC firmware upgrade
Article ID: 390275
Products
VMware vSAN
Issue/Introduction
Symptoms:
The environment is a vSAN cluster.
One of the hosts was upgraded through a cluster image using vSphere Lifecycle Manager (vLCM).
Post remediation, the host reports HA-related errors as seen in the Summary tab: "vSphere HA host status" "vSphere HA detected a possible host failure of this host."
Additionally, the following errors are observed at the cluster level: "vSAN network alarm 'vSAN MTU check (ping with large packet size)'" "vSAN network alarm 'vSAN Basic (unicast) connectivity check'"
Reverting the host to an older ESXi version does not fix these errors.
The issue is also seen after a NIC firmware upgrade that leaves one of the active vmnics down. This partitions the vSAN cluster, and the partition in turn triggers the HA-related errors.
vmkernel.log reports the following errors for the unicast network where data nodes fail to connect to each other over port 12321 on the vSAN network:
2025-06-03T06:43:37.903Z cpu76:2099736) CMMDSNet: CMMDSNetSendtoUnicastChannels: 1486: Throttled: 52d68c54-dddl-9a5c-0b6f-##########: Failed to send to unicast host '#.#.#.#:12321' on iface '#.#.#.#': Host is down.
2025-06-03T06:43:43.903Z cpu76:2099736) CMMDSNet: CMMDSNetSendtoUnicastChannels: 1486: Throttled: 52d68c54-dddl-9a5c-0b6f-##########: Failed to send to unicast host '#.#.#.#:12321' on iface '#.#.#.#': Host is down.
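When many of these messages are logged, it can help to summarize which peer and which local interface are affected. The following is a minimal sketch in Python, assuming the message format matches the excerpt above and that the addresses are IPv4. The function name summarize_unicast_failures is illustrative and is not part of any VMware tool.

```python
import re
from collections import Counter

# Pattern modeled on the CMMDSNet message format shown in the excerpt above.
# Assumption: peer addresses are IPv4, so the peer field contains no colon.
PATTERN = re.compile(
    r"^(?P<ts>\S+)\s.*CMMDSNetSendtoUnicastChannels:.*"
    r"Failed to send to unicast host '(?P<peer>[^':]+):(?P<port>\d+)' "
    r"on iface '(?P<iface>[^']+)': (?P<reason>.+?)\.?\s*$"
)


def summarize_unicast_failures(lines):
    """Count unicast send failures per (peer, local interface, reason)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            key = (match["peer"], match["iface"], match["reason"])
            counts[key] += 1
    return counts
```

For example, calling summarize_unicast_failures(open("vmkernel.log")) on a copy of the log returns a counter keyed by peer IP, local vSAN interface IP, and failure reason. A single peer with a high count and the reason "Host is down" is consistent with the partition described in this article.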
Environment
VMware vSAN 7.x
VMware vSAN 8.x
VMware vSAN 9.x
Cause
These errors are reported due to network partitioning between vSAN nodes.
The upgraded host fails to reach the other vSAN data node through the vSAN network.
Running the vmkping command on the upgraded host fails with the 'Host is down' error.
# vmkping -I vmkY x.x.x.x
PING #.#.#.#(#.#.#.#): 56 data bytes
sendto() failed (Host is down)
where vmkY is the VMkernel adapter of the upgraded host on which vSAN traffic is enabled.
where x.x.x.x is the IP address of the VMkernel adapter of the other data node on which vSAN traffic is enabled.
Resolution
Of the two physical NICs used for the vSAN-enabled vmkernel adapter, one is found to be faulty.
Placing the faulty physical NIC in the 'Unused' state using the following steps helps in re-establishing the network connectivity between the two vSAN data nodes.
In vSphere Client > Host gbmikvsan01 > Configure > Virtual Switches > Expand Switch vSwitch1 > Click on 'Manage Physical Adapters'.
Place vmnicX under the 'Unused Adapter' list, where vmnicX is the suspected physical adapter.
The second active physical NIC takes over once the faulty NIC is placed in the 'Unused' state.
Validate ping from each of the data nodes over the vSAN vmkernel network to the other node using the command:
# vmkping -I vmkY #.#.#.#
where vmkY is the VMkernel adapter of the host on which vSAN traffic is enabled.
where #.#.#.# is the IP address of the VMkernel adapter of the other data node on which vSAN traffic is enabled.
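The validation above can also be scripted. The following is a minimal sketch in Python, not an official procedure. It builds the vmkping argument list for a given VMkernel adapter and peer, and checks captured output for the 'sendto() failed' message shown in the Cause section. As an assumption beyond this article, it optionally adds a large-packet test using the vmkping -d (do not fragment) and -s (payload size) options, which relates to the 'vSAN MTU check' alarm. The payload size is the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header. The helper names vmkping_command and sendto_failure are illustrative.

```python
import re

IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header

SENDTO_FAILED = re.compile(r"sendto\(\) failed \((?P<reason>[^)]+)\)")


def vmkping_command(vmk, peer_ip, mtu=None):
    """Build a vmkping argument list.

    With mtu set, add a do-not-fragment test whose payload fills the MTU
    (for example, 9000 - 20 - 8 = 8972 bytes).
    """
    cmd = ["vmkping", "-I", vmk]
    if mtu is not None:
        cmd += ["-d", "-s", str(mtu - IP_HEADER - ICMP_HEADER)]
    return cmd + [peer_ip]


def sendto_failure(output):
    """Return the reason from a 'sendto() failed (...)' line, or None."""
    match = SENDTO_FAILED.search(output)
    return match["reason"] if match else None
```

For example, vmkping_command("vmkY", "x.x.x.x", mtu=9000) yields the arguments for a jumbo-frame test, and sendto_failure applied to the failing output in the Cause section returns "Host is down". After the faulty NIC is moved to 'Unused', the same check should return None for every peer.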