The ESXi host displays a critical cluster alert regarding its High Availability status. The vSphere Client flags the host with the following specific error message:
vSphere HA agent on host esx-### in cluster Cluster-### in Datacenter-## could not reach isolation address: ##.##.##.##
Running vmkping forced over the management vmkernel interface successfully reached the isolation gateway
vmkping -I vmk# <isolation address: ##.##.##.## >
TCP/UDP port 8182 was verified as open and listening across cluster peers using Netcat:
nc -zv <Peer_IP> 8182
VMware vSphere ESX 8.x
This issue stems from a transient network disruption on the Management Network interface (vmk0#).
When a vSphere HA host loses connectivity to both its master host and its designated isolation addresses, it triggers an isolation response election. In this scenario, even though the physical network connectivity was automatically restored, the FDM (Fault Domain Manager) daemon failed to clear its latch error state, causing the HA agent to remain in an error condition until it is manually forced to re-evaluate its status.
To clear the stale error state and force the FDM daemon to poll the network topology again, you must manually re-initialize the vSphere HA agent on the affected host.
Log in to the vSphere Client.
Navigate to the Hosts and Clusters inventory view.
Locate and right-click the affected host: esxi-###
Select Reconfigure for vSphere HA from the context menu.
Monitor the Recent Tasks pane at the bottom of the screen.
The task will unconfigure the HA agent, restart the FDM service, and re-introduce the host to the cluster election.
Once the task completes successfully, the isolation address alert will clear automatically.