This article describes how to troubleshoot vSphere High Availability (HA) issues in VMware vCenter Server.
HA agent state failure
Symptoms:
vSphere HA provides high availability for virtual machines by pooling the virtual machines and the hosts they reside on into a cluster. Hosts in the cluster are monitored and, in the event of a failure, the virtual machines on the failed host are restarted on alternate hosts.
When you create a vSphere HA cluster, a single host is automatically elected as the primary host. The primary host communicates with vCenter Server and monitors the state of all protected virtual machines and of the secondary hosts. Different types of host failures are possible, and the primary host must detect and appropriately deal with the failure. The primary host must distinguish between a failed host and one that is in a network partition or that has become network isolated. The primary host uses network and datastore heartbeating to determine the type of failure.
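The way the primary host combines network and datastore heartbeats can be illustrated with a small decision function. This is a simplified sketch, not the actual HA agent logic: the class, field, and function names are hypothetical, and the rule set is an assumption based on the description above (a host with no network heartbeat that still shows signs of life is treated as partitioned or isolated rather than failed).

```python
from dataclasses import dataclass
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    FAILED = "failed"
    PARTITIONED = "network partitioned"
    ISOLATED = "network isolated"


@dataclass
class HeartbeatObservation:
    """What the primary host can observe about one secondary host (hypothetical model)."""
    network_heartbeat: bool     # network heartbeats are arriving from the host
    datastore_heartbeat: bool   # the host is still updating its heartbeat datastore
    responds_to_ping: bool      # the host answers ICMP pings from the primary
    reports_isolated: bool      # the host has declared itself isolated


def classify_host(obs: HeartbeatObservation) -> HostState:
    # Network heartbeats arriving: nothing to diagnose.
    if obs.network_heartbeat:
        return HostState.LIVE
    # No sign of life on any channel: treat the host as failed,
    # so its virtual machines become candidates for restart.
    if not obs.datastore_heartbeat and not obs.responds_to_ping:
        return HostState.FAILED
    # The host is alive but unreachable over the management network.
    if obs.reports_isolated:
        return HostState.ISOLATED
    return HostState.PARTITIONED


if __name__ == "__main__":
    silent = HeartbeatObservation(False, False, False, False)
    print(classify_host(silent).value)
```

The point of the sketch is the ordering: the datastore heartbeat is consulted only after network heartbeats stop, which is what lets the primary host avoid restarting virtual machines that are still running on a partitioned or isolated host.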
To troubleshoot HA, check the following to find the issue:
For more information about HA in vCenter Server 7.x, see How vSphere HA Works.
Fault Domain Manager (FDM, the vSphere HA agent) configuration can fail if ESXi hosts are connected to switches with automatic anti-DoS features.
FDM supports jumbo frames, but the MTU setting must be consistent end to end on every device in the path.
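The end-to-end MTU requirement can be checked against an inventory of the devices between two hosts. The sketch below is illustrative only: the device names and MTU values are hypothetical, and the inventory is assumed to have been collected by hand or from your own tooling.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def find_mtu_mismatches(path_mtus: dict[str, int], expected_mtu: int) -> dict[str, int]:
    """Return every device on the path whose MTU differs from the expected value."""
    return {device: mtu for device, mtu in path_mtus.items() if mtu != expected_mtu}


if __name__ == "__main__":
    # Hypothetical inventory of one path between two hosts.
    path = {
        "esxi01 vmk0": 9000,
        "vSwitch0": 9000,
        "physical switch port": 1500,
        "esxi02 vmk0": 9000,
    }
    print(find_mtu_mismatches(path, expected_mtu=9000))
    print(max_ping_payload(9000))
```

For a 9000-byte MTU the payload size works out to 8972 bytes. That is the size typically used for a don't-fragment ping test between VMkernel interfaces (for example, vmkping with the -d and -s options) to confirm that jumbo frames pass unfragmented along the whole path.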
Some firewall devices block ICMP pings that have an ID of zero. In such cases, FDM could report that some or all secondary hosts cannot ping each other, and/or that the isolation addresses cannot be reached.
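To make the "ID of zero" condition concrete, the sketch below builds an ICMP echo request whose identifier field is 0, using the standard ICMP header layout (type, code, checksum, identifier, sequence). It only constructs the packet bytes. Sending them requires a raw socket and elevated privileges, which is left out here. The function names and payload are hypothetical and this is not how FDM itself generates its pings.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo request


def internet_checksum(data: bytes) -> int:
    """Standard 16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type, code, checksum, identifier, sequence, payload."""
    # The checksum is computed over the packet with its checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


if __name__ == "__main__":
    packet = build_echo_request(identifier=0, sequence=1, payload=b"fdm-test")
    print(packet.hex())
```

Bytes 4 and 5 of the resulting packet hold the identifier. A firewall rule that drops echo requests with a zero value there would produce the ping failures described above, even though ordinary pings from a workstation (which usually carry a non-zero identifier) succeed.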
Troubleshooting issues with FDM:
If the issue persists, file a support request with VMware Support and quote this Knowledge Base article ID (318936) in the problem description.