To troubleshoot this issue, ensure that heartbeat communications from the host to vCenter are functioning correctly and are being received by the vCenter Server.
The default port for this communication is UDP 902, but verify the configured port in the vpxa.cfg file on the host. This file also defines the IP address of the vCenter Server that manages the host.
Confirm that the vCenter Server managed IP address is consistent throughout the environment.
Check the ManagedIP row, and check the address reported by ipconfig or ifconfig.
Check the vpxa.cfg file for the heartbeat traffic port by running the command:
grep -i serverport /etc/vmware/vpxa/vpxa.cfg
If the setting is held in the ESXi configuration store instead, run:
configstorecli config current get -c esx -g services -k vpxa_solution_user_config | grep -i server_port
Check the vpxa.cfg file for the managed IP address by running the command:
grep -i serverIp /etc/vmware/vpxa/vpxa.cfg
If the setting is held in the ESXi configuration store instead, run:
configstorecli config current get -c esx -g services -k vpxa_solution_user_config | grep -i server_ip
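The two grep checks above return whole lines, markup included. A minimal sketch of a helper that prints only the value is shown below. It assumes that vpxa.cfg stores each setting as an XML-style element on a single line (for example `<serverPort>902</serverPort>`); the function name `get_vpxa_value` is hypothetical and is not part of ESXi.

```shell
#!/bin/sh
# get_vpxa_value TAG [FILE]
# Hypothetical helper: print the value of the first element whose tag
# matches TAG (case-insensitive), e.g. serverIp or serverPort.
# Assumes one XML-style element per line, such as <serverPort>902</serverPort>.
get_vpxa_value() {
    tag="$1"
    file="${2:-/etc/vmware/vpxa/vpxa.cfg}"

    # 1. Find lines containing the opening tag, ignoring case.
    # 2. Keep only the first match.
    # 3. Strip every <...> tag, leaving the element text.
    # 4. Remove spaces, tabs, and carriage returns around the value.
    grep -i -- "<${tag}>" "$file" \
        | head -n 1 \
        | sed 's/<[^>]*>//g' \
        | tr -d ' \t\r'
}
```

On the host, `get_vpxa_value serverPort` and `get_vpxa_value serverIp` would then print the heartbeat port and the managed IP address, which can be compared directly with the values expected on the vCenter Server side.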
Test connectivity between vCenter Server and the ESXi host through the heartbeat network.
Because the heartbeat packets are sent to a UDP port, port connectivity cannot be checked with netcat: UDP is connectionless, so a test with the UDP flag ("-u") will always appear to succeed.
Instead, determine whether the vCenter Server is receiving the heartbeat packets by running a packet capture on the vCenter Server itself.
To do so, open an SSH session to the vCenter Server, or connect to the appliance through a remote console, type "shell" to launch the Bash shell prompt, and run the following command:
tcpdump src xxx.xxx.xxx.xxx and udp port 902 -nn
where xxx.xxx.xxx.xxx is the management IP address of the host that is disconnecting.
Because the packets are sent only once every 10 seconds, let the capture run for at least 10 seconds to determine whether they are being received correctly. Once the desired information has been gathered, stop the capture with "Ctrl+C".
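Since the heartbeat interval is 10 seconds, the capture can also be bounded so that it ends on its own. The sketch below wraps the same tcpdump filter in two hypothetical helper functions (`heartbeat_filter` and `capture_heartbeats` are illustrative names, not VMware tools). It assumes that the `timeout` utility is available in the appliance shell, and the defaults of three packets and 35 seconds are arbitrary choices that cover a few heartbeat intervals.

```shell
#!/bin/sh
# heartbeat_filter HOST_IP [PORT]
# Build the capture filter used in the command above.
heartbeat_filter() {
    host_ip="$1"
    port="${2:-902}"
    printf 'src %s and udp port %s' "$host_ip" "$port"
}

# capture_heartbeats HOST_IP [PORT] [SECONDS]
# Run tcpdump until three matching packets arrive or SECONDS elapse.
#   -nn  : do not resolve addresses or port numbers to names
#   -c 3 : exit after three matching packets
# timeout ends the capture if fewer than three packets are seen,
# in which case its exit status is 124.
capture_heartbeats() {
    host_ip="$1"
    port="${2:-902}"
    seconds="${3:-35}"
    timeout "$seconds" tcpdump -nn -c 3 "$(heartbeat_filter "$host_ip" "$port")"
}
```

Running `capture_heartbeats xxx.xxx.xxx.xxx` on the vCenter Server, with the host's management IP address substituted, should print up to three heartbeat packets. If the command ends through the timeout without printing any packets, the heartbeats are probably not reaching the vCenter Server.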
Note: Heartbeats are sent only from the host to vCenter Server over UDP port 902. Checking connectivity from the host to vCenter Server over TCP 902 with netcat or a similar command is expected to fail, because this port is not needed for connectivity in that direction (though TCP 902 from vCenter Server to the host is).
Test network congestion: