Confirming Packet Continuity
- To confirm that the ESXi host is sending heartbeat packets to vCenter every 10 seconds, run the following command from an SSH session on the ESXi host:
esxi# tcpdump-uw dst host <vcenter_ip_address> and udp port 902
- To confirm that the heartbeats are reaching vCenter on UDP port 902 every 10 seconds, run the following command from an SSH session on the vCenter Server Appliance:
vcsa# tcpdump src host <esxi_host_ip_address> and udp port 902
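Watching a live capture for a missing 10-second packet is error-prone, so it can help to save the capture and check the spacing offline. The following is a minimal sketch, not a VMware tool. It assumes the capture was taken with tcpdump's -tt option, so that each line starts with an epoch timestamp, and the function names, the 2-second slack and the output format are illustrative choices.

```python
import re

# Heartbeat interval stated in this article: one packet every 10 seconds.
HEARTBEAT_INTERVAL = 10.0

# With -tt, tcpdump prints an epoch timestamp (seconds.microseconds)
# at the start of every packet line.
_TIMESTAMP = re.compile(r"^(\d+\.\d+)\s")


def parse_timestamps(lines):
    """Return the epoch timestamps found at the start of each capture line."""
    stamps = []
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match:  # lines without a leading timestamp are skipped
            stamps.append(float(match.group(1)))
    return stamps


def find_gaps(stamps, interval=HEARTBEAT_INTERVAL, slack=2.0):
    """Return (previous, current, seconds) for every gap above interval + slack."""
    gaps = []
    for previous, current in zip(stamps, stamps[1:]):
        delta = current - previous
        if delta > interval + slack:
            gaps.append((previous, current, round(delta, 3)))
    return gaps


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python3 hb_gaps.py /tmp/hb.txt
    with open(sys.argv[1]) as capture:
        for previous, current, seconds in find_gaps(parse_timestamps(capture)):
            print(f"gap of {seconds}s between {previous} and {current}")
```

A capture file for this script could be produced on the vCenter Server Appliance with, for example, vcsa# tcpdump -tt -ni eth0 src host <esxi_host_ip_address> and udp port 902 > /tmp/hb.txt (the interface name and output path are assumptions). Any reported gap marks a period in which one or more heartbeats did not arrive.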
Next steps
- If heartbeats are being sent by the ESXi host but are not reaching vCenter, investigate the network between the two machines for firewalls or other connection-limiting mechanisms.
- If heartbeats are not being sent by the host, investigate the ESXi services and log files for a possible cause.
- If heartbeats are being sent by ESXi and received by vCenter, the problem is not caused by heartbeat traffic being blocked on the network. Investigate the vCenter Server for reasons why the hosts are intermittently disconnecting.
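The three cases above form a small decision table. The sketch below restates it in Python for use in a runbook or script. The function name and the wording of the returned messages are illustrative, and the two inputs are assumed to come from the captures taken on the ESXi host and on vCenter.

```python
def next_step(sent_by_esxi, received_by_vcenter):
    """Map the two capture results to the follow-up action in this article."""
    if not sent_by_esxi:
        # Nothing leaves the host, so the network is not yet the suspect.
        return "Investigate the ESXi services and log files."
    if not received_by_vcenter:
        # Packets leave the host but are lost before they reach vCenter.
        return "Investigate the network path for firewalls or other limits."
    # Heartbeats arrive, so a network block of heartbeat traffic is ruled out.
    return "Investigate the vCenter Server for the cause of the disconnects."


if __name__ == "__main__":
    print(next_step(sent_by_esxi=True, received_by_vcenter=False))
```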
Note: UDP port 902 connectivity between vCenter and the ESXi host can also be checked with the following commands:
On vCenter:
vcsa# tcpdump -ni eth0 host <esxi_host_ip_address> and udp port 902
The expected output is a heartbeat packet from the ESXi host on port 902, received on vCenter every 10 seconds.
On ESXi host:
esxi# pktcap-uw --vmk vmk0 --dstudpport 902 --dir 0 -o - | tcpdump-uw -enr -
TCP port 902 connectivity can also be checked from vCenter. This validates TCP connectivity only; it does not confirm that UDP heartbeats are getting through.
vcsa# curl -v telnet://<esxi_host_ip_address>:902
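Where curl is not convenient, the same TCP check can be approximated with Python's standard socket module. This is a minimal sketch under the same limitation as the curl command: it only shows that a TCP connection to port 902 can be opened and says nothing about UDP heartbeat traffic. The function name and the 3-second timeout are illustrative.

```python
import socket


def tcp_port_open(host, port=902, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python3 check_902.py <esxi_host_ip_address>
    print("open" if tcp_port_open(sys.argv[1]) else "closed or filtered")
```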
Important: Ensure that any firewall between vCenter and the ESXi host allows UDP port 902 traffic in both directions.
Workaround
As a temporary workaround, increase the timeout limit in vCenter Server by editing or creating the advanced setting config.vpxd.heartbeat.notRespondingTimeout.
Note: Increasing the timeout is a short-term solution until any network issues can be resolved.
vSphere Client:
To increase the timeout limit to 120 seconds (adjust the value as needed):
- Open the vSphere Client in a web browser and log in.
- Select the vCenter object from the inventory under Hosts and Clusters.
- Select the Manage or Configure tab.
- Select Settings > Advanced Settings.
- Click Edit.
- In the Key field, type:
config.vpxd.heartbeat.notRespondingTimeout
- In the Value field, type:
120
- Click Add.
- Click OK.
- Restart the vCenter Server service.
vcsa# service-control --stop vmware-vpxd && service-control --start vmware-vpxd