ESXi host disconnects intermittently from vCenter Server
Article ID: 318647
Products
VMware vCenter Server
VMware vSphere ESXi
Issue/Introduction
ESX/ESXi hosts disconnect frequently from vCenter Server.
vCenter Server shows ESX/ESXi host(s) as not responding
vCenter Server intermittently fails to receive ESX/ESXi heartbeats
/var/log/vmware/vpxd/vpxd.log contains entries similar to:
[<YYYY-MM-DD>T<time> verbose 'App'] [VpxdIntHost] Missed 2 heartbeats for host esx.example.com
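To see which hosts are affected and how often, the log entries can be tallied per host. The following is a minimal sketch, assuming the entries follow the sample format shown above; the function name and the sample timestamps are illustrative only and do not come from a real log.

```python
import re
from collections import Counter

# Pattern derived from the sample vpxd.log entry in this article.
MISSED_RE = re.compile(r"Missed (\d+) heartbeats? for host (\S+)")

def count_missed_heartbeats(lines):
    """Return a Counter mapping host name -> number of 'Missed N heartbeats' entries."""
    counts = Counter()
    for line in lines:
        match = MISSED_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts

# Illustrative lines only (timestamps and the unrelated entry are made up).
sample = [
    "[2024-01-01T10:00:00 verbose 'App'] [VpxdIntHost] Missed 2 heartbeats for host esx.example.com",
    "[2024-01-01T10:00:05 info 'App'] unrelated entry",
    "[2024-01-01T10:01:00 verbose 'App'] [VpxdIntHost] Missed 3 heartbeats for host esx.example.com",
]
print(count_missed_heartbeats(sample))
```

On a vCenter Server Appliance, the same function could be fed the lines of /var/log/vmware/vpxd/vpxd.log instead of the sample list.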
Environment
VMware vCenter Server 7.0.x VMware vCenter Server 8.0.x
VMware ESXi 7.x
VMware ESXi 8.x
Cause
This issue occurs when the UDP heartbeat messages sent by the ESX/ESXi host are not received by vCenter Server. If vCenter Server does not receive the UDP heartbeat messages, it treats the host as not responding. ESX/ESXi hosts send heartbeats every 10 seconds, and vCenter Server has a window of 60 seconds to receive them. This behavior can indicate a congested network between the ESX/ESXi host and vCenter Server.
Note: If the host disconnects every 60 seconds, a firewall is likely blocking the UDP 902 heartbeats from the ESXi host to vCenter.
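The timing described above can be illustrated with a small model. This is a simplified sketch, assuming the host is flagged once no heartbeat arrives for longer than the 60-second window; vCenter's internal logic is not documented in this article, and the function name and arrival times are hypothetical.

```python
HEARTBEAT_INTERVAL_S = 10  # from the article: hosts send a heartbeat every 10 seconds
RESPONSE_WINDOW_S = 60     # from the article: vCenter's window for receiving heartbeats

def find_silent_gaps(arrival_times, window=RESPONSE_WINDOW_S):
    """Return (start, end) pairs where consecutive heartbeats are more than `window` seconds apart."""
    ordered = sorted(arrival_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > window]

# Hypothetical arrival times in seconds: heartbeats stop after t=30 and resume at t=100.
arrivals = [0, 10, 20, 30, 100, 110]
print(find_silent_gaps(arrivals))

# Roughly how many heartbeat intervals fit into the window.
print(RESPONSE_WINDOW_S // HEARTBEAT_INTERVAL_S)
```

Under this model, an occasional lost packet goes unnoticed, while a silence spanning about six heartbeat intervals marks the host as not responding.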
Resolution
Confirming Packet Continuity
To confirm that the ESXi host is sending heartbeat packets to vCenter every 10 seconds, run the following command from an SSH session to the ESXi host:
tcpdump-uw dst host <vcenter_ip_address> and udp port 902
To confirm that heartbeats are reaching vCenter on UDP port 902 every 10 seconds, run the following command from an SSH session to the vCenter Server Appliance:
tcpdump src host <esxi_host_ip_address> and udp port 902
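Checking the 10-second cadence by eye is error-prone on a long capture. The sketch below assumes tcpdump's default output, where each packet line begins with an HH:MM:SS.ffffff timestamp, and computes the spacing between consecutive packets. The sample lines, addresses, and the 15-second threshold are illustrative assumptions, not real capture output, and a capture that crosses midnight is not handled.

```python
from datetime import datetime

def heartbeat_intervals(tcpdump_lines):
    """Return seconds between consecutive packets, read from the leading HH:MM:SS.ffffff timestamp."""
    stamps = []
    for line in tcpdump_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            stamps.append(datetime.strptime(fields[0], "%H:%M:%S.%f"))
        except ValueError:
            continue  # skip banner lines or anything that does not start with a timestamp
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

# Made-up lines for illustration (documentation-range addresses, arbitrary port and length).
capture = [
    "10:00:00.000000 IP 192.0.2.10.50000 > 192.0.2.20.902: UDP, length 64",
    "10:00:10.000000 IP 192.0.2.10.50000 > 192.0.2.20.902: UDP, length 64",
    "10:00:40.500000 IP 192.0.2.10.50000 > 192.0.2.20.902: UDP, length 64",
]
intervals = heartbeat_intervals(capture)
late = [s for s in intervals if s > 15]  # 15 s is an arbitrary tolerance above the 10 s interval
print(intervals, late)
```

Intervals that cluster near 10 seconds suggest heartbeats are flowing; repeated large intervals on the vCenter side, but not on the ESXi side, point toward loss in the network path.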
Next steps
If heartbeats are being sent by the ESXi host but are not reaching vCenter, investigate the network between the two machines for firewalls or other connection-limiting mechanisms.
If heartbeats are not being sent by the host, investigate the ESXi services and log files for a possible cause.
If heartbeats are both being sent by ESXi and being received by vCenter, the problem is not related to a network block of heartbeat traffic. Investigate the vCenter Server for reasons why the hosts are intermittently disconnecting.
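The three outcomes above amount to a small decision table. The sketch below restates them as a function for use in a troubleshooting script; the function and parameter names are hypothetical, and the two inputs are the results of the captures on the ESXi host and on vCenter.

```python
def next_step(sent_by_host: bool, received_by_vcenter: bool) -> str:
    """Map the two capture results to the follow-up action described in this article."""
    if not sent_by_host:
        return "Investigate the ESXi services and log files on the host."
    if not received_by_vcenter:
        return "Investigate the network path for firewalls or other connection-limiting mechanisms."
    return "Investigate vCenter Server; heartbeat traffic is not being blocked."

print(next_step(sent_by_host=True, received_by_vcenter=False))
```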
Note: UDP port 902 connectivity between vCenter and the ESXi host can also be checked with the following commands.
On vCenter:
tcpdump -ni eth0 host <esxi_host_ip_address> and udp port 902
The expected output is a heartbeat packet from the ESXi host on port 902, received on vCenter every 10 seconds.
TCP port 902 connectivity can also be checked from vCenter. This validates TCP connectivity only; it does not confirm that UDP heartbeats are getting through.
curl -v telnet://<esxi_host_ip_address>:902
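Where the check needs to be scripted across many hosts, the same TCP test can be done with a plain socket connection. This is a minimal sketch using Python's standard library; the function name and the 5-second default timeout are assumptions. Like the curl command, it validates TCP only and says nothing about UDP heartbeat delivery.

```python
import socket

def tcp_port_open(host, port=902, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling tcp_port_open with the ESXi host's IP address returns True when the port accepts a connection and False otherwise.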
Important: Ensure that any firewall between vCenter and the ESXi host allows UDP port 902 traffic in both directions.
Workaround
As a temporary workaround, you can increase the timeout limit in vCenter Server by editing or creating the Advanced Setting config.vpxd.heartbeat.notRespondingTimeout.
Note: Increasing the timeout is a short-term solution until any network issues can be resolved.
vSphere Client:
To increase the timeout limit to 120 seconds (vary as needed):
Open the vSphere Web Client or vSphere Client in a web browser and log in.
Select the vCenter object from the inventory under Hosts and Clusters.