ESXi host disconnects intermittently from vCenter Server

Article ID: 318647

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

  • ESX/ESXi hosts disconnect frequently from vCenter Server.
  • vCenter Server shows ESX/ESXi hosts as not responding.
  • vCenter Server intermittently fails to receive ESX/ESXi heartbeats.
  • /var/log/vmware/vpxd/vpxd.log contains entries similar to:

    [<YYYY-MM-DD>T<time> verbose 'App'] [VpxdIntHost] Missed 2 heartbeats for host esx.example.com
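To see which hosts are affected most often, the log entries can be counted per host. The following is a minimal sketch, assuming each matching line ends with the host name as in the sample entry above; missed_heartbeats_by_host is a hypothetical helper name, not a VMware tool.

```shell
# Summarize "Missed N heartbeats" entries per host from vpxd.log-style input.
# Assumption: the host name is the last field of each matching line.
# Reads log lines on stdin; prints "<count> <host>", highest count first.
missed_heartbeats_by_host() {
    grep 'Missed [0-9][0-9]* heartbeats for host' |
        awk '{ print $NF }' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Example invocation on vCenter Server (log path as given above):
# missed_heartbeats_by_host < /var/log/vmware/vpxd/vpxd.log
```

A host that appears far more often than the others points toward a problem on that host or its network path rather than on vCenter Server itself.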

Environment

VMware vCenter Server 6.7.x
VMware vCenter Server Appliance 5.5.x
VMware vCenter Server Appliance 6.0.x
VMware vCenter Server 7.0.x
VMware vCenter Server Appliance 6.5.x
VMware vCenter Server 6.0.x
VMware vCenter Server 5.5.x
VMware vCenter Server Appliance 6.7.x
VMware vCenter Server 6.5.x
VMware vCenter Server 8.0.x

Cause

This issue occurs when the UDP heartbeat messages sent by an ESX/ESXi host are not received by vCenter Server. If vCenter Server does not receive the UDP heartbeat messages, it treats the host as not responding. ESX/ESXi hosts send heartbeats every 10 seconds, and vCenter Server has a window of 60 seconds in which to receive them. This behavior can indicate a congested network between the ESX/ESXi host and vCenter Server.

Note: If the host disconnects every 60 seconds, a firewall is likely blocking the UDP port 902 heartbeats from the ESXi host to vCenter Server.
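The timing described above can be illustrated with a simplified model: given the times at which heartbeats arrive, any gap longer than the timeout window would cause the host to be treated as not responding. This is only a sketch of the rule as stated in this article, not vCenter Server's actual implementation; find_heartbeat_gaps is a hypothetical helper name, and the input is assumed to be one arrival time in seconds per line.

```shell
# Simplified model of the heartbeat window described above.
# Input: one heartbeat arrival time (in seconds) per line, in ascending order.
# Argument 1: timeout window in seconds (defaults to 60).
# Output: "gap <last_seen> <next_seen> <length>" for each gap over the timeout.
find_heartbeat_gaps() {
    awk -v timeout="${1:-60}" '
        NR > 1 && ($1 - prev) > timeout { print "gap", prev, $1, $1 - prev }
        { prev = $1 }
    '
}

# Heartbeats arrive at 0, 10 and 20 s, then nothing until 95 s.
printf '%s\n' 0 10 20 95 105 | find_heartbeat_gaps
```

With the default 60-second window, the 75-second silence between 20 s and 95 s is reported. Passing a larger window, for example `find_heartbeat_gaps 120`, reports nothing for the same input.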

Resolution

Confirming Packet Continuity 

  1. To confirm the ESXi host is sending heartbeat packets to the vCenter every 10 seconds, use the following command from an SSH session to the ESXi host.

    tcpdump-uw dst host <vcenter_ip_address> and udp port 902

  2. To confirm that heartbeats are reaching vCenter on UDP port 902 every 10 seconds, use the following command from an SSH session to the vCenter Server Appliance.

    tcpdump src host <esxi_host_ip_address> and udp port 902

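Reading the 10-second cadence off raw capture output by eye is error-prone, so the interval between packets can be computed instead. The sketch below assumes the capture is run with tcpdump's -tt option, which prints each timestamp as seconds since the epoch in the first field, and with -c to stop after a fixed packet count. It also assumes tcpdump-uw on ESXi accepts the same options. heartbeat_intervals is a hypothetical helper name.

```shell
# Print the interval, in seconds, between consecutive captured packets.
# Assumption: input is `tcpdump -tt` output, so field 1 is an epoch timestamp.
heartbeat_intervals() {
    awk 'NR > 1 { printf "%.1f\n", $1 - prev } { prev = $1 }'
}

# Example invocations (stop after 7 packets, giving 6 intervals):
# On the ESXi host:
#   tcpdump-uw -tt -c 7 dst host <vcenter_ip_address> and udp port 902 | heartbeat_intervals
# On the vCenter Server Appliance:
#   tcpdump -tt -c 7 src host <esxi_host_ip_address> and udp port 902 | heartbeat_intervals
```

Intervals close to 10 on both sides suggest heartbeats are being sent and delivered. Values near 10 on the ESXi host but larger or missing on vCenter suggest packets are being lost in between.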
Next steps

  • If heartbeats are being sent by the ESXi host, but not reaching the vCenter, the network between the two machines needs to be investigated further for firewalls or other connection limiting mechanisms.
  • If heartbeats are not being sent by the host, investigate the ESXi services and log files for a possible cause.
  • If heartbeats are both being sent by ESXi and being received by vCenter, the problem is not related to a network block of heartbeat traffic. Investigate the vCenter Server for reasons why the hosts are intermittently disconnecting. 
 

Workaround

As a temporary workaround, you can increase the timeout limit in vCenter Server by editing or creating the Advanced Setting config.vpxd.heartbeat.notRespondingTimeout.

Note: Increasing the timeout is a short-term solution until any network issues can be resolved.
 
vSphere Client:
 
To increase the timeout limit to 120 seconds (vary as needed):
  1. Open the vSphere Web Client or vSphere Client in a web browser and log in.
  2. Select the vCenter object from the inventory under Hosts and Clusters.
  3. Select the Manage or Configure tab.
  4. Select Settings > Advanced Settings.
  5. Click Edit.
  6. In the Key field, type:

    config.vpxd.heartbeat.notRespondingTimeout
     
  7. In the Value field, type:

    120
     
  8. Click Add.
  9. Click OK.
  10. Restart the vCenter Server service.

    service-control --stop vmware-vpxd && service-control --start vmware-vpxd

Additional Information