ESXi host disconnects from vCenter Server due to UDP 902 heartbeat blockage

Article ID: 323612

Products

VMware vSphere ESXi
VMware vCenter Server

Issue/Introduction

This article helps identify issues with heartbeat traffic between vCenter Server and ESXi causing the host to disconnect and enter a "not responding" state.

The steps in this article, together with the linked Knowledge Base articles, help determine whether heartbeat packets between the ESXi host and vCenter Server are being dropped or lost. The following symptoms may be observed:
  • An ESXi host disconnects from vCenter Server.
  • An ESXi host enters an unresponsive or "not responding" state.
  • After adding or reconnecting an ESXi host to the vCenter Server inventory, the host disconnects 30 to 90 seconds after the task completes.
  • Changing the uplink switch port VLAN configuration to match the new IP address of the ESXi host before changing the host's IP address results in the host showing as disconnected in vCenter Server.
  • Changing the IP address of the ESXi host using the DCUI without first removing the host from the vCenter Server inventory results in the host showing up as disconnected in vCenter Server.
  • On the ESXi host, /var/run/log/vpxa.log shows heartbeating restarting every minute:

    [YYYY-MM-DDTHH:MM:SS] In(166) Vpxa[2099748]: [Originator@6876 sub=Heartbeat opID=HostSync-host-#####-4ff474ef-5c] Started heartbeating..
    [YYYY-MM-DDTHH:MM:SS] In(166) Vpxa[2099766]: [Originator@6876 sub=Heartbeat opID=HostSync-host-#####-2ed62a9f-c0] Started heartbeating.. 
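
To make the "every minute" pattern easier to spot, the log entries can be counted per minute. The following is a minimal sketch, not a VMware-provided tool. The function name heartbeat_restarts_per_minute is hypothetical, and it assumes that each log line begins with an ISO 8601 timestamp (optionally wrapped in square brackets) and that awk and sort are available in the shell where it runs.

```shell
# Count "Started heartbeating" entries per minute in a vpxa log.
# Assumption: the first whitespace-separated field of each line is an
# ISO 8601 timestamp, optionally wrapped in "[...]".
heartbeat_restarts_per_minute() {
    awk '/Started heartbeating/ {
        ts = $1
        sub(/^\[/, "", ts)              # drop a leading "[" if present
        count[substr(ts, 1, 16)]++      # key on YYYY-MM-DDTHH:MM
    }
    END { for (minute in count) print minute, count[minute] }' "$1" | sort
}

# Example usage on the ESXi host:
#   heartbeat_restarts_per_minute /var/run/log/vpxa.log
```

One entry in each consecutive minute is consistent with the symptom described in this article.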

Environment

VMware vSphere ESXi 6.x / 7.x / 8.x
VMware vCenter Server 6.x / 7.x / 8.x

Cause

Heartbeat packets (UDP port 902) are dropped, blocked, or lost between the ESXi host and vCenter Server.
vCenter Server expects a heartbeat every 10 seconds; if 6 consecutive heartbeats (60 seconds) are missed, the host is marked as disconnected.
The most common reason for missed heartbeats is a firewall blocking the UDP 902 packets from being delivered.
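
The timing rule above can be illustrated with a small calculation. This is only a sketch of the arithmetic described in this article (10-second interval, 6 consecutive misses), not vCenter Server's actual implementation. The function and variable names are hypothetical.

```shell
HEARTBEAT_INTERVAL=10   # seconds between heartbeats (from this article)
MISSED_LIMIT=6          # consecutive missed heartbeats before disconnect

# Print the state implied by the number of seconds since the last heartbeat.
heartbeat_state() {
    elapsed=$1
    limit=$((HEARTBEAT_INTERVAL * MISSED_LIMIT))   # 60 seconds
    if [ "$elapsed" -ge "$limit" ]; then
        echo "not responding"
    else
        echo "connected ($((elapsed / HEARTBEAT_INTERVAL)) missed)"
    fi
}

# Example usage:
#   heartbeat_state 25
#   heartbeat_state 60
```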

Resolution

To troubleshoot this issue, ensure that heartbeat communications from the host to vCenter are functioning correctly and are being received by the vCenter Server.

The default port for this communication is UDP 902, but verify the configured port in the /etc/vmware/vpxa/vpxa.cfg file on the host. This file also defines the IP address of the vCenter Server that manages the host.

Confirm vCenter Server Managed IP Address

Confirm that the vCenter Server managed IP address is consistent throughout the environment.

  1. Determine the managed IP address of the vCenter Server:
     
    1. Connect to vCenter Server with the vSphere Client.
    2. Click Administration > vCenter Server Settings > Advanced Settings.
    3. Make a note of the IP address in the ManagedIP row.
       
  2. Determine the IP address configured for vCenter Server:

    For vCenter Server installed on a Windows Server:
     
    1. From a console or RDP session to the vCenter Server desktop, open a command prompt.
    2. Run the command:

      ipconfig
       
    3. Make a note of the IP address and ensure that it matches the managed IP address found in step 1.

    For vCenter Server Appliance:
     
    1. From a console or SSH session to the vCenter Server Appliance, open a shell prompt. For more information, see Opening a command or shell prompt.

      Note: From the console of the vCenter Server Appliance, press enter on Login.
       
    2. Run the command:

      ifconfig
       
    3. Make a note of the IP address next to inet addr: and ensure that it matches the managed IP address found in step 1.
       
  3. Determine the IP address and port that the ESXi host is using for heartbeat traffic:
     
    1. Connect to the affected ESXi host using SSH.
    2. Check the vpxa configuration for the heartbeat traffic port by running the command for your ESXi version:
      • On ESXi 6.x:

        grep -i serverport /etc/vmware/vpxa/vpxa.cfg


      • On ESXi 7.0U3+:

        configstorecli config current get -c esx -g services -k vpxa_solution_user_config | grep -i server_port


    3. Ensure that the port number matches the default heartbeat port of 902.
    4. Check the vpxa configuration for the managed IP address by running the command for your ESXi version:
      • On ESXi 6.x:

        grep -i serverIp /etc/vmware/vpxa/vpxa.cfg

      • On ESXi 7.0U3+:

        configstorecli config current get -c esx -g services -k vpxa_solution_user_config | grep -i server_ip

    5. Ensure that the IP address matches the managed IP address found in Step 1.

      Note: If the IP address is not the same as the one noted in Step 1, see vCenter Server IP address change causes ESX hosts to disconnect.
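
For hosts that still use vpxa.cfg (the ESXi 6.x case above), the two grep checks and the comparison against the managed IP can be combined. The following is a minimal sketch under stated assumptions: the helper names cfg_value and check_vpxa_cfg are hypothetical, and it assumes the values appear in the file as <serverIp>...</serverIp> and <serverPort>...</serverPort> elements, each on a single line. It does not cover hosts where the settings are read with configstorecli.

```shell
# Print the text of the first <tag>value</tag> element found in a file.
cfg_value() {
    sed -n "s:.*<$1>\(.*\)</$1>.*:\1:p" "$2" | head -n 1
}

# Compare serverIp and serverPort in a vpxa.cfg file with the expected values.
# Usage: check_vpxa_cfg <path-to-vpxa.cfg> <managed-ip> [port]
check_vpxa_cfg() {
    cfg=$1
    managed_ip=$2
    expected_port=${3:-902}     # default heartbeat port from this article
    ip=$(cfg_value serverIp "$cfg")
    port=$(cfg_value serverPort "$cfg")
    status=0
    if [ "$ip" != "$managed_ip" ]; then
        echo "serverIp is $ip, expected $managed_ip"
        status=1
    fi
    if [ "$port" != "$expected_port" ]; then
        echo "serverPort is $port, expected $expected_port"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "OK: heartbeats go to $ip:$port"
    fi
    return "$status"
}

# Example usage on the ESXi host, where 192.0.2.10 stands in for the
# managed IP address noted in Step 1:
#   check_vpxa_cfg /etc/vmware/vpxa/vpxa.cfg 192.0.2.10
```

A non-zero exit status indicates that at least one value differs from what was expected.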

Connectivity

  • Test connectivity between vCenter Server and the ESXi host through the heartbeat network.
  • Because heartbeats are sent over UDP, which is connectionless, netcat cannot verify port connectivity: a test with the UDP flag ("-u") always reports success. Instead, confirm that vCenter Server is receiving the heartbeat packets by running a packet capture on the vCenter Server itself.
    • To do so, open an SSH session to the vCenter Server Appliance, or connect to it using a remote console, type "shell" to launch the Bash shell prompt, and run the following command:

      tcpdump -nn src ###.###.###.### and udp port 902

      where ###.###.###.### is the management IP of the host that is disconnecting.

  • Because the packets are sent only once every 10 seconds, let the capture run for longer than 10 seconds, long enough to see several packets, to determine whether they are being received correctly.
  • Once the desired information has been gathered, stop the capture with Ctrl+C.

Note: Heartbeats are sent only from the host to vCenter Server, over UDP port 902. A connectivity check from the host to vCenter Server over TCP 902 using netcat or a similar command is expected to fail, as this port is not needed for connectivity (though vCenter Server to host over TCP 902 is).
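
The capture can also be bounded so that it stops on its own. The following is a minimal sketch, assuming the coreutils timeout command is available in the appliance's Bash shell. The function names heartbeat_filter and capture_heartbeats are hypothetical. The 35-second window is chosen so that three packets sent 10 seconds apart have time to arrive.

```shell
# Build the capture filter for heartbeats from one ESXi host.
# Usage: heartbeat_filter <host-management-ip> [port]
heartbeat_filter() {
    printf 'src host %s and udp port %s' "$1" "${2:-902}"
}

# Wait up to 35 seconds for three heartbeat packets from the host.
# Exit status 0 means three packets arrived; 124 means timeout expired first.
# Run as root on the vCenter Server, because tcpdump needs capture privileges.
capture_heartbeats() {
    timeout 35 tcpdump -nn -c 3 "$(heartbeat_filter "$1" "${2:-}")"
}

# Example usage, where 192.0.2.20 stands in for the host's management IP:
#   capture_heartbeats 192.0.2.20
```

If the command ends with status 124 and no packets are printed, the heartbeats are likely being dropped before they reach vCenter Server.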

Congestion

Test network congestion:

Other Troubleshooting Areas

Additional Information