Troubleshooting an ESXi host in a "not responding"/"disconnected" state

Article ID: 344682

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

This article provides steps to troubleshoot an ESXi host that is in a disconnected or not responding state in vCenter. It also helps eliminate common causes by verifying that the network configuration and management agents are correct and by confirming that resources are available on each ESXi host.

  • An ESXi host shows as "not responding" or "disconnected" in vCenter.
  • In the vCenter UI --> ESXi host --> Monitor --> Events, the message "Cannot synchronize host <fqdn>" is seen.
  • Cannot connect the ESXi host to vCenter Server.
  • Virtual machines on the ESXi host show as greyed out in vCenter.
  • When toggling between the Direct Console User Interface (DCUI) and the banner screen (for more information, refer to Accessing DCUI/Console of ESXi using ALT+F Keys), real-time hostd agent errors such as the following are displayed: "hostd detected to be non-responsive"

  • Attempting to add an ESXi host to vCenter fails with the following error:

    Unable to access the specified host, either it doesn't exist, the server software is not responding, or there is a network problem
     
  • The /var/log/vmware/vpxd/vpxd.log file on the vCenter Server contains entries similar to:

<YYYY-MM-DD>T<time> [08128 info 'vpxdvpxdMoHost' opID=########-########-##] [HostMo] host connection state changed to [DISCONNECTED] for host-ID
<YYYY-MM-DD>T<time> [04944 error 'vpxdvpxdInvtHostCnx' opID=HB-host-ID@####-########] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-ID, marking host as notResponding
<YYYY-MM-DD>T<time> [00812 error 'vpxdvpxdInvtHostCnx' opID=HB-host-ID@####-########] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-ID, marking host as notResponding


For more information, refer to Location of vCenter Server log files.

  • Unable to ping the default gateway of the ESXi host from the DCUI when the ESXi host is in a disconnected state.
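
The vpxd.log entries quoted above can also be counted programmatically when the log is large. The following is a minimal Python sketch, not an official Broadcom tool. The function names summarize_vpxd_lines and summarize_vpxd_log are hypothetical, and the patterns assume that the log lines match the samples shown in this article.

```python
import re
from collections import Counter

# Patterns derived from the sample vpxd.log lines quoted in this article.
# Assumption: real log lines use the same wording as those samples.
STATE_CHANGE = re.compile(
    r"host connection state changed to \[(\w+)\] for (\S+)"
)
NOT_RESPONDING = re.compile(
    r"FixNotRespondingHost failed for host (\S+?), marking host as notResponding"
)


def summarize_vpxd_lines(lines):
    """Count DISCONNECTED and notResponding events per host ID.

    Returns a tuple of two Counters: (disconnected, not_responding).
    """
    disconnected = Counter()
    not_responding = Counter()
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match:
            # Only the DISCONNECTED state is counted; other states are ignored.
            if match.group(1) == "DISCONNECTED":
                disconnected[match.group(2)] += 1
            continue
        match = NOT_RESPONDING.search(line)
        if match:
            not_responding[match.group(1)] += 1
    return disconnected, not_responding


def summarize_vpxd_log(path="/var/log/vmware/vpxd/vpxd.log"):
    """Read the log at the path named in this article and summarize it."""
    with open(path, errors="replace") as handle:
        return summarize_vpxd_lines(handle)
```

Calling summarize_vpxd_log() where the log file is readable returns two counters keyed by host ID. A host with repeated notResponding entries is a candidate for the troubleshooting steps in the Resolution section.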

Environment

  • vCenter 7.x
  • vCenter 8.x
  • vCenter 9.x
  • ESXi 7.x
  • ESXi 8.x
  • ESX 9.x

Cause

There can be multiple causes of an ESXi host being in a Not Responding or Disconnected state.

For example:

  • The hostd agent itself is unresponsive. (The hostd agent can become unresponsive due to multiple underlying factors, which can be investigated further through the ESXi logs.)
  • Network issues (ping loss, gateway reachability issues, LACP (LAG) configuration issues, management network connectivity issues).
  • Storage issues.

Resolution

Important: Before proceeding, review the KB article Understanding the difference between "Not Responding" and "Disconnected" ESXi hosts in VMware vCenter Server.

Because there are a number of reasons why an ESXi host can reach a “Not Responding” state, Broadcom strongly recommends that you:

  • Validate each troubleshooting step below.
  • Follow the instructions or the linked Knowledge Base article in each step to eliminate possible causes and take corrective action to resolve the issue.

Note: The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. After each step is completed, try to reconnect the ESXi host back to vCenter Server. Do not skip any step.

  1. Verify that the ESXi host is accessible from vCenter or vSphere Client. For more information, see ESXi hosts do not respond and is grayed out.

  2. Verify that the ESXi host can be reconnected, or if reconnecting the ESXi host resolves the issue. For more information, see Changing an ESXi host's connection status in vCenter Server / unable to reconnect ESXi host back to vCenter Server.

  3. Verify that the ESXi host is able to respond back to vCenter at the correct IP address. If vCenter Server does not receive heartbeats from the ESXi host, the host goes into a not responding state. To verify that the correct managed IP address is set, see Verifying the vCenter Server Managed IP Address. See also ESXi host disconnects from vCenter Server after adding or connecting it to the inventory and ESXi host disconnects intermittently from vCenter Server.

  4. Perform packet captures while running continuous pings to the default gateway of the ESXi host from the DCUI, and identify where in the network the packets are dropped. For more information, see Packet capture on ESXi using the pktcap-uw tool.

  5. If the packet captures from step 4 show that packets are being dropped in the physical network, engage the physical network team or administrator to resolve the issue.

  6. Verify that network connectivity exists from vCenter to the ESXi host with the IP and FQDN. For more information, see Testing network connectivity with the ping command.

  7. Verify connectivity between vCenter and the ESXi host, in both directions, on TCP/UDP ports 902 and 443. For more information, see Testing port connectivity with Telnet.

  8. Verify if restarting the ESXi Management Agents resolves the issue. For more information, see Restarting Management Agents in ESXi.

  9. Verify if the hostd process has stopped responding on the affected ESXi host. For more information, see Troubleshooting the hostd service if it fails or stops responding on an ESXi host.

  10. Verify if the ESXi host has experienced a Purple Diagnostic Screen. For more information, see Interpreting a host purple diagnostic screen.

  11. ESXi hosts can disconnect from vCenter due to underlying storage issues. For more information, see Identifying Fibre Channel, iSCSI, and NFS storage issues on ESXi hosts.

  12. If the ESXi host reports "Cannot contact the specified host hostname\IP", see The host appears as disconnected in vCenter.

  13. Verify the LACP (LAG) configuration in vCenter for the affected ESXi host. For more information, see Configuring a LAG on a vSphere Distributed Switch Port Group when using LACP.

  14. Verify that vpxa.cfg lists the vCenter Server IP address. For more information, see vSphere ESXi 7.0 U3 and later version VPXA configuration properties.

  15. Verify that there is no MTU mismatch along the path from the management VMkernel (vmk) interface to vCenter, including the physical network.

  16. Verify that the ESXi version is not higher than the vCenter version.

  17. Verify that no duplicate IP address is reported in vobd.log.
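
The TCP part of the port check in step 7 can be scripted. The following Python sketch is an illustration only: it uses a plain TCP connection attempt in place of the Telnet test described in the linked article, and it does not test UDP 902. The function names tcp_port_open and check_host and the hostname esxi01.example.com are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_host(host, ports=(443, 902), timeout=3.0):
    """Check each TCP port and return a dict mapping port -> reachable."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example call (hypothetical hostname; run from the machine whose
# connectivity to the ESXi host is being tested):
#   print(check_host("esxi01.example.com"))
```

A False result for a port only shows that the TCP connection could not be established from the machine running the script. It does not identify whether a firewall, routing, or the host itself is the cause, so the remaining steps still apply.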

Additional Information

For similar or related issues with ESXi hosts not responding, or for other troubleshooting steps, see:


ESXi host disconnects intermittently from vCenter Server
Verifying the VMware vCenter Server Managed IP Address
TCP and UDP Ports required to access VMware vCenter Server, VMware ESXi and ESX hosts, and other network components
Setting link state up or down for a vmnic interface on ESXi
Unexpected ESXi Reboot or Shutdown
ESXi host disconnects from vCenter Server after adding or connecting it to the inventory


Impact/Risks:
Refer to each section to understand the impact and risks for those relevant actions.