ESXi host is non-responsive and disconnected in vCenter

Article ID: 340041

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

This article describes how to troubleshoot an unresponsive hostd service on an ESXi host.

Symptoms:
  • hostd is non-responsive and the host is disconnected in vCenter
  • Host status in vCenter Server is not responding
  • In the vmkernel.log file, you see entries similar to:
2017-07-23T19:10:39.467Z info hostd[DF80B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 727 :  Lost access to volume 595a73d1-1e8d8173-6a8f-a0369f2eef38 (DX100#02_SSD-01) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
2017-07-23T19:10:39.468Z info hostd[D3C2B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 728 : Lost access to volume 595ab527-f3df823e-f747-a0369f2eef38 (DX100#02_SAS-02) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly. 
  • In the hostd.log file, you see entries similar to:
hostd.0
==================
2017-08-02T18:55:49.026Z info hostd[BA81B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 526 : Lost access to volume 597991cd-2ae53f79-db0f-b49691017b10 (DX100#02_SAS-02) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
2017-08-02T18:55:49.026Z info hostd[BA81B70] [Originator@6876 sub=Hostsvc.VmkVprobSource] VmkVprobSource::Post event: (vim.event.EventEx) {
--> key = 0,
--> chainId = 0,
--> createdTime = "1970-01-01T00:00:00Z",
--> userName = "",
--> datacenter = (vim.event.DatacenterEventArgument) null,
--> computeResource = (vim.event.ComputeResourceEventArgument) null,
--> host = (vim.event.HostEventArgument) {
--> name = "bhsesx01",
--> host = 'vim.HostSystem:ha-host'
  • In the vobd.log file, you see entries similar to:
==================
2017-07-30T02:24:58.236Z: [scsiCorrelator] 420484207165us: [esx.problem.storage.redundancy.lost] Lost path redundancy to storage device naa.600000e00d29000000291e5b00000000. Path vmhba1:C0:T0:L0 is down. Affected datastores: "DX100#01_SSD-01".
  • Other issues can also cause hostd to become unresponsive
The scope of this document is limited to troubleshooting hostd unresponsiveness.
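
The log symptoms listed above can be checked quickly from the ESXi shell. The following is a minimal sketch, not an official procedure: the function names `count_symptoms` and `lost_volumes` are hypothetical, the search patterns are taken only from the sample entries in this article, and it assumes that `grep`, `sed` and `sort` are available in the host shell.

```shell
#!/bin/sh
# Hypothetical helpers; patterns are based only on the sample log entries above.

# Print the number of lines in a log file that match the storage symptoms.
# Usage: count_symptoms <logfile>
count_symptoms() {
    logfile="$1"
    # -c prints the count of matching lines, -E enables the | alternation.
    grep -c -E 'Lost access to volume|esx\.problem\.storage\.redundancy\.lost' "$logfile"
}

# Print the unique datastore labels named in "Lost access to volume" events.
# Usage: lost_volumes <logfile>
lost_volumes() {
    logfile="$1"
    # The label is the parenthesised text that follows the volume UUID.
    sed -n 's/.*Lost access to volume [^ ]* (\([^)]*\)).*/\1/p' "$logfile" | sort -u
}
```

For example, `count_symptoms /var/log/vobd.log` and `lost_volumes /var/log/hostd.log` would summarize the events on a host that keeps its logs in the default location.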


Environment

VMware vSphere ESXi 6.5
VMware vSphere ESXi 6.0
VMware vSphere ESXi 6.7

Cause

This situation is usually caused by services on the host that are not running properly. While there can be other reasons, unresponsive host services are frequently caused by the host running out of available resources. An ESXi host will shut down resource-intensive services such as 'hostd' rather than take resources away from running VMs that need them.

vCenter cannot communicate with the host both because these services are not running and because a lagging host without enough available resources cannot function properly. The result is a host that fails to respond to vCenter and is typically slow and difficult to access or manage in other ways. In these cases, the host's storage is still fully functional within the datastore and its VMs will usually continue running almost normally for a period of time.

 

Resolution


Workaround:
  1. Detect the non-responsive hostd:
    1. Check the vmkernel* log for the alert message reporting that hostd was detected to be non-responsive.
    2. Check the host-probe* logs for timeout messages, and check whether the hostd log has stopped being updated.
NOTE: Restarting the management agents may impact any tasks that are running on the ESXi host at the time of the restart.
Check for any storage issues before restarting the host daemon (the hostd service) or running services.sh.

Refer to Restarting the Management agents in ESXi (1003490)
  2. Stop hostd using the service or hostd command. For more information, see:
  3. Alternatively, if you have command-line access to the host through PuTTY or the DCUI shell but cannot access the VMs directly for some reason, shut down the VMs from the command line. See Unable to Power off a Virtual Machine in an ESXi host.
    Command to see if a VM is running on an ESXi host and get the World ID:
    # localcli vm process list
    Command to shut a VM down:
    # localcli vm process kill -t soft -w <worldID>
    *Using 'soft', as above, is the most graceful shutdown. If that doesn't work, use 'hard' instead to perform an immediate shutdown. The option 'force' should be used as a last resort.
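
The two commands in step 3 can be combined into a small helper. This is only a sketch under stated assumptions: the function names `list_world_ids` and `stop_vm` are hypothetical, the parser assumes that the process list prints indented `World ID:` and `Display Name:` lines for each VM (with the World ID first), and the 30-second wait is an arbitrary default. The `force` type is deliberately left as a manual decision.

```shell
#!/bin/sh
# Hypothetical helpers; the field layout of the process list is an assumption.

# Read `localcli vm process list` output on stdin and print
# "<worldID> <display name>" for each VM.
list_world_ids() {
    awk '
        /^[[:space:]]*World ID:/     { wid = $3 }
        /^[[:space:]]*Display Name:/ {
            sub(/^[[:space:]]*Display Name:[[:space:]]*/, "")
            print wid, $0
        }
    '
}

# Try a soft kill first, then a hard kill if the VM is still listed.
# Returns 0 once the World ID is gone, 1 if it survives both attempts.
stop_vm() {
    wid="$1"
    for kill_type in soft hard; do
        localcli vm process kill -t "$kill_type" -w "$wid"
        # Give the VM time to stop (WAIT_SECONDS is an assumed tunable).
        sleep "${WAIT_SECONDS:-30}"
        if ! localcli vm process list | list_world_ids | grep -q "^$wid "; then
            return 0
        fi
    done
    return 1
}
```

A typical use would be `localcli vm process list | list_world_ids` to find the World ID, followed by `stop_vm <worldID>` for the VM that needs to be shut down.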

NOTE: It is important that any underlying storage issue is fixed so that the hostd service can respond properly.
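
Step 1 of the workaround mentions checking whether the hostd log is still being updated. One way to sketch that check is to test the age of the log file. The function name `log_is_stale` and the 10-minute default threshold are hypothetical, and the sketch assumes that the `find` in the host shell supports `-mmin`.

```shell
#!/bin/sh
# Hypothetical helper; assumes `find` supports the -mmin test.

# Succeed (exit 0) if the file was last modified more than N minutes ago.
# Usage: log_is_stale <logfile> [max_age_minutes]
log_is_stale() {
    logfile="$1"
    max_age_min="${2:-10}"
    # find prints the path only when the file is older than the threshold.
    [ -n "$(find "$logfile" -mmin +"$max_age_min" 2>/dev/null)" ]
}
```

For example, `log_is_stale /var/log/hostd.log 10 && echo "hostd.log has not been updated recently"` flags a log that has gone quiet. A quiet log is only a hint and should be read together with the other symptoms.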

Additional Information

For more information, see:
For more information on avoiding common storage related issues, see