Troubleshooting the hostd service if it fails or stops responding on an ESXi host

Article ID: 316598

Products

VMware vSphere ESXi

Issue/Introduction

  • vpxd.log

    Authd error: 514 Error connecting to hostd-vmdb service instance.
    Failed to connect to host :902. Check that authd is running correctly (lib/connect error 11).
     
  • hostd.log

    2014-06-27T19:57:41.000Z [282DFB70 info 'Vimsvc.ha-eventmgr'] Event 8002 : Issue detected on esxi.example.com in ha-datacenter: hostd detected to be non-responsive
     
  • vCenter Server errors

    Unable to access the specified host. It either does not exist, the server software is not responding, or there is a network problem.
     
  • When attempting to add or reconnect the host to vCenter Server, the following error may appear:

    VMware Infrastructure Client could not establish the initial connection with server your server. Details: A connection failure occurred.
     
  • When attempting to connect directly to the ESXi host, the following error may appear:

    Unable to access the specified host. It does not exist, the server software is not responding, or there is a network problem.

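The symptom strings above can be searched for mechanically. The following is a minimal sketch, not an official tool: the function name `count_hostd_symptoms` is hypothetical, and it only counts lines on standard input that contain one of the messages quoted in this article.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that contain one of the symptom
# strings quoted in this article. The log is read from standard input.
count_hostd_symptoms() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so "|| true" keeps the function's status at 0.
    grep -c -E 'hostd detected to be non-responsive|Error connecting to hostd-vmdb service instance|lib/connect error 11' || true
}
```

On an ESXi host this could be run as `count_hostd_symptoms < /var/log/hostd.log`, assuming the hostd log is in its usual location. The vpxd.log messages are written on the vCenter Server system, not on the ESXi host.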


Environment

VMware vSphere ESXi

Resolution

The hostd service is the main management agent on an ESXi host: vCenter Server and other management clients communicate with the VMkernel through it. If the hostd service fails, the ESXi host disconnects from vCenter Server, and attempts to connect to the ESXi host directly also fail.

To resolve this issue, work through the troubleshooting steps below and confirm that each one applies to your environment. Each step provides instructions, or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. After each step, attempt to restart the management agents. Be sure to complete all of the steps.

Note: For information on restarting ESXi host services, see Restarting the Management agents in ESXi.

When the hostd service fails to respond

  1. Verify network connectivity to the ESXi host. For more information, see Testing network connectivity with the ping command.
     
  2. Verify that the hostd service is running. 

    /etc/init.d/hostd status
     
  3. Verify that port 80 or port 443 is open by running this command:

    esxcli network ip connection list

    For more information, see Determining if a port is in use by an application or process on a virtual machine.
     
  4. Verify that the /etc/hosts file is written correctly and has entries similar to:

    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1 <localhost>.<localdomain> <localhost>
    10.0.0.1 <server>.<domain> <server>

     
  5. Verify the ESXi host disk partitions have available disk space. If either / or /var/log is full, then hostd cannot start because it is trying to write information to a full disk. For more information on disk space usage on the ESXi host, see Investigating disk space on an ESX or ESXi host.
     
  6. Verify that there is SAN connectivity and that SAN storage has been properly added or removed by running one of these commands:

    ls /vmfs/volumes

    or

    vdf -h

    If the commands take a very long time to complete or report an error, see Identifying shared storage issues with ESX or ESXi.
     
     
  7. Verify that CPU usage is below 90% by running this command:

    esxtop

    For more information regarding esxtop, see Using the esxtop Utility.
     
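The port check in step 3 produces a long connection listing. The following is a minimal sketch for filtering it, assuming that listening sockets are shown with the state LISTEN and a local address in address:port form. The function name `port_is_listening` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the connection listing on standard input
# contains a LISTEN entry for the given TCP port.
# Assumption: each listening socket appears on one line that contains the
# word LISTEN and a local address ending in ":<port>".
port_is_listening() {
    # $1 = TCP port number to look for
    grep 'LISTEN' | grep -q -E ":${1}([[:space:]]|\$)"
}
```

For example, `esxcli network ip connection list | port_is_listening 443 && echo "port 443 is listening"` prints the message only when a matching entry is found.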

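The disk space check in step 5 can be narrowed to the entries that are nearly full. The following is a sketch under stated assumptions: the listing is df-style, with one entry per line and a usage field such as 95%. The function name `full_filesystems` and the threshold argument are hypothetical, and awk is assumed to be available in the shell.

```shell
#!/bin/sh
# Hypothetical helper: print the first column and the usage percentage of
# every entry in a df-style listing (standard input) whose usage is at or
# above the given threshold.
full_filesystems() {
    # $1 = threshold in percent, for example 90
    awk -v limit="$1" '
        {
            for (i = 1; i <= NF; i++) {
                # A usage field looks like "95%"; header text such as
                # "Use%" does not match this pattern.
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print $1, $i
                }
            }
        }'
}
```

For example, `vdf -h | full_filesystems 90` lists only the entries that are at least 90% full. No output means nothing reached the threshold.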
If additional assistance is required for any of the above steps, file a support request with VMware Support and note this KB Article ID (316598) in the problem description.

When the hostd service fails to start

If the hostd service fails to start, perform these troubleshooting steps:
  1. Check for failed Network File System (NFS) or Server Message Block (SMB) mounts on the ESXi host. If there are failed NFS or SMB mounts, disable or remove the mounts and restart the host's management services.
      
  2. Check for corruption of virtual machine configuration files. For more information, see Re-registering orphaned virtual machines.
     
  3. Check for corruption of the /etc/vmware/hostd/config.xml by looking for blank hostd logs.
     
  4. Run these commands to check the status of the hostd service and then restart it:

    /etc/init.d/hostd status
    /etc/init.d/hostd stop
    /etc/init.d/hostd start

    If a third-party monitoring application is using port 9080, you may see these error messages:

    ['Solo' 3076436096 info] Micro web server port: 9080
    ['App' 3076436096 panic] Application error: Address already in use
    ['App' 3076436096 panic] Backtrace generated
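
To confirm a port conflict of this kind, the port number can be pulled out of the log. The following is a minimal sketch, assuming the log lines have the form shown above. The function name `conflicting_port` is hypothetical, and awk is assumed to be available in the shell.

```shell
#!/bin/sh
# Hypothetical helper: read a hostd log on standard input and print the
# micro web server port that was reported before the first
# "Address already in use" panic. Prints nothing if no such panic is found.
conflicting_port() {
    awk '
        # Remember the most recently reported port (last field on the line).
        /Micro web server port:/ { port = $NF }
        # On the first matching panic, print that port and stop reading.
        /Application error: Address already in use/ {
            if (port != "") print port
            exit
        }'
}
```

For example, `conflicting_port < /var/log/hostd.log` would print 9080 for the messages above, assuming the hostd log is in its usual location. The process holding the port can then be looked up in the output of `esxcli network ip connection list`.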