The hostd service intermittently becomes unresponsive

Article ID: 318488

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • ESXi hosts show as Not Responding in vCenter.
  • The hostd service intermittently becomes unresponsive.
  • The vmkernel.log file contains alerts such as the following:

   YYYY-MM-DDTHH:MM:SS.Z cpu2:2179102)ALERT: hostd detected to be non-responsive
   YYYY-MM-DDTHH:MM:SS.Z cpu7:2098766 opID=b3d57165)FS3J: 3146: Aborting txn (0x430aa50d2890) callerID: 0xc1d00002 due to failure pre-committing: Optimistic lock acquired by another host.  
   YYYY-MM-DDTHH:MM:SS.Z cpu7:2097782)DVFilter: 6054: Checking disconnected filters for timeouts
   YYYY-MM-DDTHH:MM:SS.Z cpu6:2202506)DLX: 4330: vol 'datastore', lock at 188628992: [Req mode 1] Checking liveness:
   YYYY-MM-DDTHH:MM:SS.Z cpu6:2202506)[type 10c00002 offset 188628992 v 4266, hb offset 3346432 gen 7313, mode 1, owner 5fa01ff9-a25c7506-a069-00108682abde mtime 544318 num 0 gblnum 0 gblgen 0 gblbrk 0]


Note: The preceding log excerpts are only examples. Date, time and environmental variables may vary depending on your environment.
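To check a host for the first alert shown above, the log can be scanned for the fixed part of the message. The following is a minimal sketch, not a supported tool: the function name find_hostd_alerts is hypothetical, and the match string is taken from the example excerpt, so it assumes the alert wording is the same in your environment.

```python
import re

# Fixed part of the alert, copied from the example excerpt above.
# The timestamp and cpu/world ID fields vary, so they are not matched.
ALERT_RE = re.compile(r"ALERT: hostd detected to be non-responsive")


def find_hostd_alerts(lines):
    """Return (line_number, text) pairs for lines that contain the hostd alert."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ALERT_RE.search(line)
    ]
```

A possible use is to open a copy of vmkernel.log (the path /var/log/vmkernel.log is an assumption about the host's log location) and pass the file object to find_hostd_alerts, then review the reported line numbers together with the surrounding FS3J and DLX messages.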

Environment

VMware vSphere ESXi 7.0.0
VMware vSphere ESXi 6.7

Cause

In rare cases, a race condition between multiple threads, where one attempts to create a file in a directory while another attempts to remove that same directory, might cause a deadlock that makes the hostd service fail. Such a deadlock might affect other services as well, but the race condition window is small and the issue is infrequent.

Resolution

This issue has been resolved in the following releases:

Workaround:

Use either of the following workarounds:
  • Check whether there are any stale dvport files. For example, in vCenter the vDS may contain only 100 ports, but there may be 200 dvport files under the .dvsData/DVS UUID/ directory.
  • If there are stale dvport files, unregister from vCenter the virtual machines that reside on that datastore, then delete the .dvsData folder. The files will be regenerated after 5 minutes.

    Or

    Move the ESXi hosts to a different vDS.
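The stale-file check in the first workaround amounts to comparing a file count with the port count shown in vCenter. The following is a rough sketch under stated assumptions: the function names are hypothetical, each regular file directly inside the .dvsData/DVS UUID/ directory is assumed to be one dvport file, and the expected port count is read manually from vCenter rather than queried.

```python
import os


def count_dvport_files(dvs_dir):
    """Count regular files directly inside one .dvsData/<DVS UUID>/ directory.

    Assumption: every regular file in this directory is a dvport file.
    """
    return sum(1 for entry in os.scandir(dvs_dir) if entry.is_file())


def stale_dvport_estimate(dvs_dir, expected_ports):
    """Return how many files exceed the port count reported by vCenter.

    A result of 0 means the directory holds no more files than the vDS has
    ports; it does not prove that no individual file is stale.
    """
    return max(0, count_dvport_files(dvs_dir) - expected_ports)
```

With the figures from the example above, a directory holding 200 files checked against 100 expected ports would give an estimate of 100. The sketch only reads the directory and never deletes anything; removing the .dvsData folder remains a manual step, carried out after the virtual machines are unregistered.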

Additional Information

For more information, refer to "When removing a VMFS, is it safe to remove the .dvsData folder and subfolders?"



Refer also to vDS config location and HA blog

Impact/Risks:
The hostd service recovers only after the ESXi host is restarted.