ESXi running on patch ESXi670-202004002 becomes non-responsive on the vCenter

Article ID: 317948

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

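An ESXi host running patch ESXi670-202004002 repeatedly becomes non-responsive in vCenter. This article explains the cause and how to keep the ESXi host connected in vCenter.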

Symptoms:

  • The ESXi host becomes non-responsive in vCenter. Restarting the management agents makes it responsive, but it becomes unresponsive again within 12 hours.
  • This occurs on an ESXi host running patch ESXi670-202004002, although other hosts on the same patch are not affected.
  • The vmkernel.log contains entries similar to the following:

yyyy-mm-ddThh:mm:ss cpu65:2146227)MemSchedAdmit: 471: Admission failure in path: hostd-probe/stats/sh/sh.2146228/uw.2146228
yyyy-mm-ddThh:mm:ss cpu65:2146227)MemSchedAdmit: 478: UserWorld 'sh' with cmdline 'unknown'
yyyy-mm-ddThh:mm:ss cpu65:2146227)MemSchedAdmit: 489: uw.2146228 (157415) extraMin/extraFromParent: 511/511, hostd-probe (795) childEmin/eMinLimit: 6829/7168
yyyy-mm-ddThh:mm:ss cpu65:2146227)WARNING: LinuxThread: 423: sh: Error cloning thread: -28 (bad0081)
yyyy-mm-ddThh:mm:ss cpu21:2098169)DVFilter: 6054: Checking disconnected filters for timeouts
yyyy-mm-ddThh:mm:ss cpu30:2146113)ALERT: hostd detected to be non-responsive

  • hostd-probed dump files are generated during periods of unresponsiveness, for example:

      hostd-probed-2135463-
      hostd-probed-2153290-
      hostd-probed-2165244-
      hostd-probed-2099914-


Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
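
As a rough aid for checking a host against the symptoms above, the following Python sketch counts the three signature messages from the example excerpt. The pattern names and the `count_signatures` helper are hypothetical and are not part of any VMware tool; the regular expressions are derived only from the sample lines in this article, so real log lines may need looser patterns.

```python
import re

# Patterns derived from the example vmkernel.log entries in this article.
# The dictionary keys are illustrative labels, not VMware terminology.
SIGNATURES = {
    "admission_failure": re.compile(
        r"MemSchedAdmit: \d+: Admission failure in path: hostd-probe/"
    ),
    "clone_error": re.compile(
        r"LinuxThread: \d+: \S+: Error cloning thread: -28"
    ),
    "hostd_unresponsive": re.compile(
        r"ALERT: hostd detected to be non-responsive"
    ),
}


def count_signatures(lines):
    """Return a dict with the number of lines matching each signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Sample input copied from the excerpt above (timestamps are placeholders).
SAMPLE = """\
yyyy-mm-ddThh:mm:ss cpu65:2146227)MemSchedAdmit: 471: Admission failure in path: hostd-probe/stats/sh/sh.2146228/uw.2146228
yyyy-mm-ddThh:mm:ss cpu65:2146227)WARNING: LinuxThread: 423: sh: Error cloning thread: -28 (bad0081)
yyyy-mm-ddThh:mm:ss cpu21:2098169)DVFilter: 6054: Checking disconnected filters for timeouts
yyyy-mm-ddThh:mm:ss cpu30:2146113)ALERT: hostd detected to be non-responsive
""".splitlines()

if __name__ == "__main__":
    print(count_signatures(SAMPLE))
```

To scan a real log, pass an open file object instead of `SAMPLE`, for example a copy of vmkernel.log extracted from a support bundle. Matches only indicate that the symptoms are present; confirming the cause still requires the analysis described under Workaround.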

Environment

VMware vSphere ESXi 6.7

Cause

hostd starts a child process at the same time as it is opening or closing a file, which causes hung child processes to build up and fill the resource pool, resulting in memory admission failures.

Resolution

This is a known issue that is fixed in ESXi 6.7 P04.

Ref:

https://docs.vmware.com/en/VMware-vSphere/6.7/rn/esxi670-202008001.html

Workaround:
If VMware Support's analysis of the hostd-zdump shows hung processes in "IoTrackers", apply the following workaround:

  • Add the following line to the /etc/vmware/hostd/config.xml file:

       <ioTrackers> false </ioTrackers>

Note: This XML element should be put directly under the root <config> element.
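
To illustrate the placement described in the note, here is a minimal Python sketch that adds the element as a direct child of the root <config> element. The `add_io_trackers` helper and the trimmed sample document are hypothetical; the real config.xml contains many more elements. Python's ElementTree parser discards any XML comments, so treat this as a demonstration of where the element belongs, not as a replacement for editing the file by hand after taking a backup.

```python
import xml.etree.ElementTree as ET


def add_io_trackers(xml_text):
    """Return the XML with <ioTrackers> false </ioTrackers> directly under <config>."""
    root = ET.fromstring(xml_text)
    if root.tag != "config":
        raise ValueError("expected <config> as the root element")
    # find() with a plain tag name only looks at direct children of root.
    node = root.find("ioTrackers")
    if node is None:
        node = ET.SubElement(root, "ioTrackers")
    node.text = " false "
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for /etc/vmware/hostd/config.xml.
SAMPLE = "<config><vmacore><threadPool/></vmacore></config>"

if __name__ == "__main__":
    print(add_io_trackers(SAMPLE))
```

The helper is idempotent: running it on output that already contains the element updates the existing entry instead of adding a second one.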


Additional Information

Impact/Risks:
No impact