SSH attempt to unresponsive ESXi 8.0u3 host fails and DCUI errors out with "/bin/dcuiweasel: line #: can't fork: File too large"

Article ID: 382584

Products

VMware vSphere ESX 8.x

Issue/Introduction

  • SSH cannot be enabled on the ESXi host.
  • SSH attempts fail even though SSH has been enabled from the ESXi console.
  • vMotions to or from the ESXi host fail.
  • DCUI login attempts hang with the following error:

/bin/dcuiweasel: line #: can't fork: File too large
/bin/dcuiweasel: line #: can't fork: File too large

  • Commands fail to start, and warnings similar to the following are logged in /var/log/vmkernel.log:

YYYY-MM-DDThh:mm:ss.397Z In(18#) vmkernel: cpu##:95538## opID=c1aae###)World: ##: VC opID m1hkp###-21###-auto-gag-h5:70002###-2#-60-#### maps to vmkernel opID c1aae###
YYYY-MM-DDThh:mm:ss.397Z Wa(18#) vmkwarning: cpu##:95538## opID=c1aae###)WARNING: Sched: vm 160089##: 63##: could not create container group, status: Limit exceeded
YYYY-MM-DDThh:mm:ss.397Z Wa(18#) vmkwarning: cpu##:95538## opID=c1aae###)WARNING: Sched: vm 160089##: 63##: could not create container group, status: Limit exceeded

  • Errors similar to the following appear in /var/run/log/hostd.log:

YYYY-MM-DDThh:mm:ss.208Z Er(163) Hostd[20996##]: [Originator@6876 sub=SysCommandPosix opID=CSMM-domain-c370915-39762-d### sid=52738### user=vpxuser] Failed to ForkExec /usr/lib/vmware/clusterAgent/bin/clusterAdmin: File too large
YYYY-MM-DDThh:mm:ss.443Z Er(163) Hostd[20996##]: [Originator@6876 sub=SysCommandPosix opID=CSMM-domain-c370915-39763-d### sid=52738### user=vpxuser] Failed to ForkExec /usr/lib/vmware/clusterAgent/bin/clusterAdmin: File too large

Note:

  • In certain scenarios, the logs are not accessible via SSH or DCUI.
  • To read the logs in that case, log in to the Host Client, go to Monitor -> Logs, and check for the error messages above.
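
If the logs have been saved from the Host Client to a workstation, the check for these messages can be scripted. The following is a minimal sketch, not a VMware-provided tool: the function names (count_signatures, scan_file) and the signature labels are hypothetical, and the patterns are taken only from the log excerpts in this article, so they may need adjusting if the messages differ in your environment.

```python
import re

# Patterns copied from the error messages quoted in this article.
# The dictionary keys are illustrative labels, not VMware terminology.
SIGNATURES = {
    "container_group_limit": re.compile(
        r"could not create container group, status: Limit exceeded"
    ),
    "forkexec_failed": re.compile(r"Failed to ForkExec \S+: File too large"),
    "dcui_fork_failed": re.compile(r"can't fork: File too large"),
}


def count_signatures(lines):
    """Count how many of the given log lines match each known signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_file(path):
    """Scan a saved log file (for example, a copy of hostd.log)."""
    # errors="replace" avoids failing on stray non-UTF-8 bytes in the log.
    with open(path, "r", errors="replace") as handle:
        return count_signatures(handle)
```

Calling scan_file on a saved copy of vmkernel.log or hostd.log returns a count per signature. Non-zero counts for these messages are consistent with the issue described here, but they are not by themselves proof of it.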

 

Environment

vSphere ESXi 8.0 Update 3

Cause

If attempts to start processes repeatedly exceed the number of processes the system allows, or if processes repeatedly fail to start because of insufficient memory resources, it may become impossible to start any new processes. This can cause the ESXi host to become unresponsive.

Resolution

This issue is fixed in ESXi 8.0u3e, which is available for download from the Broadcom Support Portal.

Workaround:

Reboot the ESXi host to restore responsiveness and SSH access.