The ESXi host generates core dumps with the format:
/var/core/sfcb-intelcim-zdump.XXX (where XXX is a sequence number)
In the /var/log/hostd.log file, you see entries similar to:
2019-08-27T10:44:50.738Z error hostd[2102747] [Originator@6876 sub=Hostsvc.NsxSpecTracker] Object not found/hostspec disabled
2019-08-27T10:45:05.727Z info hostd[2102747] [Originator@6876 sub=Hostsvc.VmkVprobSource] VmkVprobSource::Post event: (vim.event.EventEx) {
--> key = 82,
--> chainId = 357229688,
--> createdTime = "1970-01-01T00:00:00Z",
--> userName = "",
--> datacenter = (vim.event.DatacenterEventArgument) null,
--> computeResource = (vim.event.ComputeResourceEventArgument) null,
--> host = (vim.event.HostEventArgument) {
--> name = "ESX-East2.interntnet.dk",
--> host = 'vim.HostSystem:ha-host'
--> },
--> vm = (vim.event.VmEventArgument) null,
--> ds = (vim.event.DatastoreEventArgument) null,
--> net = (vim.event.NetworkEventArgument) null,
--> dvs = (vim.event.DvsEventArgument) null,
--> fullFormattedMessage = <unset>,
--> changeTag = <unset>,
--> eventTypeId = "esx.problem.application.core.dumped",
--> severity = <unset>,
--> message = <unset>,
--> arguments = (vmodl.KeyAnyValue) [
--> (vmodl.KeyAnyValue) {
--> key = "1",
--> value = "/bin/sfcbd"
--> },
--> (vmodl.KeyAnyValue) {
--> key = "2",
--> value = "429451"
--> },
--> (vmodl.KeyAnyValue) {
--> key = "3",
--> value = "/var/core/sfcb-intelcim-zdump.001"
--> }
--> ],
--> objectId = "ha-host",
--> objectType = "vim.HostSystem",
--> objectName = <unset>,
--> fault = (vmodl.MethodFault) null
--> }
2019-08-27T10:45:05.727Z info hostd[2102747] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 74848 : An application (/bin/sfcbd) running on ESXi host has crashed (429451 time(s) so far). A core file might have been created at /var/core/sfcb-intelcim-zdump.001.
Note: The preceding log excerpts are only examples. Date, time and environmental variables may vary depending on your environment.
Environment
VMware vSphere 6.7.x
Cause
This issue occurs because the nicmgmtd process runs out of memory and is unable to fulfill requests.
Resolution
This issue is resolved in VMware vSphere ESXi 6.7 Patch release ESXi670-202004002, available in My Downloads at support.broadcom.com.
Workaround: If you do not want to upgrade, work around the issue as follows:
Copy the attached script restart-nicmgmtd.sh to a location of your choice and make it executable:
chmod +x restart-nicmgmtd.sh
Note: The attached script periodically monitors the memory usage of nicmgmtd and restarts nicmgmtd when its memory usage nears the memory limit.
Add a cron entry for the script (the required commands are the same as the local.sh entries shown in the Notes below). This makes crond run the restart script once every 5 minutes.
Note: The script takes an optional parameter: the path to a log file in which to record script executions and failures. If no parameter is specified, the script logs to stdout.
Once crond is restarted with the new entry, the script runs periodically and monitors the eminpeak page table value of nicmgmtd. If it hits the predetermined 500k threshold, the script restarts nicmgmtd. This prevents log spew in the vmkernel.log file.
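The attached script itself is not reproduced in this article, but its check-and-restart logic can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached script: the get_nicmgmtd_mem_kb helper and the commented-out restart command are placeholders that the real script replaces with the actual eminpeak query and restart mechanism.

```shell
#!/bin/sh
# Hypothetical sketch of the restart-nicmgmtd.sh logic (NOT the attached script).
LIMIT_KB=500000                 # 500k threshold described above
LOGFILE="${1:-/dev/stdout}"     # optional first parameter: path to a log file

# Placeholder helper: the real script queries nicmgmtd's eminpeak value here.
get_nicmgmtd_mem_kb() {
    echo 0
}

# Succeeds (exit 0) when measured usage has reached the limit.
should_restart() {
    usage_kb=$1
    limit_kb=$2
    [ "$usage_kb" -ge "$limit_kb" ]
}

usage_kb=$(get_nicmgmtd_mem_kb)
if should_restart "$usage_kb" "$LIMIT_KB"; then
    echo "$(date) nicmgmtd at ${usage_kb} kB, restarting" >> "$LOGFILE"
    # Placeholder: the attached script performs the actual nicmgmtd restart here.
fi
```

Because crond invokes the script every 5 minutes, a single threshold check per invocation is sufficient; no internal loop is needed.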
Notes:
The changes made to the file /var/spool/cron/crontabs/root are not persistent across reboots. If you wish these changes to persist across reboots, add the below entries to the local.sh file.
file path: /etc/rc.local.d/local.sh
Below are the entries:
# workaround script to restart nicmgmtd when it hits the memory limit
/bin/kill $(cat /var/run/crond.pid)
/bin/echo '*/5 * * * * /vmfs/volumes/datastore1/nicmgmtd/restart-nicmgmtd.sh /vmfs/volumes/datastore1/nicmgmtd/nicscript.log' >> /var/spool/cron/crontabs/root
/usr/lib/vmware/busybox/bin/busybox crond
The location of the script file and its log file would have to be decided based on your preference.
The local.sh file is executed only when Secure Boot is disabled.
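The echo in the local.sh entries appends unconditionally, so if local.sh were run more than once in the same session the crontab would accumulate duplicate entries. An optional refinement (not part of the original workaround) is to guard the append with a fixed-string grep so it is idempotent:

```shell
#!/bin/sh
# Idempotent append: add the cron entry to the crontab only if it is not
# already present. Optional refinement to the local.sh entries shown above.
add_cron_entry() {
    entry=$1
    crontab=$2
    grep -qF "$entry" "$crontab" 2>/dev/null || echo "$entry" >> "$crontab"
}

# Usage inside local.sh, with the same paths as the example above:
# /bin/kill $(cat /var/run/crond.pid)
# add_cron_entry '*/5 * * * * /vmfs/volumes/datastore1/nicmgmtd/restart-nicmgmtd.sh /vmfs/volumes/datastore1/nicmgmtd/nicscript.log' /var/spool/cron/crontabs/root
# /usr/lib/vmware/busybox/bin/busybox crond
```

grep -F treats the entry as a literal string, which matters here because the cron line contains the characters * and / that grep would otherwise interpret as pattern syntax.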