In a VMware vSphere environment, one or more ESXi hosts unexpectedly drop into a Not Responding or Disconnected state within the vCenter Server inventory. While the management control plane is completely disrupted, virtual machines residing on the affected host continue running normally on the data plane with zero workload interruption.
The following symptoms are observed:
Reviewing the /var/run/log/hostd.log on the affected ESXi host reveals a cascade of high-frequency failed login attempts immediately followed by management heap exhaustion errors:
Hostd[XXXXXXXX]: [Originator@6876 sub=AdapterServer opID=HB-SpecSync-host-<HOST_ID>@0-xxxxxxxx-xx-xxxx sid=xxxxxxxx user=vpxuser:<no user>] AdapterServer caught exception; <<xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : xxxxx'>>, ha-root-pool, vim.ResourcePool.updateChildResourceConfiguration, <vim.version.v8_0_3_0, internal, 8.0.3.0>, [N11HostdCommon18VmomiAdapterServer19ActivationResponderE:0x000000c6c258af98]>, N7Hostsvc34HaInsufficientMemoryResourcesFault9ExceptionE(Fault cause: vim.fault.InsufficientMemoryResourcesFault
Hostd[XXXXXXXX]: --> (vim.fault.InsufficientMemoryResourcesFault) {Hostd[XXXXXXXX]: --> message = "Group vm.<RESOURCE_ID>: Requested memory minLimit ##### KB insufficient to support effective reservation ##### KB
Hostd[XXXXXXXX]: [Originator@6876 sub=Solo.Vmomi] Arg userName: Db(167) Hostd[XXXXXXXX]: --> "<SERVICE_ACCOUNT>@<DOMAIN>.local" Db(167) Hostd[XXXXXXXX]: [Originator@6876 sub=Solo.Vmomi] Arg password: Db(167) Hostd[XXXXXXXX]: --> (not shown)
Db(167) Hostd[XXXXXXXX]: --> Db(167) Hostd[XXXXXXXX]: [Originator@6876 sub=Solo.Vmomi] Arg locale Db(167) Hostd[XXXXXXXX]: --> (null)In(166) Hostd[XXXXXXXX]: [Originator@6876 sub=Solo.Vmomi] Throw vim.fault.InvalidLogin
This issue is caused by a cascading resource starvation event within the primary host management daemon (hostd). The failure mechanism develops through two compounding factors:
Ultimately, hostd becomes entirely unresponsive to external network sockets, causing vCenter Server to lose its heartbeat connection and report the host as Not Responding.
To resolve this issue and permanently restore control plane stability, you must isolate and stop the rogue external authentication stream before flushing the exhausted hypervisor management heap pools.
Once the external traffic storm has been stopped, clear the leaked thread handles and memory spaces on the affected ESXi host by cycling the local management agents.
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
Operational Impact Notice: This process safely flushes all hung connection tracking states from the active system memory plane. The ESXi host will briefly disconnect from the vCenter Server inventory for less than 60 seconds before reconnecting automatically. This operation impacts only the management plane; it does not interrupt, restart, or degrade production virtual machine workloads or virtual networks.
/var/run/log/hostd.log for several minutes to verify that new fault.InvalidLogin occurrences have completely ceased.