The VMware ESXi host becomes unresponsive: it cannot be managed through the Host Client GUI and is inaccessible from vCenter Server. Virtual machines residing on the host show as disconnected.
The hostd.log reports severe storage access latency and thread exhaustion:
warning hostd[2102729] [Originator@6876 sub=IoTracker] In thread 2100738, access("/vmfs/volumes/xxxxxxxxxxxxxxx/catalog") took over 4297 sec.
warning hostd[100366688] [Originator@6876 sub=IoTracker] In thread 2102609, fopen("/vmfs/volumes/xxxxxxxxxxxxxxxxx/XXXXXXX.XXX/abcd.vmx") took over 227 sec.
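To gauge how widespread the stalls are, the IoTracker warnings can be pulled out of a copy of hostd.log and ranked by duration. The following is a minimal sketch, not a supported tool: the regular expression is modeled only on the two sample lines above, and the function name slow_io_events and the 60-second threshold are assumptions chosen for illustration.

```python
import re

# Pattern modeled on the two hostd.log samples in this article (assumption:
# other IoTracker message variants may exist and are not matched here).
IOTRACKER_RE = re.compile(
    r'sub=IoTracker\]\s+In thread (?P<thread>\d+), '
    r'(?P<call>\w+)\("(?P<path>[^"]*)"\) took over (?P<seconds>\d+) sec'
)


def slow_io_events(lines, threshold_sec=60):
    """Return (thread, call, path, seconds) tuples, slowest first.

    threshold_sec is a hypothetical cut-off, not a VMware-defined value.
    """
    events = []
    for line in lines:
        match = IOTRACKER_RE.search(line)
        if match and int(match.group("seconds")) >= threshold_sec:
            events.append((
                int(match.group("thread")),
                match.group("call"),
                match.group("path"),
                int(match.group("seconds")),
            ))
    return sorted(events, key=lambda event: event[3], reverse=True)


# The two log lines quoted in this article, used as sample input.
sample = [
    'warning hostd[2102729] [Originator@6876 sub=IoTracker] In thread 2100738, '
    'access("/vmfs/volumes/xxxxxxxxxxxxxxx/catalog") took over 4297 sec.',
    'warning hostd[100366688] [Originator@6876 sub=IoTracker] In thread 2102609, '
    'fopen("/vmfs/volumes/xxxxxxxxxxxxxxxxx/XXXXXXX.XXX/abcd.vmx") took over 227 sec.',
]

for thread, call, path, seconds in slow_io_events(sample):
    print(f"{seconds:>6}s  {call}()  thread {thread}  {path}")
```

Many distinct threads stalled on paths under the same /vmfs/volumes entry would point at one datastore, while stalls spread across several datastores suggest a problem at the adapter or fabric level.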
The vmkwarning.log reports FCoE fabric discovery and VLAN timeouts for the storage adapters:
cpu62:2098688)WARNING: ql_fcoe:vmhba67:FipVlanTimeoutWork:254: FIP VLAN Max Retries reached, cur vlan and pri: 1002
cpu53:2098725)WARNING: ql_fcoe:vmhba69:FipDiscoveryTimeoutWork:126: Max retry exhausted for Fabric = MAC_ADDRESSES vlan_id = 0x3ea
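The two FCoE warnings report the VLAN in different notations: 1002 in decimal and 0x3ea in hexadecimal, which are the same value. The sketch below groups FIP timeout warnings by adapter and normalizes the VLAN ID so that affected adapters and VLANs can be compared with the switch configuration. It assumes the message layout of the two sample lines above and treats the "cur vlan and pri" value as a plain decimal VLAN ID; the function name fip_timeouts_by_adapter is hypothetical.

```python
import re
from collections import defaultdict

# Patterns modeled on the two vmkwarning.log samples in this article
# (assumption: other ql_fcoe message formats are not handled).
FCOE_RE = re.compile(
    r'WARNING: ql_fcoe:(?P<hba>vmhba\d+):(?P<func>Fip\w+):\d+: (?P<msg>.*)'
)
HEX_VLAN_RE = re.compile(r'vlan_id = (0x[0-9a-fA-F]+)')
DEC_VLAN_RE = re.compile(r'cur vlan and pri: (\d+)')


def fip_timeouts_by_adapter(lines):
    """Map each adapter name to the set of VLAN IDs named in FIP timeouts."""
    vlans = defaultdict(set)
    for line in lines:
        match = FCOE_RE.search(line)
        if not match:
            continue
        message = match.group("msg")
        hex_match = HEX_VLAN_RE.search(message)
        dec_match = DEC_VLAN_RE.search(message)
        if hex_match:
            # Hexadecimal form, for example 0x3ea
            vlans[match.group("hba")].add(int(hex_match.group(1), 16))
        elif dec_match:
            # Assumption: the value is a plain decimal VLAN ID
            vlans[match.group("hba")].add(int(dec_match.group(1)))
    return dict(vlans)


# The two log lines quoted in this article, used as sample input.
sample = [
    'cpu62:2098688)WARNING: ql_fcoe:vmhba67:FipVlanTimeoutWork:254: '
    'FIP VLAN Max Retries reached, cur vlan and pri: 1002',
    'cpu53:2098725)WARNING: ql_fcoe:vmhba69:FipDiscoveryTimeoutWork:126: '
    'Max retry exhausted for Fabric = MAC_ADDRESSES vlan_id = 0x3ea',
]

for hba, vlan_ids in sorted(fip_timeouts_by_adapter(sample).items()):
    print(hba, sorted(vlan_ids))
```

For the sample lines, both vmhba67 and vmhba69 resolve to VLAN 1002. When every FCoE adapter fails on the same VLAN, the upstream fabric is a more likely suspect than a single adapter.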
VMware ESXi 8.0.x
VMware vCenter Server 8.x
VMware ESXi 7.0 Update 3l, build 21424296
VMware vCenter Server 7.0 Update 3l, build 21477706
The primary cause of the host hang and unresponsiveness is severe storage latency and connectivity loss on the underlying FCoE storage adapters. This causes hostd threads to block indefinitely while waiting on storage I/O, leading to management plane resource exhaustion and host disconnection.
Investigate the physical FCoE SAN fabric, upstream switches, and connected storage arrays for network congestion, link failures, or fabric misconfigurations.
Verify the health, firmware, and driver compatibility of the ESXi host's FCoE storage adapters.
Restore reliable active communication paths to the VMFS datastores from the storage array.
Once physical storage connectivity is restored and latency returns to normal, restart the ESXi management agents from the ESXi Shell or an SSH session (/etc/init.d/hostd restart and /etc/init.d/vpxa restart). This clears any remaining hung threads and allows the host to reconnect to vCenter Server.
For issues specifically related to the host returning an "Error: 503 Service Unavailable" immediately following a host certificate update, please refer to https://knowledge.broadcom.com/external/article/376026/access-esxi-host-client-with-error-503-s.html