ESXi 6.7 host becomes unresponsive and is disconnected from the vCenter server.
Article ID: 344746
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms:
The hostd service hangs, causing vCenter to mark the host as disconnected.
syslog.log:
yyyy-mm-ddThh:mm:ssZ lwsmd: [lsass] Failed to run provider specific request (request code = 14, provider = 'lsa-activedirectory-provider') -> error = 40121, symbol = LW_ERROR_DOMAIN_IS_OFFLINE, client pid = 2172342
yyyy-mm-ddThh:mm:ssZ lwsmd: [lsass] Failed to run provider specific request (request code = 14, provider = 'lsa-activedirectory-provider') -> error = 40121, symbol = LW_ERROR_DOMAIN_IS_OFFLINE, client pid = 2172802
yyyy-mm-ddThh:mm:ssZ lwsmd: [lsass] Failed to run provider specific request (request code = 14, provider = 'lsa-activedirectory-provider') -> error = 40121, symbol = LW_ERROR_DOMAIN_IS_OFFLINE, client pid = 2172882
yyyy-mm-ddThh:mm:ssZ lwsmd: [lsass] Failed to run provider specific request (request code = 14, provider = 'lsa-activedirectory-provider') -> error = 40121, symbol = LW_ERROR_DOMAIN_IS_OFFLINE, client pid = 2175106
yyyy-mm-ddThh:mm:ssZ lwsmd: [netlogon] CLDAP timed out: Domain01.local
yyyy-mm-ddThh:mm:ssZ lwsmd: [netlogon] CLDAP timed out: Domain02.local
yyyy-mm-ddThh:mm:ssZ lwsmd: [lsass] Could not transition domain 'Domain.local' to ONLINE state. Error 2453
2021-05-19T10:43:40Z lwsmd: [lsass] Found domain 'Domain.local' to be offline while resolving its objects.
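To check whether a host shows this symptom, the log can be searched for the two message types above. The following is a minimal sketch, not an official VMware tool. It assumes the log is at /var/log/syslog.log (the usual ESXi location), and the function name count_ad_offline_errors is hypothetical.

```shell
#!/bin/sh
# Count lwsmd "domain offline" and CLDAP timeout messages in a syslog file.
# Usage: count_ad_offline_errors [logfile]
# Assumption: the default path is the usual ESXi syslog location.
count_ad_offline_errors() {
    logfile="${1:-/var/log/syslog.log}"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # grep -c prints the number of matching lines (0 when there are none).
    # "|| true" keeps the function's status at 0 even when nothing matches.
    grep -c -e 'LW_ERROR_DOMAIN_IS_OFFLINE' -e 'CLDAP timed out' "$logfile" || true
}
```

Calling count_ad_offline_errors with no argument on the affected host prints the number of matching lines. A count that keeps growing between runs suggests the domain controller is still unreachable.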
Environment
VMware ESXi 6.7.x
Cause
This issue occurs when the host is unable to reach the domain controller, leading to exhaustion of the hostd memory.
Resolution
The following ports must be accessible as prerequisites: 88, 139, 389, and 445.
VMware Engineering is working to improve the behavior of the Likewise agent in vSphere 6.7 in this situation. The issue is already mitigated in vSphere 7.x.
Ensure that the following ports (both UDP and TCP) are open for communication between the ESX/ESXi host and Active Directory:
Port 88 - Kerberos authentication
Port 123 - NTP
Port 135 - RPC
Port 137 - NetBIOS Name Service
Port 139 - NetBIOS Session Service (SMB)
Port 389 - LDAP
Port 445 - Microsoft-DS Active Directory, Windows shares (SMB over TCP)
Restart the hostd service to bring it back online.
/etc/init.d/hostd stop
/etc/init.d/hostd start
To resolve the issue permanently, verify that the domain controller is reachable from the host. If it is not, check the physical network for a firewall that is blocking the required ports.
time nc -zv <DC_IP> 88
time nc -zv <DC_IP> 389
time nc -zv <DC_IP> 445
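The individual checks above can be combined into one loop. The following is a minimal sketch, assuming the nc binary on the host accepts -z as in the commands above. The function name check_dc_ports is hypothetical, and only TCP is probed: ports 123 and 137 are left out because they are normally used over UDP, where a zero-I/O probe does not give a reliable answer.

```shell
#!/bin/sh
# Probe the TCP ports listed in this article on a domain controller.
# Usage: check_dc_ports <DC_IP>
# Returns 0 if every port answers, 1 if any does not, 2 on bad usage.
check_dc_ports() {
    dc="$1"
    if [ -z "$dc" ]; then
        echo "usage: check_dc_ports <DC_IP>" >&2
        return 2
    fi
    failed=0
    # Assumption: TCP only; UDP results from "nc -z" are not trustworthy.
    for port in 88 135 139 389 445; do
        if nc -z "$dc" "$port" >/dev/null 2>&1; then
            echo "port $port: reachable"
        else
            echo "port $port: NOT reachable"
            failed=1
        fi
    done
    return "$failed"
}
```

Run it as check_dc_ports <DC_IP>. No timeout option is passed to nc, so a port that a firewall silently drops may take some time to be reported as not reachable.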
Additional Information
Impact/Risks:
There is no impact on virtual machines that are powered on and running on the ESXi host.
The only impact is that the ESXi host cannot be managed through vCenter or through a direct Host Client login.