Unable to log in to a specific host via AD

Article ID: 422719

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Cannot log in to one specific ESXi host with Active Directory (AD) credentials, even though the host is joined to the domain and access is granted through an AD group.
  • On the host UI the following error messages are observed:
    permission denied
    permission to perform operations denied
    operation timed out.

    /var/log/likewise.log:
    YYYY-MM-DDTHH:MM:SS DEBUG netlogon: LWNetGetDCNameExt():dcinfo.c:###: Error at ../netlogon/client/dcinfo.c:### [code: ####]
    YYYY-MM-DDTHH:MM:SS DEBUG netlogon: LWNetSrvGetDCName():dcinfo.c:###: Looking for a DC in domain '<domain_name>', site '<null>' with flags ###
    [..]
    YYYY-MM-DDTHH:MM:SS ERROR netlogon: CLDAP timed out:  domain_controller.domain
    YYYY-MM-DDTHH:MM:SS ERROR lsass: LSA User Manager - unable to determine whether users have logged off.
    YYYY-MM-DDTHH:MM:SS ERROR lsass: Error while checking user refresh credentials list: #####
    YYYY-MM-DDTHH:MM:SS INFO netlogon: Filtering list of 2 servers with list of 0 black listed servers
    YYYY-MM-DDTHH:MM:SS ERROR netlogon: CLDAP timed out: domain_controller.domain

    /var/run/log/hostd.log:
    YYYY-MM-DDTHH:MM:SS In(166) Hostd[PID]: [Originator@PID sub=Solo.VmwareCLI opID=<Host_FQDN> sid=SIDca user=root] Dispatch system.permission.set done
    YYYY-MM-DDTHH:MM:SS In(166) Hostd[PID]: [Originator@PID sub=Vimsvc.ha-eventmgr opID=<Host_FQDN> sid=SID] Event PID : Cannot login user <username>\<domain>@<IP_ADDR>: no permission
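
To see which Domain Controllers are timing out, the "CLDAP timed out" entries in likewise.log can be tallied per server. The following is a minimal sketch, assuming the log format shown above and that grep, awk, sort and uniq are available in the ESXi shell. The function name count_cldap_timeouts is hypothetical and not part of any VMware tooling.

```shell
# Count "CLDAP timed out" entries per Domain Controller in a likewise log.
# $1 is the log file to read (defaults to /var/log/likewise.log).
# Assumption: the DC name is the last whitespace-separated field on the line,
# as in the log excerpt in this article.
count_cldap_timeouts() {
    log_file="${1:-/var/log/likewise.log}"
    grep 'CLDAP timed out:' "$log_file" \
        | awk '{ print $NF }' \
        | sort \
        | uniq -c
}
```

Running count_cldap_timeouts with no argument reads /var/log/likewise.log and prints one line per Domain Controller, prefixed by the number of timeouts recorded for it. A server that appears here but is not expected to serve this host is a candidate for the DNS and site checks in the Resolution section.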

Environment

VMware vSphere ESXi 8.x

Cause

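Port 389 may still be blocked between the affected host and one or more Domain Controllers. ESXi does not "scan" the network for reachable Domain Controllers; it strictly asks the DNS server for a list of available Domain Controllers, so the list can include servers that this host cannot reach.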
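
The list that DNS hands to the host comes from the SRV records under _ldap._tcp.dc._msdcs.<YOUR_DOMAIN_NAME>. As a rough sketch, the helper below reduces nslookup SRV output to one "host port" pair per line so the candidates are easy to compare against the Domain Controllers you expect. It assumes BIND-style output lines of the form "<name> service = <priority> <weight> <port> <target>."; the nslookup build on your host may format its output differently, and the function name extract_srv_targets is hypothetical.

```shell
# Extract "host port" pairs from nslookup SRV output supplied on stdin.
# Assumption: BIND-style lines such as
#   _ldap._tcp.dc._msdcs.example.test  service = 0 100 389 dc1.example.test.
# Lines without "service =" are ignored.
extract_srv_targets() {
    awk '/service =/ {
        host = $NF              # last field is the SRV target
        sub(/\.$/, "", host)    # drop the trailing root dot
        print host, $(NF - 1)   # second-to-last field is the port
    }'
}
```

For example, nslookup -type=SRV _ldap._tcp.dc._msdcs.<YOUR_DOMAIN_NAME> | extract_srv_targets would print each Domain Controller offered by DNS alongside its LDAP port.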

Resolution

Perform the following checks with your internal networking team to identify and fix the issue:

  1. Prove the source (the DNS check): run the command below on the ESXi host to see exactly which Domain Controllers the DNS server is returning to it:
    nslookup -type=SRV _ldap._tcp.dc._msdcs.<YOUR_DOMAIN_NAME>
    • Analyze the output: does the list contain the "QA" servers?
  2. Verify site-specific lookup: check whether AD knows which "Site" the ESXi host belongs to by running:
    /usr/lib/vmware/likewise/bin/lw-get-dc-name <YOUR_DOMAIN_NAME>
    • Look for "Client Site Name":
      • If it shows Default-First-Site-Name or another generic value, the host's IP subnet is likely undefined in AD.
      • It should show a specific site name such as Production-Site or HQ-Site.
  3. Fix for the AD admin: the Active Directory administrator performs the following:
    • Open Active Directory Sites and Services.
    • Navigate to Subnets.
    • Ensure that a subnet object exists for the IP subnet of this ESXi host.
    • Associate that subnet with the correct Production Site object.
    • Once this is done, AD will stop sending the "QA" Domain Controllers in the DNS response to this host.
  4. Verify UDP Connectivity:
    • Confirm whether UDP traffic to the DCs is allowed by running these commands on the problematic host:
      nc -u -z -v -w 5 <IP_OF_UKDC-01> 389
      nc -u -z -v -w 5 <IP_OF_UKDC-02> 389
      The -w 5 flag sets a 5-second timeout so the command does not hang.
      Success: returns Connection to ... 389 port [udp/ldap] succeeded!
      Failure: returns netcat: connect: Connection refused, or times out.
    • Increase timeout (workaround): even if connectivity exists, network latency may intermittently exceed the default 5-second timeout. Increasing the timeout to 15 seconds should stabilise the domain state:
      /usr/lib/vmware/likewise/bin/lwregshell set_value "[HKEY_THIS_MACHINE\Services\lsass\Parameters\Providers\ActiveDirectory]" "LdapProbeTimeout" 15
      Where 15 is the Timeout in seconds
    • Restart Agent:
      /etc/init.d/lwsmd restart
    • Verify stability: retry the group list command. It should consistently return the full group list (203 groups in this case) without flapping to 0.
      /usr/lib/vmware/likewise/bin/lw-lsa list-groups-for-user <User_Account>
  5. Check that no firewall rule blocks port 389 (LDAP) or port 88 (Kerberos) between the ESXi host and the Domain Controllers.
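
The connectivity checks in steps 4 and 5 can be repeated across several Domain Controllers with a small loop. This is a sketch only, built on the nc flags used above. It assumes the host's nc accepts the same -z and -w flags for a TCP probe of port 88. NC_CMD is a hypothetical override hook and check_dc_ports is a hypothetical function name. Note that a UDP "ok" only means no rejection came back; a firewall that silently drops packets can still produce a success, so treat failures as the more reliable signal.

```shell
# Probe each Domain Controller on UDP 389 (LDAP/CLDAP) and TCP 88 (Kerberos).
# NC_CMD is a hypothetical hook so the probe command can be replaced;
# it defaults to the nc binary used elsewhere in this article.
NC_CMD="${NC_CMD:-nc}"

# Usage: check_dc_ports <dc_ip> [<dc_ip> ...]
# Prints one result line per probe; returns 1 if any probe failed.
check_dc_ports() {
    failed=0
    for dc in "$@"; do
        # -u: UDP, -z: probe only, -w 5: give up after 5 seconds
        if $NC_CMD -u -z -w 5 "$dc" 389 >/dev/null 2>&1; then
            echo "$dc udp/389 ok"
        else
            echo "$dc udp/389 FAILED"
            failed=1
        fi
        # TCP probe of the Kerberos port (assumes nc supports -z for TCP)
        if $NC_CMD -z -w 5 "$dc" 88 >/dev/null 2>&1; then
            echo "$dc tcp/88 ok"
        else
            echo "$dc tcp/88 FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, check_dc_ports <IP_OF_UKDC-01> <IP_OF_UKDC-02> prints four result lines. Any FAILED line identifies the Domain Controller and port to raise with the networking team.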