"Error synchronizing user information with AD/LDAP" alert in Aria Operations

Article ID: 370781

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • Administrative alerts are raised with the message "Error synchronizing user information with AD/LDAP".
  • The Primary node may bind to AD successfully while Data Nodes involved in the UI or API request chain fail to resolve the Domain Controller's FQDN, leading to sync timeouts and login failures.
  • The analytics log shows occasional entries such as the one below:

 ERROR [DistTaskSync-########-####-####-####-########ce3c]  com.vmware.vcops.auth.server.ldap.Sync.run - Groups sync for ldap: <YourLDAPServer> failed: Unable to fetch users in usergroups. Reason: <YourLDAPServerAddress>:636
com.vmware.vcops.auth.exception.AuthException: Unable to fetch users in usergroups. Reason: <YourLDAPServerAddress>:636
    at com.vmware.vcops.auth.server.ldap.LdapQueryHelper.getUsersInGroups(LdapQueryHelper.java:280) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.auth.server.ldap.Sync.fetchLatestGroups(Sync.java:512) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.auth.server.ldap.Sync.doSyncWithRetry(Sync.java:105) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.auth.server.ldap.Sync.run(Sync.java:83) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.platform.distributedtask.DistributedTaskExecutor$TaskProcessorThread.run(DistributedTaskExecutor.java:576) ~[alive_platform.jar:?]
    at com.integrien.alive.common.util.BaseThread$BaseThreadRunnable.run(BaseThread.java:177) ~[vrops-adapters-sdk.jar:?]
    at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: javax.naming.CommunicationException: <YourLDAPServerAddress>:636
    at com.sun.jndi.ldap.Connection.<init>(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapClient.<init>(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapClient.getInstance(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtx.connect(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtx.<init>(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxFromUrl(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(Unknown Source) ~[?:?]
    at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(Unknown Source) ~[?:?]
    at javax.naming.spi.NamingManager.getInitialContext(Unknown Source) ~[?:?]
    at javax.naming.InitialContext.getDefaultInitCtx(Unknown Source) ~[?:?]
    at javax.naming.InitialContext.init(Unknown Source) ~[?:?]
    at javax.naming.ldap.InitialLdapContext.<init>(Unknown Source) ~[?:?]
    at com.vmware.vcops.auth.server.ldap.LdapUtil.getLdapContext(LdapUtil.java:349) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.auth.server.ldap.LdapUtil.createContext(LdapUtil.java:258) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.auth.server.ldap.LdapUtil.createContext(LdapUtil.java:201) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    at com.vmware.vcops.auth.server.ldap.LdapQueryHelper.getUsersInGroups(LdapQueryHelper.java:250) ~[vcops-auth-server-1.0-SNAPSHOT.jar:?]
    ... 6 more
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
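
To see which LDAP address the failures point at, the matching log entries can be summarized per target. The following is a minimal sketch, assuming bash, grep, sed, and awk are available on the node. The function name ldap_sync_failure_targets is hypothetical, and the log location in the usage note is an assumption that should be checked against your deployment.

```shell
#!/usr/bin/env bash
# ldap_sync_failure_targets LOGFILE...
# Hypothetical helper: counts "Groups sync for ldap: ... failed" ERROR entries
# per "Reason:" value (the LDAP address and port that could not be reached).
ldap_sync_failure_targets() {
  # Match only the Sync.run ERROR line, not the stack trace lines that repeat the reason.
  grep -hE 'ldap\.Sync\.run - Groups sync for ldap: .* failed' "$@" \
    | sed -n 's/.*Reason: //p' \
    | awk '{count[$0]++} END {for (r in count) print count[r], r}'
}
```

A possible invocation, assuming the analytics logs live under /storage/log/vcops/log/ (an assumption, not confirmed by this article): ldap_sync_failure_targets /storage/log/vcops/log/analytics-*.log. Each output line is a count followed by the address that timed out.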

 

Environment

Aria Operations 8.10.x and above

Cause

  • One of the AD/LDAP domain controllers (DCs) cannot be reached from the Aria Operations analytics nodes. When LDAP load balancing directs a request to that DC, the sync fails with the error above; when a request goes to any other DC, the sync works properly.

  • DNS is misconfigured on a Data Node. In a multi-node Aria Operations cluster, every node must be able to resolve the Authentication Source FQDN. In this case, the /etc/resolv.conf file on the Data Node contains incorrect nameserver entries or search domains compared to the Primary node, which prevents the node from locating the Domain Controller.
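
To tell the two causes apart, it can help to check on each node whether the Authentication Source FQDN resolves at all. A minimal sketch, assuming bash and getent are available on the node; the function names can_resolve and report_resolution are hypothetical.

```shell
#!/usr/bin/env bash
# can_resolve FQDN
# Hypothetical helper: succeeds if this node's resolver returns
# at least one address for FQDN.
can_resolve() {
  getent hosts "$1" >/dev/null
}

# report_resolution FQDN
# Prints the result prefixed with this node's hostname so that output
# collected from several nodes can be compared side by side.
report_resolution() {
  if can_resolve "$1"; then
    echo "${HOSTNAME}: $1 resolves"
  else
    echo "${HOSTNAME}: $1 does NOT resolve"
    return 1
  fi
}
```

Running report_resolution <YourLDAPServer> on the Primary node and on each Data Node shows whether name resolution differs between them. If the name resolves everywhere, the unreachable-DC cause is the more likely one.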

Resolution

Option 1:

Find the source of the timeout:

  1. SSH into the primary node.
  2. Run nslookup <YourLDAPServer> to list the DCs that the LDAP server name resolves to.
  3. For each DC IP address in that list, run curl -vk <DC.IP>:636

Example: Success

*   Trying <IP address>:636...
* Connected to <IP address> (<IP address>) port 636 (#0)
> GET / HTTP/1.1
> Host: <IP address>:636
> User-Agent: curl/8.1.2
> Accept: */*

Press Ctrl+C to exit the command.
If the command times out instead of connecting, you have found the problematic DC.
Work with your networking team to check the firewall and network rules between the Aria Operations nodes and that DC.
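
The per-DC check above can be repeated for every resolved address in one pass. The following is a minimal sketch rather than a supported tool: it assumes bash (for /dev/tcp redirection), timeout, and getent are present on the node, and the function names check_port and check_all_dcs are hypothetical. It only tests whether a TCP connection opens on the port, which is the same signal the curl step relies on.

```shell
#!/usr/bin/env bash
# check_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection opens within the timeout, non-zero otherwise.
check_port() {
  local host=$1 port=$2 wait=${3:-5}
  # In the child shell, $0 is the host and $1 is the port.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# check_all_dcs FQDN [PORT]
# Resolves FQDN to its IPv4 addresses and reports reachability for each one.
check_all_dcs() {
  local fqdn=$1 port=${2:-636} ip
  for ip in $(getent ahostsv4 "$fqdn" | awk '{print $1}' | sort -u); do
    if check_port "$ip" "$port"; then
      echo "$ip:$port reachable"
    else
      echo "$ip:$port UNREACHABLE"
    fi
  done
}
```

For example, check_all_dcs <YourLDAPServer> prints one line per resolved address. Any address reported as UNREACHABLE is a candidate for the firewall and network review.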

Option 2:

  1. Log into the Primary node via SSH and record the contents of /etc/resolv.conf.
  2. Log into each Data Node via SSH and compare its /etc/resolv.conf file with the Primary node's.
  3. On any Data Node whose file differs, open the file for editing: vi /etc/resolv.conf
  4. Update the nameserver entries to match those on the working Primary node.
  5. Ensure the search domain includes the domain where the AD controllers reside.
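
The comparison in this option can be scripted so that comments and whitespace differences are ignored. A minimal sketch, assuming bash (for process substitution), awk, and diff on the Data Node, and assuming a copy of the Primary node's file has already been placed on the Data Node. The function names and the path /tmp/primary-resolv.conf are hypothetical.

```shell
#!/usr/bin/env bash
# resolver_settings FILE
# Prints only the nameserver/search/domain lines, with whitespace normalized.
# Order is preserved because the resolver tries nameservers in the order listed.
resolver_settings() {
  awk '$1 == "nameserver" || $1 == "search" || $1 == "domain" { $1 = $1; print }' "$1"
}

# compare_resolv PRIMARY_COPY LOCAL_FILE
# Shows a diff of the effective resolver settings; returns 1 if they differ.
compare_resolv() {
  if diff <(resolver_settings "$1") <(resolver_settings "$2"); then
    echo "resolver settings match"
  else
    echo "resolver settings differ"
    return 1
  fi
}
```

On a Data Node this might be run as compare_resolv /tmp/primary-resolv.conf /etc/resolv.conf. Lines prefixed with < in the diff come from the Primary node's copy and lines prefixed with > come from the local file.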