/var/log/vmware/sso/vmware-identity-sts.log file on impacted_vcenter indicates repeated authentication failures from the SRM appliance at the remote site. Key log entries point to login failures for the SRM solution user, referencing the service account SRM-51dd790f-####-####-####[email protected].
ERROR sts[68:tomcat-http--31] [CorId=fc63c4fd-3a44-4d09-b7fd-50497738de44] [com.vmware.identity.idm.server.IdentityManager] Failed to checkUserAccountFlags principal [SRM-51dd790f-####-####-####[email protected]] for tenant [vsphere.local]
yyyy-mm-ddThh:mm:ss INFO sts[68:tomcat-http--31] [CorId=fc63c4fd-3a44-4d09-b7fd-50497738de44] [com.vmware.identity.diagnostics.VmEventAppender] EventLog: source=[VMware Identity Server], tenant=[vsphere.local], eventid=[USER_NAME_PWD_AUTH_FAILED], level=[ERROR], category=[VMEVENT_CATEGORY_STS], text=[Failed to authenticate principal [SRM-51dd790f-####-####-####[email protected]]. Login failed], detailText=[Login failed], corelationId=[fc63c4fd-3a44-4d09-b7fd-50497738de44], timestamp=[1753847425240]
Evidence from logs confirms LDAP and replication-related failures:
/var/log/vmware/vmdir/vdcrepadmin.log indicates that the partner status is unavailable:
yyyy-mm-ddThh:mm:ss:t@140239405298816:WARNING: VmDirGetReplicationPartnerStatus, partner (impacted_vcenter) status not available (53)
/var/log/vmware/vmdird/vmdird.log confirms that replication agreements are missing and that vmdird is not in NORMAL state:
yyyy-mm-ddThh:mm:ss:t@139742532335168:ERROR: VmDirIsHostAPartner: No replication agreement entries found under cn=impacted_vcenter,cn=Servers,cn=default-site,cn=Sites,cn=Configuration,dc=VSPHERE,dc=LOCAL
yyyy-mm-ddThh:mm:ss:t@139742532335168:ERROR: VmDirIsHostAPartner failed. Error(1168)
yyyy-mm-ddThh:mm:ss:t@139868336244288:ERROR: _VmDirSearchPreCondition: Server in not in normal mode, not allowing outward replication.
yyyy-mm-ddThh:mm:ss:t@139868336244288:ERROR: VmDirSendLdapResult: Request (Search), Error (LDAP_UNWILLING_TO_PERFORM(53)), Message (Server in not in normal mode, not allowing outward replication.), (0) socket (ip_address)
/var/log/vmware/lookupsvc/lookupserver-default.log captures repeated LDAP server connection failures on port 389:
[yyyy-mm-ddThh:mm:ss pool-2-thread-65 ERROR com.vmware.vim.lookup.impl.LdapStorage] LDAP action failed; host=impacted_vcenter, port=389
com.vmware.sso.interop.ldap.ServerDownLdapException: Can't contact LDAP server
These logs confirm that the vmdird service on the vCenter could not establish LDAP connections over port 389 because the port was blocked or misconfigured on the network. This broke vmdir replication, which in turn caused the SRM authentication failures.
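To check whether a node shows the same symptoms, the log files above can be searched for the quoted error strings. The following is a minimal sketch, assuming Python 3 is available on the appliance and the log paths match those listed in this article. The function names and signature labels are illustrative and are not part of any VMware tooling.

```python
from collections import Counter

# Substrings copied from the log excerpts in this article; labels are illustrative.
SIGNATURES = {
    "sso_auth_failed": "USER_NAME_PWD_AUTH_FAILED",
    "partner_status_unavailable": "status not available (53)",
    "no_replication_agreement": "No replication agreement entries found",
    "vmdird_not_normal": "LDAP_UNWILLING_TO_PERFORM(53)",
    "ldap_unreachable": "Can't contact LDAP server",
}

# Log files named in this article.
LOG_FILES = [
    "/var/log/vmware/sso/vmware-identity-sts.log",
    "/var/log/vmware/vmdir/vdcrepadmin.log",
    "/var/log/vmware/vmdird/vmdird.log",
    "/var/log/vmware/lookupsvc/lookupserver-default.log",
]


def count_signatures(lines):
    """Count how many lines contain each known failure signature."""
    counts = Counter()
    for line in lines:
        for label, needle in SIGNATURES.items():
            if needle in line:
                counts[label] += 1
    return counts


def scan_logs(paths=LOG_FILES):
    """Scan each readable log file and return {path: Counter}."""
    results = {}
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                results[path] = count_signatures(handle)
        except OSError:
            continue  # skip logs that are missing or unreadable on this node
    return results


if __name__ == "__main__":
    for path, counts in scan_logs().items():
        print(path)
        for label, n in sorted(counts.items()):
            print(f"  {label}: {n}")
```

Non-zero counts for the LDAP and replication signatures alongside the SSO authentication failures would be consistent with the failure chain described above, although the counts alone do not prove the cause.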
Network Remediation:
Engage the Network team to ensure TCP port 389 is open bidirectionally between the vCenters/PSC nodes. This port is critical for LDAP communication used in vmdird replication.
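Before and after the network change, basic TCP reachability of port 389 can be checked from the appliance itself. The following is a minimal sketch, assuming Python 3 is available on the node. The function name is illustrative, and partner hostnames are supplied as command-line arguments. It only confirms that a TCP connection can be opened from the node where it runs, so it should be run from each side to cover both directions.

```python
import socket
import sys


def ldap_port_reachable(host, port=389, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Usage: python3 check_ldap_port.py <partner_host> [<partner_host> ...]
    for partner in sys.argv[1:]:
        state = "open" if ldap_port_reachable(partner) else "blocked or unreachable"
        print(f"{partner}:389 {state}")
```

A successful connection shows only that the port is reachable at the TCP level; it does not confirm that vmdird replication is healthy.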
Restore vmdird Service to Normal Mode:
SSH into the impacted vCenter as root and run the vdcadmintool:
/usr/lib/vmware-vmdir/bin/vdcadmintool
Select option 5: "Set vmdir state to normal"
Confirm that the vmdird service enters NORMAL state and replication resumes.
If vmdird state change fails:
Refer to the KB article for the FixPSC script: "Fix PSC/vmdir inconsistencies using fixpsc python script".
This script helps to restore replication agreements and fix embedded PSC inconsistencies.
If the issue persists, open a support request with Broadcom Support for further assistance.