vCenter Replication Tasks Fail Due to Stale Lookup Service Endpoints After ELM Split

Article ID: 438271

Updated On:

Products

VMware Live Recovery

Issue/Introduction

After an Enhanced Linked Mode (ELM) partnership is decommissioned or an SSO domain is split, vCenter Server operations may fail, in particular vSphere Replication (VR) and Site Recovery Manager (SRM) tasks.

Symptoms include:

  • Inability to configure new replication jobs.

  • Authentication failures or "Service Conflict" errors in the vCenter Lookup Service.

  • The presence of legacy IP addresses (e.g., <REDACTED_IP>) or FQDNs for decommissioned VR/SRM appliances in the service registry.

  • Log entries indicating vCenter cannot programmatically select a valid endpoint due to multiple conflicting registrations.

Environment

 

  • Product: vCenter Server 7.x, 8.x

  • Components: vSphere Replication, Site Recovery Manager

  • Scenario: Post-ELM decommissioning or SSO Domain separation.

 

Cause

The root cause is an incomplete decommissioning workflow. When vCenter nodes are separated or ELM is broken, associated solutions such as VR and SRM must be uninstalled before the split. If they are not, their service registrations and solution users persist in the VMware Directory Service (vmdir). The Lookup Service continues to present these legacy endpoints as valid targets, so vCenter finds multiple conflicting registrations for the same service and cannot select the current endpoint.

Resolution

A manual cleanup of the SSO database and Lookup Service registrations is required.

1. Identify and Remove Stale Solution Users

List the solution users to identify those associated with the decommissioned site:

Bash
 
/usr/lib/vmware-vmafd/bin/dir-cli service list

Delete the stale solution users:

Bash
 
/usr/lib/vmware-vmafd/bin/dir-cli service delete --name <STALE_SERVICE_NAME> --login <SSO_ADMIN_USER>
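When several stale solution users have to be removed, the delete command above can be wrapped in a small loop. The following is a minimal sketch, not a supported tool: the function name, the APPLY variable, and the reviewed-names file are hypothetical, and only the dir-cli invocation itself comes from this article. By default it prints the commands it would run so the list can be checked before anything is deleted.

```shell
# Path to dir-cli as used in this article.
DIR_CLI=/usr/lib/vmware-vmafd/bin/dir-cli

# Hypothetical helper: delete every solution user named in a reviewed file.
#   $1  file with one solution user name per line (blank lines and # comments are skipped)
#   $2  SSO administrator login
# Runs as a dry run unless APPLY=1 is set in the environment.
delete_stale_solution_users() {
    # Read the names on file descriptor 3 so that dir-cli can still use
    # standard input for its password prompt.
    while IFS= read -r name <&3; do
        case "$name" in
            ''|'#'*) continue ;;
        esac
        if [ "${APPLY:-0}" = "1" ]; then
            "$DIR_CLI" service delete --name "$name" --login "$2"
        else
            echo "$DIR_CLI service delete --name $name --login $2"
        fi
    done 3< "$1"
}

# Example (dry run):  delete_stale_solution_users /tmp/stale-users.txt <SSO_ADMIN_USER>
# Example (apply):    APPLY=1 delete_stale_solution_users /tmp/stale-users.txt <SSO_ADMIN_USER>
```

The dry-run default is deliberate: a deleted solution user cannot be restored from this helper, so the names file should contain only entries confirmed to belong to the decommissioned site.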

2. Identify Stale Service IDs

Generate a full list of registered services to find the Service IDs associated with the old VR/SRM instances:

Bash
 
/usr/lib/vmidentity/tools/scripts/lstool.py list --url http://localhost:7080/lookupservice/sdk --no-check-cert > /tmp/services.txt

Search /tmp/services.txt for the legacy IP addresses or FQDNs noted in the Symptoms list above, and record the Service ID of each registration that references them.
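The search can be scripted. The sketch below assumes that each registration in the lstool.py output has a "Service ID:" line followed later by one or more "URL:" lines for its endpoints; verify this against your own /tmp/services.txt before relying on it. The function name is hypothetical. It prints the Service ID of every registration that has an endpoint URL containing the given legacy address.

```shell
# Hypothetical helper: print Service IDs whose endpoint URLs contain a legacy address.
#   $1  file holding the lstool.py list output
#   $2  legacy IP address or FQDN, matched as a literal substring
# Assumes "Service ID:" appears before the "URL:" lines of the same registration.
find_stale_service_ids() {
    awk -v needle="$2" '
        # Remember the most recent Service ID seen.
        /Service ID:/ {
            sub(/^.*Service ID:[ \t]*/, "")
            id = $0
            next
        }
        # On an endpoint URL line that mentions the legacy address,
        # print the owning Service ID once.
        /URL:/ && index($0, needle) > 0 && id != "" && !(id in seen) {
            seen[id] = 1
            print id
        }
    ' "$1"
}

# Example:  find_stale_service_ids /tmp/services.txt <LEGACY_FQDN_OR_IP>
```

Because the match is a plain substring, a short IP address can also match a longer one that begins with the same digits, so review each reported Service ID against the listing before unregistering it.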

3. Unregister Stale Services via MOB

  1. Open a web browser and navigate to: https://<VC_FQDN>/lookupservice/mob/?moid=ServiceRegistration&method=unregisterService

  2. Log in with the SSO administrator account credentials.

  3. In the serviceId field, paste the Service ID identified in Step 2.

  4. Click Invoke Method.

  5. Repeat for all identified stale Service IDs (SRM and VR).

4. Restart Services

Restart vCenter services to refresh the Lookup Service cache:

Bash
 
service-control --stop --all && service-control --start --all
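Once the services are back up, it may be worth confirming that the cleanup worked. The following sketch assumes a fresh listing has been written to a file with the same lstool.py command used in step 2; the function name and the output wording are hypothetical. It reports, for each legacy address, whether it still appears anywhere in the listing, and returns a non-zero status if any of them do.

```shell
# Hypothetical helper: check a fresh lstool.py listing for leftover legacy addresses.
#   $1        file holding the fresh listing
#   $2 ...    legacy IP addresses or FQDNs, matched as literal substrings
# Returns 0 if none are found, 1 if at least one is still registered.
check_no_stale_endpoints() {
    listing=$1
    shift
    status=0
    for needle in "$@"; do
        if grep -F -q -- "$needle" "$listing"; then
            echo "STILL REGISTERED: $needle"
            status=1
        else
            echo "clean: $needle"
        fi
    done
    return "$status"
}

# Example:  check_no_stale_endpoints /tmp/services-after.txt <LEGACY_FQDN> <LEGACY_IP>
```

If an address is still reported, repeat steps 2 and 3 for the remaining Service IDs before retrying the replication configuration.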