SRM-Server service fails to start after converging to VMware Live Recovery 9.0.5 and rolling back to 9.0.2

Article ID: 440120


Products

VMware Live Recovery

Issue/Introduction

After performing a rollback from an upgrade to VMware Live Recovery (VLR) 9.0.5.x back to Site Recovery Manager (SRM) 9.0.2.2, the srm-server service fails to start.

The following symptoms are observed:

  • The SRM appliance management interface (VAMI) may show the service as "Stopped" or "Starting."
  • The Site Recovery plugin in the vSphere Client indicates the site is inaccessible.
  • The vmware-dr.log file contains the following errors:
    • N9SsoClient27InvalidCredentialsExceptionE Authentication failed: Invalid credentials
    • Not initialized: VcConnectionHandler STS
    • SSO connection down
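
The three log signatures above can be checked in a single pass. The following is a minimal sketch, not an official tool: `scan_dr_log` is a hypothetical helper name, and the default log path `/var/log/vmware/srm/vmware-dr.log` is an assumption that this article does not state, so pass the actual path if it differs.

```shell
# Hypothetical helper: count how many lines of an SRM log contain each
# error signature listed above. The default path is an assumption.
scan_dr_log() {
  local log_file="${1:-/var/log/vmware/srm/vmware-dr.log}"
  local pattern count
  for pattern in \
    "InvalidCredentialsException" \
    "Not initialized: VcConnectionHandler STS" \
    "SSO connection down"; do
    # -F treats the pattern as a fixed string; -c counts matching lines.
    # grep exits non-zero when nothing matches, so the status is ignored.
    count=$(grep -F -c -- "$pattern" "$log_file" 2>/dev/null || true)
    printf '%s\t%s\n' "${count:-0}" "$pattern"
  done
}
```

Running `scan_dr_log` on the SRM appliance prints one tab-separated line per signature, with the match count first. A non-zero count for all three is consistent with the symptoms described here, but it does not by itself confirm the cause.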

Environment

  • vCenter Server
  • Site Recovery Manager: Rolling back from 9.0.5.x to 9.0.2.x
  • VMware Live Recovery: 9.0.5.x

Cause

This issue occurs because the VLR convergence process modifies external registrations within the vCenter Lookup Service and Extension Manager to transition the appliance identity to the VMware Live Recovery framework.

While powering on the original SRM virtual machines reverts the local SRM appliance files and credentials to version 9.0.2, it does not revert the global registrations stored in the vCenter Server database. vCenter continues to expect the converged 9.0.5 identity, so the Security Token Service (STS) rejects the 9.0.2 appliance because of a credential and thumbprint mismatch.

Resolution

To restore the service, you must manually purge the converged 9.0.5 registrations from vCenter to allow the 9.0.2 appliance to re-register its original identity.

Step 1: Unregister Stale vCenter Extensions

  1. Navigate to https://<vCenter_Server_FQDN>/mob.
  2. Log in as the vCenter Single Sign-On administrator (for example, administrator@vsphere.local).
  3. Navigate to Content > ExtensionManager.
  4. Click UnregisterExtension.
  5. Enter and unregister the following extension IDs one by one:
    • com.vmware.vsan.snapshot.manager.client
    • com.vmware.vcDr
    • com.vmware.vcAps
    • com.vmware.drui.plugin
    • com.vmware.vsan.snapshotservice
    • com.vmware.vcHms
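
Because the six IDs must be entered one at a time, a printed checklist can help track which ones have been submitted. The sketch below only prints text and never contacts vCenter. `print_extension_checklist` is a hypothetical helper name, and the `moid=ExtensionManager` deep link is an assumption about the MOB URL layout; the navigation path in steps 1 to 3 remains the reference.

```shell
# Hypothetical helper: print the MOB entry point and a checkbox line for
# each extension ID listed in step 5. No network calls are made.
print_extension_checklist() {
  local vcenter_fqdn="$1"
  local id
  # Assumed deep link to the ExtensionManager object in the MOB.
  printf 'Open: https://%s/mob/?moid=ExtensionManager\n' "$vcenter_fqdn"
  for id in \
    com.vmware.vsan.snapshot.manager.client \
    com.vmware.vcDr \
    com.vmware.vcAps \
    com.vmware.drui.plugin \
    com.vmware.vsan.snapshotservice \
    com.vmware.vcHms; do
    printf '[ ] %s\n' "$id"
  done
}
```

For example, `print_extension_checklist "<vCenter_Server_FQDN>"` prints the entry point followed by the six IDs in the order given above.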

Step 2: Purge Lookup Service Registrations

  1. SSH into the vCenter Server Appliance (VCSA).
  2. Identify the Service IDs associated with the SRM appliance hostname:
    /usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert | grep "<SRM_Appliance_Hostname>" -B9 | grep "Service ID" | awk '{print $3}'
  3. Navigate to the Lookup Service MOB: https://<vCenter_Server_FQDN>/lookupservice/mob/?moid=ServiceRegistration&method=Delete
  4. Enter the Service ID(s) found in step 2 and click Invoke Method.
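
Because the deletion in step 4 cannot be undone, it can help to save the `lstool.py` output to a file and review the extracted IDs before entering them. The sketch below wraps the same grep and awk pipeline from step 2 in a function that reads saved output from standard input and removes duplicates. `extract_service_ids` is a hypothetical helper name, and `/tmp/ls-list.txt` is an illustrative file path.

```shell
# Hypothetical helper: read saved `lstool.py list` output on stdin and print
# the unique Service IDs found in the 9 lines before each hostname match.
# This mirrors the pipeline in step 2 and adds de-duplication.
extract_service_ids() {
  local srm_host="$1"
  grep -F -B9 -- "$srm_host" \
    | grep "Service ID" \
    | awk '{print $3}' \
    | sort -u
}

# Usage on the VCSA (not executed here):
#   /usr/lib/vmware-lookupsvc/tools/lstool.py list \
#     --url https://localhost/lookupservice/sdk --no-check-cert > /tmp/ls-list.txt
#   extract_service_ids "<SRM_Appliance_Hostname>" < /tmp/ls-list.txt
```

The nine-line window is a fixed offset, so an unrelated registration that happens to fall inside it would also be printed. Compare each ID against the saved output and confirm it belongs to the SRM appliance before deleting it.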

Step 3: Reconfigure SRM Appliances

  1. Log in to the SRM VAMI at https://<SRM_Appliance_IP>:5480.
  2. Navigate to the Summary tab and select Reconfigure.
  3. Complete the wizard to re-register the 9.0.2 appliance.
  4. Note: If the reconfiguration hangs while stopping the service, SSH into the SRM appliance and run:
    sudo systemctl kill -s SIGKILL srm-server
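
Before sending SIGKILL, it may be worth confirming that the service is still stuck rather than already stopped. The sketch below is one way to check, using the standard `systemctl is-active` subcommand; `srm_server_state` is a hypothetical helper name.

```shell
# Hypothetical helper: print the current state of the srm-server unit.
# `systemctl is-active` prints a single word (for example active, inactive,
# deactivating, or failed) and exits non-zero when the unit is not active,
# so the exit status is deliberately ignored.
srm_server_state() {
  local state
  state=$(systemctl is-active srm-server 2>/dev/null || true)
  printf '%s\n' "${state:-unknown}"
}
```

If `srm_server_state` keeps reporting `deactivating` while the wizard makes no progress, that matches the hang described in the note. A result of `inactive` suggests the service has already stopped and the kill is unnecessary.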


NOTE: Repeat all resolution steps at the remote site.

Additional Information

The convergence process to VMware Live Recovery is a significant architectural change. Always ensure a file-based backup of vCenter is available before initiating convergence. If a rollback is required, these manual cleanup steps are mandatory to restore the original SRM 9.0.2 communication path.