dr.fault.ServiceNotFound error when running VLSR/VR VAMI configure
Article ID: 387162

Products

VMware vCenter Server 7.0
VMware Live Recovery

Issue/Introduction

Symptoms:

  • Configuring a VLSR (VMware Live Site Recovery) or vSphere Replication (VR) appliance through the VAMI fails at step 1, Platform Services Controller credentials.

    The error dr.fault.ServiceNotFound is thrown.



  • The lsdoctor script reports that duplicate endpoints were found:

    root@TestVC [ /tmp/lsdoctor-250331 ]# python lsdoctor.py -l

        ATTENTION:  You are running a reporting function.  This doesn't make any changes to your environment.
        You can find the report and logs here: /var/log/vmware/lsdoctor

    2025-04-09T12:26:25 INFO main: You are reporting on problems found across the SSO domain in the lookup service.  This doesn't make changes.
    2025-04-09T12:26:27 INFO live_checkCerts: Checking services for trust mismatches...
    2025-04-09T12:26:27 INFO generateReport: Listing lookup service problems found in SSO domain
    2025-04-09T12:26:27 INFO generateReport: No issues detected in the lookup service entries for TestVC.example.com(VC 7.0 or CGW).
    2025-04-09T12:26:27 ERROR generateReport: default-first-site\TestVC.example.com (VC 7.0 or CGW) found Duplicates Found: Ignore if this is the PSC HA VIP.  Otherwise, you must unregister the extra endpoints.
    2025-04-09T12:26:27 INFO generateReport: Report generated:  /var/log/vmware/lsdoctor/TestVC.example.com-2025-04-09-122625.json
    root@TestVC [ /tmp/lsdoctor-250331 ]#
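
When the lsdoctor console output is long, the ERROR entries can be easy to miss. The following is a minimal sketch that filters them out of captured output. It assumes the line layout shown above (timestamp, level, function name, message); `find_errors` and `SAMPLE` are hypothetical names used only for illustration and are not part of lsdoctor.

```python
import re

# Matches lines shaped like: 2025-04-09T12:26:27 ERROR generateReport: message
LOG_LINE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+([A-Z]+)\s+(\w+):\s*(.*)$"
)


def find_errors(console_text):
    """Return (timestamp, function, message) for every ERROR line."""
    hits = []
    for line in console_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(2) == "ERROR":
            hits.append((match.group(1), match.group(3), match.group(4)))
    return hits


# Two lines copied from the report above (raw string keeps the backslash literal).
SAMPLE = r"""
    2025-04-09T12:26:27 INFO generateReport: No issues detected in the lookup service entries for TestVC.example.com(VC 7.0 or CGW).
    2025-04-09T12:26:27 ERROR generateReport: default-first-site\TestVC.example.com (VC 7.0 or CGW) found Duplicates Found: Ignore if this is the PSC HA VIP.  Otherwise, you must unregister the extra endpoints.
"""

for timestamp, function, message in find_errors(SAMPLE):
    print(timestamp, function, message)
```

Only the console text is parsed here; the layout of the JSON report file is not assumed.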

Environment

vSphere vCenter 7.x
VMware Live Recovery 8.x, 9.x
vSphere Replication 8.x, 9.x

Cause

The PSC LookupService endpoints vcenterserver and cs.authorization, which these appliances require for authentication, do not have matching Site, Node, and Owner IDs within the PSC.
 
This is known to occur only after a vCenter PSC converge operation (external to embedded). The converge operation does not preserve all the IDs. Most often it creates a new Site ID labeled 'default-first-site' or 'default-site' for the vcenterserver endpoint registration only, while leaving the cs.authorization registration unchanged. Correcting this requires re-registering the cs.authorization service endpoint.

Therefore, when a peripheral feature such as VLSR or VR tries to register with the LookupService, the operation is denied because the authentication service for the site cannot be found under the mismatched Site ID, hence the error dr.fault.ServiceNotFound.

Although the external PSC has been deprecated since 2018, any converged vCenter may still carry this vcenterserver and cs.authorization ID mismatch. It can go unnoticed, because it does not affect vCenter operations, until a VLSR/VR configure is run, at which point the mismatch manifests as a configure failure.
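
The failure mode described above can be pictured with a toy model. This is a simplified sketch and not the actual LookupService implementation: it assumes, for illustration only, that a service is looked up by its type and Site ID together. The registry, the `find_service` function, and the service ID strings are all hypothetical; the Site ID values are the ones that appear in the sample output in the Resolution section.

```python
# Hypothetical, simplified registry: (service type, Site ID) -> service ID.
# After a converge, vcenterserver sits in a new site while cs.authorization
# is still registered under the old one.
registrations = {
    ("vcenterserver", "default-first-site"): "vc-service-id",
    ("cs.authorization", "HQ"): "authz-service-id",
}


def find_service(service_type, site_id):
    """Return the service ID registered for this type within this site."""
    try:
        return registrations[(service_type, site_id)]
    except KeyError:
        raise LookupError(
            "ServiceNotFound: %s in site %s" % (service_type, site_id)
        )


# vcenterserver resolves in its own site ...
print(find_service("vcenterserver", "default-first-site"))

# ... but the authorization service cannot be found in that same site.
try:
    find_service("cs.authorization", "default-first-site")
except LookupError as error:
    print(error)
```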

Resolution

Workaround: If any of the Site, Node, or Owner IDs are confirmed to mismatch within the same PSC LookupService, the service registration must be re-registered.

1. Verify that the PSC endpoints vcenterserver and cs.authorization have the same Site, Node, and Owner IDs:

- Take a snapshot of the vCenter.
- SSH to the vCenter appliance as 'root'.
- Run the following two lstool.py list commands to extract the vcenterserver and cs.authorization endpoint registrations.

vcenterserver

 echo "VC servers --------------|";/usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert --type vcenterserver 2>/dev/null  | grep -i  "Service ID";  echo "";  /usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert --type vcenterserver 2>/dev/null  | grep -iA9  "Service Type"


VC servers --------------|
        Service Type: vcenterserver
        Service ID: ########-####-####-b222-############
        Site ID: default-first-site
        Node ID: ########-####-####-8ae2-############
        Owner ID: vpxd-########-####-####-b364-############@vsphere.local
        Version: 7.0
        Endpoints:
                Type: com.vmware.vim.extension
                Protocol: vmomi
                URL: https://<vc-fqdn>:443/sdkTunnel


cs.authorization

echo "cs.authorization --------|";/usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert --type cs.authorization  2>/dev/null  | grep -i  "Service ID";  echo "";  /usr/lib/vmware-lookupsvc/tools/lstool.py list --url https://localhost/lookupservice/sdk --no-check-cert --type cs.authorization 2>/dev/null  | grep -iA9  "Service Type"
 

cs.authorization --------|
        Service Type: cs.authorization
        Service ID: ########-####-####-b9f4-############_authz 
        Site ID: HQ
        Node ID: ########-####-####-8ae2-############
        Owner ID: vpxd-extension-########-####-####-b364-############@vsphere.local 
        Version: 1.0
        Endpoints:
                Type: com.vmware.cis.common.resourcebundle
                Protocol: http
                URL: https://<vc-fqdn>.vsphere.local:443/invsvc/authz-resource



- If the Site, Node, and Owner IDs match, do NOT proceed. Stop here.

- If any of the Site, Node, or Owner IDs mismatch, the cs.authorization Service ID must be re-registered, because the vcenterserver Service ID is deemed primary.
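
The comparison in step 1 can be sketched in Python once the output of the two commands has been saved as text. This is an illustrative sketch, not a supported tool: `parse_registrations` and `compare_ids` are hypothetical helper names, and the exact string comparison is an assumption. In the sample output the two Owner IDs carry different account prefixes (vpxd- and vpxd-extension-), so a difference reported for that field should be reviewed by eye rather than taken as conclusive.

```python
ID_FIELDS = ("Site ID", "Node ID", "Owner ID")
WANTED = ("Service Type", "Service ID") + ID_FIELDS


def parse_registrations(text):
    """Split grep-filtered lstool output into one dict per 'Service Type:' block."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep:
            continue  # banner lines such as 'VC servers -----|' carry no field
        if key == "Service Type":
            records.append({})  # a new registration starts here
        if records and key in WANTED:
            records[-1][key] = value
    return records


def compare_ids(vc, authz):
    """Map each ID field to (vcenterserver value, cs.authorization value, equal?)."""
    return {f: (vc.get(f), authz.get(f), vc.get(f) == authz.get(f)) for f in ID_FIELDS}


# Sample values copied from the output shown in step 1.
VC_SAMPLE = """
        Service Type: vcenterserver
        Service ID: ########-####-####-b222-############
        Site ID: default-first-site
        Node ID: ########-####-####-8ae2-############
        Owner ID: vpxd-########-####-####-b364-############@vsphere.local
"""
AUTHZ_SAMPLE = """
        Service Type: cs.authorization
        Service ID: ########-####-####-b9f4-############_authz
        Site ID: HQ
        Node ID: ########-####-####-8ae2-############
        Owner ID: vpxd-extension-########-####-####-b364-############@vsphere.local
"""

vc = parse_registrations(VC_SAMPLE)[0]
authz = parse_registrations(AUTHZ_SAMPLE)[0]
for field, (vc_value, authz_value, same) in compare_ids(vc, authz).items():
    status = "match" if same else "MISMATCH"
    print("%-9s %-8s vcenterserver=%s cs.authorization=%s"
          % (field, status, vc_value, authz_value))
```

Because a new record starts at every "Service Type:" line, more than one record returned for a single service type is itself a sign of the duplicate registrations that lsdoctor reports.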


2. Re-register the service endpoint, either with lsdoctor -r or manually:


Option 1: Re-register a particular site ID using the lsdoctor tool with the -r option

Example:

root@TestVC [ /tmp/lsdoctor-250331 ]# python lsdoctor.py -r

    WARNING:  This script makes permanent changes.  Before running, please take *OFFLINE* snapshots
    of all VC's and PSC's at the SAME TIME.  Failure to do so can result in PSC or VC inconsistencies.
    Logs can be found here: /var/log/vmware/lsdoctor

2025-04-09T15:37:48 INFO main:
                You have selected the Rebuild function.  This is a potentially destructive operation!
                All external solutions and 3rd party plugins that register with the lookup service will
                have to be re-registered.  For example: SRM, vSphere Replication, NSX Manager, etc.

Have you taken offline (PSCs and VCs powered down at the same time) snapshots of all nodes in the SSO domain or supported backups?[y/n]y

Provide password for administrator@vsphere.local:
2025-04-09T15:38:36 INFO __init__: Established LS connection to testvc1.example.com

        Version Detected
            Deployment type: embedded
            Version: 24322018_7.0.3.02200_vcsa
        ========================

        0.  Exit
        1.  Generate a template.
        2.  Replace all services with new services.
        3.  Replace individual service.
        4.  Restore services from backup file.

        ========================

Please select an action: 2
2025-04-09T15:38:52 INFO autoRebuild: Found /tmp/lsdoctor-250331/templates/24322018_7.0.3.02200_vcsa.json

            You have selected a Rebuild function.  This is a potentially destructive operation!
            All external solutions and 3rd party plugins that register with the lookup service may
            have to be re-registered.  For example: SRM, vSphere Replication, NSX Manager, etc.

Are you sure you want to continue?[y/n]y
2025-04-09T15:38:59 INFO unregisterPnid: Service xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx has been successfully unregistered
2025-04-09T15:38:59 INFO unregisterPnid: Service xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx has been successfully unregistered
++ Lines Removed +++ 
2025-04-09T15:39:14 INFO rebuild_services: Recreating SSO service registrations...
2025-04-09T15:39:16 INFO rebuild_services: Successfully recreated modern SSO endpoints.
2025-04-09T15:39:17 INFO rebuild_services: Successfully recreated legacy SSO endpoints.

        Version Detected
            Deployment type: embedded
            Version: 24322018_7.0.3.02200_vcsa
        ========================

        0.  Exit
        1.  Generate a template.
        2.  Replace all services with new services.
        3.  Replace individual service.
        4.  Restore services from backup file.

        ========================

Please select an action: 0
2025-04-09T15:39:24 INFO menu: Exiting...
root@TestVC [ /tmp/lsdoctor-250331 ]#

Option 2: Manually export, unregister, and re-register the service registration under the correct Site ID

Export the service registration to a spec file:
/usr/lib/vmware-lookupsvc/tools/lstool.py get --url http://localhost:7090/lookupservice/sdk --id ########-####-####-b9f4-############_authz --no-check-cert --as-spec > /tmp/spec.txt

Unregister the service registration:
/usr/lib/vmware-lookupsvc/tools/lstool.py unregister --url http://localhost:7090/lookupservice/sdk --no-check-cert --user administrator@vsphere.local --password "SSO_administrator_password" --id ########-####-####-b9f4-############_authz

Register the service registration to the default site:
/usr/lib/vmware-lookupsvc/tools/lstool.py register --spec /tmp/spec.txt --url http://localhost:7090/lookupservice/sdk --user administrator@vsphere.local --password "SSO_administrator_password" --id "########-####-####-b9f4-############_authz" --no-check-cert

- Run the verification commands from step 1 to confirm the registration IDs.

- Run the VLSR/VR configure operation again.
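
Because the three Option 2 commands must all reference the same Service ID, it can help to generate them from one place and review them before running anything. The sketch below only builds and prints the command lines and executes nothing. `build_commands` and `render` are hypothetical helpers, and using a single LookupService URL (`LS_URL`) for all three calls is an assumption; the lstool.py path, subcommands, and flags are the ones used in the commands above.

```python
import shlex

LSTOOL = "/usr/lib/vmware-lookupsvc/tools/lstool.py"
LS_URL = "http://localhost:7090/lookupservice/sdk"  # assumed identical for all calls


def build_commands(service_id, user, password, spec_path="/tmp/spec.txt"):
    """Return the export, unregister and register argument lists for one service ID."""
    common = ["--url", LS_URL, "--no-check-cert"]
    # 'get --as-spec' writes the spec to stdout; redirect it to spec_path when run.
    export = [LSTOOL, "get", "--id", service_id, "--as-spec"] + common
    unregister = [LSTOOL, "unregister", "--id", service_id,
                  "--user", user, "--password", password] + common
    register = [LSTOOL, "register", "--spec", spec_path, "--id", service_id,
                "--user", user, "--password", password] + common
    return export, unregister, register


def render(argv):
    """Quote an argument list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Dry run with placeholder values; nothing is executed.
for argv in build_commands("<service-id>_authz",
                           "SSO_administrator_user",
                           "SSO_administrator_password"):
    print(render(argv))
```

Printing rather than executing keeps the review step explicit: the unregister call removes the registration, so the exported spec file should be confirmed to exist before that command is run.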

Additional Information