SDDC Manager UI Fails to Load with error "Server failed to start"

Article ID: 441871

Products

VMware SDDC Manager / VCF Installer

VMware vCenter Server

Issue/Introduction

  • The SDDC Manager UI is inaccessible. The landing page hangs for a long time and then fails with the error: "Server failed to start. Failed to init PSC and/or Postgres Check the SDDC Manager UI logs for more details"

  • /var/log/vmware/vcf/sddc-manager-ui-app/sddcManagerServer.log reports that the SSH connection to the PSC fails:

YYYY-MM-DDTHH:MM:SS.146+0000 ERROR [24f1c8d6fc4a4fa6, b713783e0f80466f] [services/pscUtils.js, init-pscs, attemptPSCInit:67] Caught error from await primaryPscInit
YYYY-MM-DDTHH:MM:SS.147+0000 WARN [24f1c8d6fc4a4fa6, b713783e0f80466f] [services/pscUtils.js, init-pscs, attemptPSCInitWithRetry:111]
100.109: VError: PSC Initilization attempt "8" failed: Failed to initiate PSC: Primary psc init failed and failover psc init also failed: Remote ssh command timed out
    at Object.initializationPscError (/opt/vmware/vcf/sddc-manager-ui-app/server/src/errors/VCFError.js:104:5)
    at attemptPSCInitWithRetry (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/pscUtils.js:104:46)
Error Info: {"retryCount":8,"status":403,"errorModule":100,"errorCode":109} caused by:
100.108: VError: Failed to initiate PSC: Primary psc init failed and failover psc init also failed: Remote ssh command timed out
    at Object.initiatePscError (/opt/vmware/vcf/sddc-manager-ui-app/server/src/errors/VCFError.js:104:5)
    at attemptPSCInit (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/pscUtils.js:72:26)
    at async attemptPSCInitWithRetry (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/pscUtils.js:99:17)
Error Info: {"errorModule":100,"errorCode":108} caused by:
100.107: VError: Primary psc init failed and failover psc init also failed: Remote ssh command timed out
    at Object.primaryPscInitError (/opt/vmware/vcf/sddc-manager-ui-app/server/src/errors/VCFError.js:104:5)
    at attemptPSCInit (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/pscUtils.js:68:30)
    at async attemptPSCInitWithRetry (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/pscUtils.js:99:17)
Error Info: {"errorModule":100,"errorCode":107} caused by:
Error: Remote ssh command timed out
    at /opt/vmware/vcf/sddc-manager-ui-app/server/src/services/sso-initialization.js:275:65
    at new Promise (<anonymous>)
    at init (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/sso-initialization.js:274:24)
    at Object.reset (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/sso-initialization.js:294:13)
    at initializeSSO (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/sso-initialization.js:54:24)
    at /opt/vmware/vcf/sddc-manager-ui-app/server/src/services/logging/opentrace.js:423:32
    at Namespace.run (/opt/vmware/vcf/sddc-manager-ui-app/server/node_modules/cls-hooked/context.js:97:5)
    at Tracer.runInSpan (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/logging/opentrace.js:421:25)
    at Tracer.runInNewSpan (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/logging/opentrace.js:463:21)
    at runInNewSpan (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/logging/opentrace.js:593:41)
All Errors Info: {"retryCount":8,"status":403}
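
A quick way to confirm this symptom is to count the timeout messages in the UI log. The following is a minimal sketch, not a supported tool: the function name count_psc_ssh_timeouts is hypothetical, and the log path is passed in as an argument rather than assumed.

```shell
# Hypothetical helper: print how many lines of a log file contain the
# SSH timeout message shown in the excerpt above.
count_psc_ssh_timeouts() {
    # -c counts matching lines; -F matches the text literally.
    # "|| true" stops a count of 0 (grep exit status 1) from being
    # treated as a failure by the calling shell.
    grep -cF "Remote ssh command timed out" "$1" || true
}

# Example (run on the SDDC Manager appliance against the UI log):
#   count_psc_ssh_timeouts <path to the SDDC Manager UI log>
```

A count that keeps growing while the landing page is stuck suggests the UI is still retrying PSC initialization.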

  • SSH connectivity from SDDC Manager to the vCenter Server was verified successfully.

         A port check using curl -v telnet://<vCenter FQDN>:22 confirmed that port 22 was reachable, and a subsequent SSH session to the vCenter Server was established without issue.

        The checks described in the article SDDC Manager UI Fails to Load with error "VMware Cloud Foundation is initializing" were also completed; however, the vCenter Server was active in the database and the ssh_host_key_type field was not blank, so that article did not apply.

 

  • At the same time, the commonsvcs log shows the PSC with status=ERROR:

     /var/log/vmware/vcf/commonsvcs/commonsvcs.log

YYYY-MM-DDTHH:MM:SS.406+0000 INFO  [common,d61893d1b309433a,7d1f] [c.v.e.s.i.s.PscInventoryServiceEmbeddedImpl,http-nio-127.0.0.1-7100-exec-1] VCenters version 7 or up found in inventory: []
YYYY-MM-DDTHH:MM:SS.407+0000 INFO  [common,d61893d1b309433a,7d1f] [c.v.e.s.i.s.VcenterInventoryServiceImpl,http-nio-127.0.0.1-7100-exec-1] Get all Vcenters
YYYY-MM-DDTHH:MM:SS.409+0000 INFO  [common,d61893d1b309433a,7d1f] [c.v.e.s.i.s.PscInventoryServiceEmbeddedImpl,http-nio-127.0.0.1-7100-exec-1] Returned PSCs: [Psc(domainId=########-####-####-####-#############, vmName=####_VCENTER, vmManagementIpAddress=###.##.##.##, vmHostname=############, isReplica=false, port=443, ssoDomain=vsphere.local, subDomain=####.##.##, sshHostKeyType=null, sshHostKey=null, bundleRepoDatastore=lcm-bundle-repo, datastoreName=####_######_####_BOOT, status=ERROR, version=8.0.3.00300-24305161, deploymentType=EMBEDDED)]
YYYY-MM-DDTHH:MM:SS.431+0000 INFO  [common,d61893d1b309433a,7d1f] [c.v.e.s.i.s.SddcManagerControllerInventoryServiceImpl,http-nio-127.0.0.1-7100-exec-1] Get Sddc Controller
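
The status field can be pulled out of these long "Returned PSCs" lines with grep. The sketch below assumes the line format shown in the excerpt above; the function name extract_psc_status is hypothetical, and the log path is supplied by the caller.

```shell
# Hypothetical helper: print the value of status=... from every
# "Returned PSCs" line in the given log file, one value per line.
extract_psc_status() {
    grep -F "Returned PSCs:" "$1" \
        | grep -oE 'status=[A-Z_]+' \
        | cut -d= -f2
}

# Example:
#   extract_psc_status <path to the commonsvcs log> | sort | uniq -c
```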

 

  • A query of the psc table in the platform database confirmed that the PSC was in an error state:

      psql -h localhost -U postgres -d platform
           select * from psc;  

       

Environment

SDDC Manager 5.x

VMware vCenter Server

Cause

SDDC Manager marked the Platform Services Controller (PSC) instance as 'ERROR' after authentication to the vCenter Server with the root or service accounts failed. The failure occurred because the [email protected] password was reset outside of SDDC Manager, which left the account with an expired password or in a locked state.
As a result, SDDC Manager could not establish the SSH connection required to initialize the PSC.

Resolution

Resolution Steps:


Crucial Reminder:
Ensure you have an offline snapshot of the SDDC Manager appliance before making manual modifications to the PostgreSQL database.

Step 1: Access SDDC Manager and the Database

  1. SSH into the SDDC Manager appliance using the vcf user credentials.

  2. Switch to the root user by executing:

    su -
  3. Connect to the PostgreSQL database (specifically the platform database) by running:

    psql -h localhost -U postgres -d platform

    (Note: You will now see the PostgreSQL prompt platform=# indicating you are successfully connected).


Step 2: Fetch the PSC ID

To find the correct ID for the PSC currently stuck in the ERROR state, query the psc table.

  1. Execute the following SQL query:

    SELECT id, status FROM psc WHERE status = 'ERROR';
  2. Copy the ID returned in the output. It will look like a standard UUID (e.g., 12345678-abcd-1234-ef01-1234567890ab).
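
Because the copied ID is pasted into an UPDATE statement in the next step, it can help to confirm that it is a well-formed UUID first (catching a truncated copy or stray whitespace). This is an optional sketch; the function name is_uuid is hypothetical and is not part of SDDC Manager.

```shell
# Hypothetical helper: succeed (exit status 0) only if the argument is
# a UUID in the 8-4-4-4-12 hexadecimal form.
is_uuid() {
    printf '%s' "$1" | grep -Eq \
        '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Example:
#   is_uuid "<YOUR-COPIED-ID>" && echo "ID looks valid" || echo "ID is malformed"
```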


Step 3: Modify the Database


Now that you have the exact ID, update the status of that PSC record.

  1. Execute the UPDATE query, replacing the placeholder with the ID you just copied:

    UPDATE psc SET status='ACTIVE' WHERE id='<YOUR-COPIED-ID>';
  2. Verify the change by running the SELECT query again to ensure the status now reads ACTIVE:

    SELECT id, status FROM psc WHERE id='<YOUR-COPIED-ID>';
  3. Exit the PostgreSQL database by typing:

    \q
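
As an alternative to typing the statements interactively, the update and its verification could be generated as one transaction and piped into psql. The sketch below is an illustration under stated assumptions: the function name build_psc_update_sql is hypothetical, and the extra "AND status='ERROR'" guard is an addition that is not part of the original steps. It only prints SQL and does not touch the database by itself.

```shell
# Hypothetical helper: print a transaction that sets one PSC row from
# ERROR to ACTIVE and then shows the resulting row.
build_psc_update_sql() {
    local psc_id="$1"
    # Refuse empty input or anything other than hex digits and hyphens,
    # so that no quote characters can reach the SQL text.
    case "$psc_id" in
        ''|*[!0-9a-fA-F-]*) echo "invalid PSC id" >&2; return 1 ;;
    esac
    cat <<EOF
BEGIN;
UPDATE psc SET status='ACTIVE' WHERE id='${psc_id}' AND status='ERROR';
SELECT id, status FROM psc WHERE id='${psc_id}';
COMMIT;
EOF
}

# Example (run as root on the SDDC Manager appliance, after taking the snapshot):
#   build_psc_update_sql "<YOUR-COPIED-ID>" \
#       | psql -h localhost -U postgres -d platform -v ON_ERROR_STOP=1
```

With ON_ERROR_STOP set, psql stops at the first failing statement instead of continuing to COMMIT.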


Step 4: Restart SDDC Manager Services

Finally, restart the services so SDDC Manager can pick up the database changes and restore UI access.

  1. Run the restart script:

    /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh

Allow a few minutes for all the microservices to initialize fully. Once the script completes and services are up, you should be able to log back into the SDDC Manager UI without issue.
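
Rather than refreshing the browser by hand, the wait can be scripted with a small retry loop. This is a generic sketch: retry_until is a hypothetical helper, and the URL, attempt count, and delay in the example are assumptions to adjust for the environment.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between tries; succeed as soon as the command succeeds.
retry_until() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is still to come.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: poll the UI for up to about 10 minutes (60 tries, 10 s apart).
# -k skips certificate verification, -s silences progress output, and
# -f makes curl fail on HTTP error responses.
#   retry_until 60 10 curl -k -s -f -o /dev/null "https://<SDDC Manager FQDN>/"
```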