vSphere UI fails to start after an ELM vCenter Server is restored from backup

Article ID: 427340

Products

VMware vCenter Server

Issue/Introduction

  • After successfully restoring a vCenter Server Appliance (VCSA) from a file-based backup, the vSphere Client (HTML5) UI remains inaccessible. While the appliance may respond to pings and SSH, the vsphere-ui service fails to initialize. This issue typically occurs in environments configured with Enhanced Linked Mode (ELM) where the replication partner is missing or unreachable.

 

  • Log evidence: /var/log/vmware/vmon/vmon.log contains entries similar to the following:
    In(05) host-2520 Received start request for vsphere-ui
    Wa(03) host-2520 Deleting the direct response route created for vsphere-ui
    In(05) host-2520 <static-ui> Constructed command: /usr/bin/python /usr/lib/vmware-vmon/vmon-direct-response-route.py delete
    In(05) host-2520 <vsphere-ui-prestart> Constructed command: /usr/lib/vmware-vsphere-ui/firstboot/vsphere_ui_prestart.py
    In(05) host-2520 Direct response route command failed.
    Wa(03) host-2520 <vsphere-ui> Service pre-start command's stderr: Traceback (most recent call last):
    Wa(03)+ host-2520   File "/usr/lib/vmware-vsphere-ui/firstboot/vsphere_ui_prestart.py", line 111, in <module>
    Wa(03)+ host-2520     raise ex
    Wa(03)+ host-2520   File "/usr/lib/vmware-vsphere-ui/firstboot/vsphere_ui_prestart.py", line 108, in <module>
    Wa(03)+ host-2520     create_and_assign_vsphere_client_solution_user_role(VSPHERE_UI_FIRSTBOOT_CONFIG_DIR)
    Wa(03)+ host-2520   File "/usr/lib/vmware-vsphere-ui/firstboot/solution_user_permission_utils.py", line 59, in create_and_assign_vsphere_client_solution_user_role
    Wa(03)+ host-2520     _add_privileges(vsphereclient_privileges_xml_file)
    Wa(03)+ host-2520   File "/usr/lib/vmware-vsphere-ui/firstboot/solution_user_permission_utils.py", line 140, in _add_privileges
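
To confirm that an appliance shows the same signature, the log can be searched for the two markers visible in the excerpt above. The following is a minimal sketch, not a supported tool: the function name has_ui_prestart_failure is hypothetical, and it assumes the log lines contain the same wording as the excerpt.

```shell
#!/bin/bash
# Hypothetical helper: reads vmon.log text on stdin and succeeds (exit 0)
# only if both markers from the excerpt above are present.
has_ui_prestart_failure() {
    local log
    log="$(cat)"
    # Marker 1: the vsphere-ui pre-start command wrote a Python traceback.
    printf '%s\n' "$log" |
        grep -qF "<vsphere-ui> Service pre-start command's stderr: Traceback" &&
    # Marker 2: the traceback passes through the solution user role step.
    printf '%s\n' "$log" |
        grep -qF "create_and_assign_vsphere_client_solution_user_role"
}
```

Example usage on the appliance: has_ui_prestart_failure < /var/log/vmware/vmon/vmon.log && echo "pre-start failure signature found"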

Environment

VMware vCenter Server 

Cause

  • The vSphere UI service fails during its pre-start check because it cannot synchronize or validate global permissions and roles across the Single Sign-On (SSO) domain.
  • As seen in vmon.log, the vsphere_ui_prestart.py script calls create_and_assign_vsphere_client_solution_user_role. In an Enhanced Linked Mode setup, the SSO database (VMDir) expects to communicate with its replication partner to verify existing roles. If the partner vCenter has not been deployed or is offline, the pre-start script raises an exception (the traceback shown in the log) and the service start-up sequence is aborted.

Resolution

 

To resolve this issue, the Enhanced Linked Mode topology must be complete. Because the restored vCenter validates SSO data against its replication partner, the partner vCenter must be present and reachable.

Note: If no backup of the partner vCenter is available, take a fresh backup of the healthy partner vCenter before attempting the restore, to prevent ELM failure.

 

Method 1: Deploy the partner vCenter, then check whether replication is intact and the vsphere-ui service starts.

  1. Deploy the partner vCenter: Ensure that the second vCenter (the replication partner) is deployed from its backup.
  2. Verify replication status: Once the partner vCenter is online, log in via SSH to both vCenter Servers and run the following command on each to check the partner status:
    /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator -w <password>
  3. The output should resemble the example below. On both vCenter Servers, "Host available" and "Status available" should be "Yes", "My last change number" should match "Partner has seen my change number", and the partner should be 0 changes behind.
    root@Test-vCenter [ ~ ]# /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator
    password:
    Partner: #######vCenter 
    Host available:   Yes
    Status available: Yes
    My last change number: ######
    Partner has seen my change number: ######
    Partner is 0 changes behind.
    
  4. If the output matches the above on both vCenter Servers, replication is intact. Check the vsphere-ui service status:
    service-control --status vsphere-ui
  5. If the vsphere-ui service is stopped, start it manually with the command below:
    service-control --start vsphere-ui

  6. If the restored vCenter still fails because its partner is online but logically corrupted or not functional, follow Method 2.
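
The replication check in step 3 of Method 1 can be scripted so that the same criteria are applied on both nodes. The sketch below is illustrative only: the function name replication_state is hypothetical, and it assumes the showpartnerstatus output uses the field labels shown in the example output above. It does not compare the two change numbers; it relies on the "changes behind" line instead.

```shell
#!/bin/bash
# Hypothetical helper: reads `vdcrepadmin -f showpartnerstatus` output on stdin.
# Prints "healthy" only if every partner block reports Host available: Yes,
# Status available: Yes, and 0 changes behind; otherwise prints "unhealthy".
replication_state() {
    awk '
        /Host available:/   { hosts++;   if ($NF != "Yes") bad = 1 }
        /Status available:/ { if ($NF != "Yes") bad = 1 }
        # "Partner is 0 changes behind." -> the third field is the count.
        /changes behind/    { checked++; if ($3 != "0") bad = 1 }
        END {
            if (hosts > 0 && checked > 0 && !bad) print "healthy"
            else print "unhealthy"
        }'
}
```

Example usage on each vCenter: /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator -w <password> | replication_state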

 

Method 2: Decommission the partner vCenter and repoint it to a healthy node (PSC)

  1. Unregister the partner vCenter with the command below. Run it on the partner vCenter that was deployed most recently:
    cmsso-util unregister --node-pnid <Partner_FQDN> --username administrator@<SSO_domain>
  2. Once the partner vCenter entry is removed, repoint it to the healthy partner (PSC) with the command below, using the primary vCenter FQDN as <Healthy_Partner_FQDN>:
    cmsso-util repoint --server-fqdn <Healthy_Partner_FQDN> --user administrator --password <Password>
  3. After the partner vCenter is repointed to the primary vCenter node, restart all vCenter services with the command below:
    service-control --stop --all && service-control --start --all
  4. Verify that all services are back online, including the vsphere-ui service, and that the vSphere Client UI is accessible on both the primary and partner vCenter.
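
The final service check can also be scripted. The sketch below is a rough illustration under one stated assumption: that service-control --status prints a state heading such as "Running:" or "Stopped:" on its own line, followed by the service names on the next line(s). The function name ui_service_state is hypothetical; if the output format on a given build differs, the function prints "unknown".

```shell
#!/bin/bash
# Hypothetical helper: reads `service-control --status` output on stdin and
# prints the state heading under which vsphere-ui is listed.
# Assumed input shape (not verified against every build):
#   Running:
#    applmgmt lwsmd vsphere-ui ...
#   Stopped:
#    vmcam ...
ui_service_state() {
    awk '
        # A heading line such as "Running:" starts a new state section.
        /^[A-Za-z]+:[[:space:]]*$/ { section = $1; sub(/:$/, "", section); next }
        # Service names are whitespace-separated under the heading.
        { for (i = 1; i <= NF; i++)
              if ($i == "vsphere-ui") { print section; found = 1; exit } }
        END { if (!found) print "unknown" }'
}
```

Example usage: service-control --status --all | ui_service_state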