Firstboot Error: Failed to start Workload Control Plane Service

Article ID: 309409

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • When upgrading vCenter Server from 6.5 or 6.7 U3 to 7.x, Stage 2 of the upgrade fails with this error:
    • An error occurred while starting service 'wcp'. Failed to start Workload Control Plane Service.
  • You may also see an error similar to the following:

An error occurred while invoking external command: 'Error 166 while adding user workload_storage_management-########-####-####-####-a52f0cb11e3d to SSO group "ServiceProviderUsers": dir-cli failed, error= Specified User or Group to be added doesn't exist. 100006 ' Failed to configure Workload Control Plane.
Resolution: This is an unrecoverable error, please retry install. If you encounter this error again, please search for these symptoms in the VMware Knowledge Base for any known issues and possible resolutions. If none can be found, collect a support bundle and open a support request.

 

Environment

  • VMware vCenter Server 7.0.x
  • VMware vCenter Server 6.7.x

Cause

  • The wcp.log file contains entries similar to the following:

Starting service process with pid: 9662.
time="yyyy-mm-ddThh:mm:ss.Z" level=info msg="Set debug mode to false"
time="yyyy-mm-ddThh:mm:ss.Z" level=error msg="Encountered error when opening file /etc/vmware/wcp/.storageUser: open /etc/vmware/wcp/.storageUser: no such file or directory"
time="yyyy-mm-ddThh:mm:ss.Z" level=error msg="Failed to parse Service Account data from file /etc/vmware/wcp/.storageUser: open /etc/vmware/wcp/.storageUser: no such file or directory"
time="yyyy-mm-ddThh:mm:ss.Z" level=fatal msg="Failed to load storage service account from /etc/vmware/wcp/.storageUser, verify that firstboot succeeded"

  • The upgrade fails because the .storageUser file is not created during firstboot, so the wcp service cannot load its storage service account.
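
As a rough illustration of matching this cause, the sketch below searches a log file for the fatal message quoted above. The function name has_storage_user_error is hypothetical, and the log path is passed in as an argument because its location on the appliance is not stated in this article.

```shell
# Hypothetical helper: succeed (exit 0) if the given log file contains the
# fatal "storage service account" message quoted in this article.
# $1 is the path to a copy of wcp.log (location is an assumption of the caller).
has_storage_user_error() {
    grep -q 'Failed to load storage service account' "$1"
}
```

For example, `has_storage_user_error /path/to/wcp.log && echo "matches this article"` prints the message only when the fatal entry is present.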

Resolution

To verify the issue:

  • Open an SSH session (for example, with PuTTY) to the newly deployed appliance.
  • Check whether the .storageUser file exists under /etc/vmware/wcp/ (in the example below, it exists):
    • ls -lah /etc/vmware/wcp/.storage*
    • -rw------- 1 wcp lwisRegReader 96 Mar 23 07:27 .storageUser
  • If the file does not exist, the workaround is to delete the stale SSO user on the source PSC.
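
The file check above can also be scripted. This is a minimal sketch, not a VMware-provided tool: the function name check_storage_user is hypothetical, and the directory defaults to the /etc/vmware/wcp path given in this article but can be overridden for a dry run.

```shell
# Hypothetical helper: report whether the .storageUser file exists.
# $1 (optional) is the directory to check; it defaults to the path
# named in this article.
check_storage_user() {
    wcp_dir="${1:-/etc/vmware/wcp}"
    if [ -f "$wcp_dir/.storageUser" ]; then
        echo "present"
    else
        echo "missing"
        return 1
    fi
}
```

Running `check_storage_user` with no arguments on the new appliance reports "missing" and returns a non-zero status when the file was not created.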

To fix the issue:

  • Deleting the SSO user requires the machine ID of the new VCSA. On the new VCSA, run:
    • /usr/lib/vmware-vmafd/bin/vmafd-cli get-machine-id --server-name localhost
  • The value returned is the vc-machine-id.
  • Log in to the source PSC and delete the user, substituting the vc-machine-id from the previous step:
    • /usr/lib/vmware-vmafd/bin/dir-cli user delete --account 'workload_storage_management-<vc-machine-id>'
  • Alternatively, log in to the PSC with JXplorer and, under 'ServicePrincipals', delete the 'wcp' and 'workload_storage_management' entries for non-upgraded nodes. These entries should not be present until 7.x; their existence indicates a previous upgrade attempt with an incorrect or partial rollback.
  • Proceed with the upgrade.
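
To reduce the chance of a typo when substituting the machine ID, the delete command can be assembled and reviewed before it is run. The sketch below only prints the command; the function names storage_account_name and print_delete_cmd are hypothetical, and the account name format and dir-cli path are taken from the steps above.

```shell
# Hypothetical helper: build the SSO account name from a machine ID,
# using the 'workload_storage_management-<vc-machine-id>' format.
storage_account_name() {
    printf 'workload_storage_management-%s\n' "$1"
}

# Hypothetical helper: print (but do not run) the delete command so it
# can be reviewed and then pasted on the source PSC.
print_delete_cmd() {
    machine_id="$1"
    if [ -z "$machine_id" ]; then
        echo "machine ID is empty" >&2
        return 1
    fi
    printf "/usr/lib/vmware-vmafd/bin/dir-cli user delete --account '%s'\n" \
        "$(storage_account_name "$machine_id")"
}
```

For example, pass the value returned by vmafd-cli get-machine-id on the new VCSA to `print_delete_cmd`, check the printed account name, and then run the printed command on the source PSC.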