Upgrade to vCenter Server 7.0 fails with "WCP service installation failed"

Article ID: 326206

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
  • In the WCP firstboot log (/var/log/firstboot/wcp-firstboot.py_xxxxx_stdout.log) you see this error:
    2021-04-09T10:12:41.751Z ERROR wcp-firstboot Unexpected error creating ServiceAccount {messages : [LocalizableMessage(id='com.vmware.vcenter.svcaccountmgmt.error', default_message='Exception found (Internal Server Error, VMware directory error[9127])', args=['Internal Server Error, VMware directory error[9127]'], params=None, localized=None)], data : None, error_type : None}
    2021-04-09T10:12:41.751Z ERROR wcp-firstboot Failed to create service account for workload storage
    Traceback (most recent call last):
    File "/usr/lib/vmware-wcp/py-modules/wcpconfigure.py", line 300, in _create_storage_user
    password = svcacctmgmt_client.create_svc_account(self._user_name)
    File "/usr/lib/vmware-wcp/py-modules/svcacctmgmt.py", line 90, in create_svc_account
    raise er
    File "/usr/lib/vmware-wcp/py-modules/svcacctmgmt.py", line 84, in create_svc_account
    svcacct_pwd_out = svcacct_client.create(create_spec)
    File "/usr/lib/vmware-wcp/py-modules/vapi-bindings/com/vmware/vcenter/svcaccountmgmt_client.py", line 368, in create
    'create_spec': create_spec,
    File "/usr/lib/vmware-vapi/lib/python/vapi_runtime-2.100.0-py2.py3-none-any.whl/vmware/vapi/bindings/stub.py", line 345, in _invoke
    return self._api_interface.native_invoke(ctx, _method_name, kwargs)
    File "/usr/lib/vmware-vapi/lib/python/vapi_runtime-2.100.0-py2.py3-none-any.whl/vmware/vapi/bindings/stub.py", line 298, in native_invoke
    self._rest_converter_mode)
    com.vmware.vapi.std.errors_client.Error: {messages : [LocalizableMessage(id='com.vmware.vcenter.svcaccountmgmt.error', default_message='Exception found (Internal Server Error, VMware directory error[9127])', args=['Internal Server Error, VMware directory error[9127]'], params=None, localized=None)], data : None, error_type : None}
  • When browsing the VMDirectory (for example with JXplorer) under "Servers" and under "Domain Controllers", a different name is listed instead of the actual vCenter Server FQDN:
    cn=localhost.localdom,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
    cn=localhost.localdom,ou=Domain Controllers,dc=vsphere,dc=local
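
A quick way to confirm the first symptom is to search the firstboot log for the directory error code. The following is a minimal sketch, not an official check; the helper name has_vmdir_9127 is hypothetical, and the file name has to be adapted because the xxxxx part differs per system.

```shell
# Hypothetical helper: succeeds if the given log file contains the
# "VMware directory error[9127]" message quoted in the symptom above.
has_vmdir_9127() {
    grep -Fq 'VMware directory error[9127]' "$1"
}

# Usage (the glob stands in for the xxxxx part of the file name):
# for f in /var/log/firstboot/wcp-firstboot.py_*_stdout.log; do
#     has_vmdir_9127 "$f" && echo "error 9127 found in $f"
# done
```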


Environment

VMware vCenter Server 7.0.x

Cause

This issue has been seen on vCenter Servers whose FQDN was changed after deployment, either because of an actual name change or because vCenter was originally deployed without a DNS name.

Resolution

Note: please ensure that you have created a fresh backup or an offline snapshot of the vCenter Server appliance before attempting the steps below. If the vCenter Server you plan on running these steps on is part of a Linked Mode setup, please be aware that you need to create the backup or offline snapshots for every node.

To fix this issue, the following entries in the VMDirectory need to be replaced:

cn=<old wrong vCenter FQDN>,ou=Domain Controllers,dc=vsphere,dc=local
cn=Replication Agreements,cn=<old wrong vCenter FQDN>,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
cn=<old wrong vCenter FQDN>,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local

 

In your vCenter, <old wrong vCenter FQDN> will be the actual old FQDN of the vCenter or, if vCenter was originally deployed with an IP address only, localhost.localdom. In the steps below, localhost.localdom is used as an example.
As part of the process you also need to change the password for the default SSO administrator account administrator@vsphere.local to a different, temporary one.
 

1. Collecting the required information

Fixing the issue requires some information that is best collected live in the environment.
Some of this data can also be found in the vCenter logs, but certain values, such as the dcAccountPassword, are not collected as part of a log bundle or in an LDIF export from VMDir.

You will need the following 7 values:

  • current (correct) vCenter FQDN
  • dcAccountPassword
  • vmwLDUGuid
  • vmwMachineGUID
  • siteGUID
  • vmwPlatformServicesControllerVersion
  • InvocationId


The FQDN is the name the vCenter Server currently uses; the examples below use vcsa.domain.local. The other values are collected as follows:
 

1.1. dcAccountPassword, vmwLDUGuid, vmwMachineGUID and SiteGuid

There are multiple ways to get the values for these 4, but the easiest way is to look in the Likewise registry in HKEY_THIS_MACHINE\Services\vmdir. You can run the following command to get all values in this branch:

# /opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\vmdir]' | grep -E "dcAccountPassword|LduGuid|MachineGuid|SiteGuid"

The output will look like this:

# /opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\vmdir]' | grep -E "dcAccountPassword|LduGuid|MachineGuid|SiteGuid"
+  "dcAccountPassword"    REG_SZ          "xtJ_NQ}+lsT&4/Fw?}e@"
+  "LduGuid"              REG_SZ          "82e32d15-1357-450a-abc2-b5b13788e612"
+  "MachineGuid"          REG_SZ          "36944e03-f965-487c-881e-b3e05786df54"
+  "SiteGuid"             REG_SZ          "19ec713e-2bc3-42b0-ac85-f488da40fc6a"
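
If the values are to be reused later, they can be pulled out of that output with a small filter. The sketch below assumes the REG_SZ line layout shown in the sample output above; reg_value is a hypothetical helper name, not part of the Likewise tooling.

```shell
# Hypothetical helper: print the value of one key from the output of
# `lwregshell list_values`, read on stdin. Assumes lines laid out like
# the sample above:  +  "Name"   REG_SZ   "value"
reg_value() {
    sed -n "s/^+*[[:space:]]*\"$1\"[[:space:]]*REG_SZ[[:space:]]*\"\(.*\)\"[[:space:]]*\$/\1/p"
}

# Usage on the appliance:
# /opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\vmdir]' | reg_value LduGuid
```

Because the password can contain shell metacharacters, keep it in single quotes whenever it is pasted into a command or script.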


You can also collect the 3 IDs (but not the dcAccountPassword) using the vmafd-cli tool:

# /usr/lib/vmware-vmafd/bin/vmafd-cli get-ldu --server-name localhost

# /usr/lib/vmware-vmafd/bin/vmafd-cli get-machine-id --server-name localhost

# /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-guid --server-name localhost

 

1.2. vmwPlatformServicesControllerVersion

To get the PSC Version, run this command (replace <sso-password> with the correct password for the SSO administrator account):

# /opt/likewise/bin/ldapsearch -b "cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local" -D "cn=Administrator,cn=Users,dc=vsphere,dc=local" -w '<sso-password>' | grep vmwPlatformServicesControllerVersion

The output will look like this:

# /opt/likewise/bin/ldapsearch -b "cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local" -D "cn=Administrator,cn=Users,dc=vsphere,dc=local" -w '<sso-password>' | grep vmwPlatformServicesControllerVersion
vmwPlatformServicesControllerVersion: 6.7.0

In this example, the PSC version is 6.7.0.
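
Instead of grep, which prints the whole line, the bare value can be extracted with sed. This is only a sketch: ldap_attr is a hypothetical helper name, and it assumes the attribute is printed in plain "name: value" form as in the output above. It does not handle base64-encoded values, which LDIF writes as "name:: ...".

```shell
# Hypothetical helper: print the value of a plain "name: value" attribute
# from ldapsearch output read on stdin.
ldap_attr() {
    sed -n "s/^$1: //p"
}

# Usage (same ldapsearch command as above, with the filter appended):
# /opt/likewise/bin/ldapsearch -b "cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local" \
#     -D "cn=Administrator,cn=Users,dc=vsphere,dc=local" -w '<sso-password>' \
#     | ldap_attr vmwPlatformServicesControllerVersion
```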
 

1.3. InvocationId

Again, there are 2 ways to find this value.
The first is to search the logs once more. The InvocationID is recorded in vmdird-syslog.log, so you can grep for it:

$ grep InvocationID vmdird-syslog.log
2021-04-09T10:02:45.696447+00:00 info vmdird t@139654174050112: Server ID (1), InvocationID (909bbff5-902c-4247-8c56-c9f5baaa9c9c)
2021-04-09T10:02:52.208803+00:00 info vmdird t@139948118660928: Server ID (1), InvocationID (909bbff5-902c-4247-8c56-c9f5baaa9c9c)
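
To get only the ID rather than the full log lines, the match can be narrowed with sed. The sketch below assumes the "InvocationID (<id>)" layout of the lines shown above and prints the most recent match; invocation_id_from_log is a hypothetical helper name.

```shell
# Hypothetical helper: print the InvocationID from the last matching line
# of a vmdird syslog read on stdin.
invocation_id_from_log() {
    sed -n 's/.*InvocationID (\([0-9a-fA-F-]*\)).*/\1/p' | tail -n 1
}

# Usage:
# invocation_id_from_log < vmdird-syslog.log
```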


The second option is to look in VMDir once more, this time in
cn=localhost.localdom,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local (Again replace <sso-password> with the correct password for the SSO administrator account):

# /opt/likewise/bin/ldapsearch -b "cn=localhost.localdom,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local" -D "cn=Administrator,cn=Users,dc=vsphere,dc=local" -w '<sso-password>' | grep invocationId

The output will look like this: 

# /opt/likewise/bin/ldapsearch -b "cn=localhost.localdom,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local" -D "cn=Administrator,cn=Users,dc=vsphere,dc=local" -w '<sso-password>' | grep invocationId
invocationId: 909bbff5-902c-4247-8c56-c9f5baaa9c9c

In this example, the InvocationId is 909bbff5-902c-4247-8c56-c9f5baaa9c9c.


So now we have all the information we need:

Option                                  Value
current (correct) vCenter FQDN          vcsa.domain.local
dcAccountPassword                       xtJ_NQ}+lsT&4/Fw?}e@
vmwLDUGuid                              82e32d15-1357-450a-abc2-b5b13788e612
vmwMachineGUID                          36944e03-f965-487c-881e-b3e05786df54
SiteGuid                                19ec713e-2bc3-42b0-ac85-f488da40fc6a
vmwPlatformServicesControllerVersion    6.7.0
InvocationId                            909bbff5-902c-4247-8c56-c9f5baaa9c9c
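
Before the collected values go into the LDIF files, it can help to check that each ID was copied completely. The sketch below assumes that all four IDs follow the 8-4-4-4-12 hexadecimal shape of the example values in this article; is_guid is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the argument looks like a GUID in the
# 8-4-4-4-12 hexadecimal form used by the example values in this article.
is_guid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Example values from this article (LDU, machine, site and invocation ID).
for value in \
    82e32d15-1357-450a-abc2-b5b13788e612 \
    36944e03-f965-487c-881e-b3e05786df54 \
    19ec713e-2bc3-42b0-ac85-f488da40fc6a \
    909bbff5-902c-4247-8c56-c9f5baaa9c9c
do
    is_guid "$value" || echo "unexpected format: $value" >&2
done
```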
 

 

2. Creating the LDIF files for the replacement

The best way to apply the required changes to VMDir is to use purpose-built LDIF files that contain the information to add or remove. It is best to prepare these files offline, without the pressure of a running remote session.

We will need 3 of these: 2 to add information and a final one to remove the faulty FQDN entries. We will call them add1.ldif, add2.ldif and remove.ldif.
The content of each file is shown below. The example values in them (FQDN, GUIDs, PSC version, InvocationId and dcAccountPassword) need to be replaced with the data collected in step 1. Also, when creating the files, please ensure that none of the lines have any additional space characters at their end. Empty lines should only be empty lines (just a line break).

add1.ldif

version: 1
dn: cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local
vmwLDUGuid: 82e32d15-1357-450a-abc2-b5b13788e612
objectClass: computer
objectClass: user
objectClass: organizationalPerson
objectClass: person
objectClass: top
cn: vcsa.domain.local
sAMAccountName: vcsa.domain.local
siteGUID: 19ec713e-2bc3-42b0-ac85-f488da40fc6a
userPrincipalName: vcsa.domain.local@VSPHERE.LOCAL
vmwMachineGUID: 36944e03-f965-487c-881e-b3e05786df54
vmwPlatformServicesControllerVersion: 6.7.0
userpassword: xtJ_NQ}+lsT&4/Fw?}e@

version: 1
dn: cn=vcsa.domain.local,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
objectClass: vmwDirServer
objectClass: top
cn: vcsa.domain.local
replInterval: 30
replPageSize: 1000
invocationId: 909bbff5-902c-4247-8c56-c9f5baaa9c9c
serverId: 2

dn: cn=Replication Agreements,cn=vcsa.domain.local,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
objectClass: container
objectClass: top
cn: Replication Agreements
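
To avoid copy-and-paste mistakes, add1.ldif can also be generated from shell variables. The following is a minimal sketch that mirrors the file content shown above line for line; the variable names and the helper write_add1_ldif are hypothetical, and the values are the example values from step 1.

```shell
# Example values from this article; replace them with your own.
FQDN='vcsa.domain.local'
LDU_GUID='82e32d15-1357-450a-abc2-b5b13788e612'
MACHINE_GUID='36944e03-f965-487c-881e-b3e05786df54'
SITE_GUID='19ec713e-2bc3-42b0-ac85-f488da40fc6a'
PSC_VERSION='6.7.0'
INVOCATION_ID='909bbff5-902c-4247-8c56-c9f5baaa9c9c'
# Single quotes keep the special characters in the password literal.
DC_PASSWORD='xtJ_NQ}+lsT&4/Fw?}e@'

# Hypothetical helper: write add1.ldif to the path given as $1.
write_add1_ldif() {
    cat > "$1" <<EOF
version: 1
dn: cn=${FQDN},ou=Domain Controllers,dc=vsphere,dc=local
vmwLDUGuid: ${LDU_GUID}
objectClass: computer
objectClass: user
objectClass: organizationalPerson
objectClass: person
objectClass: top
cn: ${FQDN}
sAMAccountName: ${FQDN}
siteGUID: ${SITE_GUID}
userPrincipalName: ${FQDN}@VSPHERE.LOCAL
vmwMachineGUID: ${MACHINE_GUID}
vmwPlatformServicesControllerVersion: ${PSC_VERSION}
userpassword: ${DC_PASSWORD}

version: 1
dn: cn=${FQDN},cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
objectClass: vmwDirServer
objectClass: top
cn: ${FQDN}
replInterval: 30
replPageSize: 1000
invocationId: ${INVOCATION_ID}
serverId: 2

dn: cn=Replication Agreements,cn=${FQDN},cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
objectClass: container
objectClass: top
cn: Replication Agreements
EOF
}

write_add1_ldif ./add1.ldif
```

The here-document expands the variables once and does not reinterpret their contents, so characters such as & or } in the password end up in the file unchanged.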

 

add2.ldif

dn: cn=DCAdmins,cn=Builtin,dc=vsphere,dc=local
changetype: modify
add: member
member: cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local

dn: cn=vsphere.local,cn=IdentityProviders,cn=vsphere.local,cn=Tenants,cn=IdentityManager,cn=Services,dc=vsphere,dc=local
changetype: modify
replace: vmwSTSUserName
vmwSTSUserName: cn=vcsa.domain.local@vsphere.local

dn: cn=DSE Root
changetype: modify
replace: serverName
serverName: cn=vcsa.domain.local,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local

dn: cn=DSE Root
changetype: modify
replace: vmwDCAccountDN
vmwDCAccountDN: cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local

dn: cn=DSE Root
changetype: modify
replace: vmwDCAccountUPN
vmwDCAccountUPN: vcsa.domain.local@VSPHERE.LOCAL


remove.ldif

cn=localhost.localdom,ou=Domain Controllers,dc=vsphere,dc=local
cn=Replication Agreements,cn=localhost.localdom,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
cn=localhost.localdom,cn=Servers,cn=local,cn=Sites,cn=Configuration,dc=vsphere,dc=local
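
Since trailing spaces are easy to miss in an editor, the three files can be checked before they are copied to the appliance. The sketch below uses a hypothetical helper, check_no_trailing_space. It also flags carriage returns, which files prepared on Windows may contain.

```shell
# Hypothetical helper: lists offending lines and fails if any line of the
# given file ends in whitespace (space, tab or carriage return).
check_no_trailing_space() {
    if grep -n '[[:space:]]$' "$1"; then
        echo "trailing whitespace found in $1" >&2
        return 1
    fi
}

# Usage:
# for f in add1.ldif add2.ldif remove.ldif; do
#     check_no_trailing_space "$f" || echo "fix $f before using it"
# done
```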


 

3. The actual replacement

Now that you have created the 3 LDIF files, you can use them. Again, please ensure that you have a fresh backup or offline-snapshot of the vCenter you are going to do this on, as well as of any of its Linked Mode partners.

Start by copying the 3 files to /tmp/ in the vCenter Server Appliance.
Then run the following commands, replacing <sso-admin-password> with your current SSO admin password. This is not the one you have provided in add2.ldif, but the one you normally use.
Also replace vcsa.domain.local with the current FQDN of your vCenter Server Appliance:
# /opt/likewise/bin/ldapadd -x -D cn=Administrator,cn=Users,dc=vsphere,dc=local -w "<sso-admin-password>" -f /tmp/add1.ldif
 
# /usr/lib/vmware-vmdir/bin/vdcsrp -D [email protected] -W 'xtJ_NQ}+lsT&4/Fw?}e@'
 
# /opt/likewise/bin/lwregshell set_value '[HKEY_THIS_MACHINE\Services\vmdir]' "dcAccountDN" "cn=vcsa.domain.local,ou=Domain Controllers,dc=vsphere,dc=local"
 
# /opt/likewise/bin/lwregshell set_value '[HKEY_THIS_MACHINE\Services\vmdir]' "dcAccount" "vcsa.domain.local"
 
# /opt/likewise/bin/ldapmodify -x -D cn=Administrator,cn=Users,dc=vsphere,dc=local -w "<sso-admin-password>" -f /tmp/add2.ldif
 
# /opt/likewise/bin/ldapdelete -x -D cn=Administrator,cn=Users,dc=vsphere,dc=local -w "<sso-admin-password>" -f /tmp/remove.ldif

As soon as all of the commands have been successfully run, restart the vCenter services with:
# service-control --stop --all && service-control --start --all

At this point all that is left is rebuilding the service registrations.
The way to do this is by using the lsdoctor tool, which you can download from https://ikb.vmware.com/s/article/80469

Download the tool to the appliance as described in that article, then run it with the --rebuild (or -r) option: "lsdoctor --rebuild". This presents a multiple-choice menu.

In the menu, choose the option "2. Replace all services with new services.".

That's it. As soon as the last step has finished, verify that vCenter is up and running. Once this has been confirmed, proceed with the upgrade of the vCenter Server Appliance to 7.x.