Unable to update the configuration drift bundle on SDDC Manager in VCF 4.x: Failed to run clean VUM DB.

Article ID: 314631

Products

VMware Cloud Foundation

Issue/Introduction

This article explains how to resolve the "Failed to run clean VUM DB" error that occurs when applying the Configuration Drift bundle, so that the SDDC Manager upgrade can complete.

Symptoms:

Application of the Configuration drift bundle update for the SDDC Manager on VCF 4.X fails with the error: "Failed to run clean VUM DB"

In the SDDC Manager log /var/log/vmware/lcm/lcm.log:

Note: This log records the upgrade ID together with the path to the upgrade logs (shown in a screenshot in the original article). Use the upgrade ID to locate the third-party/migration logs.

In sddcmanager_migration_upgrade.log, located at /var/log/vmware/vcf/lcm/thirdparty/upgrades/<upgrade id>/sddcmanager-migration-app/logs/sddcmanager_migration_upgrade.log:

YYYY-MM-DDTHH:MM:SS.713+0000 ERROR [vcf_migration,0000000000000000,0000] [c.v.e.s.o.model.error.ErrorFactory,pool-5-thread-15]  [6469SC] FAILED_TO_RUN_CLEAN_VUM_DB Failed to run clean VUM DB.
com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to run clean VUM DB.
        at com.vmware.vcf.migration.actions.workarounds.CleanVumDBAction.execute(CleanVumDBAction.java:315)
        at com.vmware.vcf.migration.actions.workarounds.CleanVumDBAction.execute(CleanVumDBAction.java:40)
        at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionState.invoke(FsmActionState.java:62)
        at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:159)
        at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:144)
        at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.invokeMethod(ProcessingTaskSubscriber.java:400)
        at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.processTask(ProcessingTaskSubscriber.java:520)
        at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.accept(ProcessingTaskSubscriber.java:124)
        at sun.reflect.GeneratedMethodAccessor598.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:87)
        at com.google.common.eventbus.Subscriber$1.run(Subscriber.java:72)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Caused by: com.vmware.vapi.std.errors.ServiceUnavailable: ServiceUnavailable (com.vmware.vapi.std.errors.service_unavailable) => {
    messages = [LocalizableMessage (com.vmware.vapi.std.localizable_message) => {
    id = com.vmware.vapi.endpoint.cis.ServiceUnavailable,
    defaultMessage = Service unavailable.,
    args = [],
    params = <null>,
    localized = <null>
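
As a rough illustration of the log check above, the following shell sketch builds the migration log path from an upgrade ID and searches it for the FAILED_TO_RUN_CLEAN_VUM_DB entry. The function names and the UPGRADE_LOG_BASE variable are hypothetical helpers introduced here for illustration, not part of any VMware tooling. The log file name is assumed to be sddcmanager_migration_upgrade.log, as named in this article.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not VMware-provided tools.

# Base directory of the third-party upgrade logs, taken from the path in this article.
UPGRADE_LOG_BASE="${UPGRADE_LOG_BASE:-/var/log/vmware/vcf/lcm/thirdparty/upgrades}"

# Print the expected migration log path for the upgrade ID passed as $1.
migration_log_path() {
    printf '%s/%s/sddcmanager-migration-app/logs/sddcmanager_migration_upgrade.log\n' \
        "$UPGRADE_LOG_BASE" "$1"
}

# Print the matching lines and return 0 if the log records the VUM DB cleanup failure.
# Returns 1 if there is no match and 2 if the log cannot be read.
has_clean_vum_db_failure() {
    log_file=$(migration_log_path "$1")
    [ -r "$log_file" ] || { echo "Log not found or unreadable: $log_file" >&2; return 2; }
    grep -F 'FAILED_TO_RUN_CLEAN_VUM_DB' "$log_file"
}
```

After sourcing the functions on the SDDC Manager, has_clean_vum_db_failure can be called with the upgrade ID taken from lcm.log as its only argument.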


On the Management vCenter Server, in /var/log/vmware/applmgmt/applmgmt.log:

YYYY-MM-DDTHH:MM:SS PM CEST [2205]ERROR:vmware.appliance.vapi.auth:Could not parse HOK Token
Traceback (most recent call last):
  File "/usr/lib/applmgmt/vapi/py/vmware/appliance/vapi/auth.py", line 243, in authenticate
    username = token.username
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 486, in username
    return self.get_name_id().value
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 939, in get_name_id
    '//saml2:Subject/saml2:NameID', self.reference)
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 477, in reference
    self.validate()
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 1169, in validate
    reference = super(HolderOfKeyToken, self).validate()
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 505, in validate
    signing_chain = self.validate_certificate()
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 685, in validate_certificate
    'One or more certificates cannot be verified.')
vmware.appliance.extensions.authentication.authentication_sso.AuthenticationError: One or more certificates cannot be verified. 
YYYY-MM-DDTHH:MM:SS PM CEST [2205]ERROR:vmware.appliance.vapi.auth:Could not parse HOK Token
Traceback (most recent call last):
  File "/usr/lib/applmgmt/vapi/py/vmware/appliance/vapi/auth.py", line 243, in authenticate
    username = token.username
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 486, in username
    return self.get_name_id().value
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 939, in get_name_id
    '//saml2:Subject/saml2:NameID', self.reference)
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 477, in reference
    self.validate()
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 1169, in validate
    reference = super(HolderOfKeyToken, self).validate()
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 505, in validate
    signing_chain = self.validate_certificate()
  File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 685, in validate_certificate
    'One or more certificates cannot be verified.')
vmware.appliance.extensions.authentication.authentication_sso.AuthenticationError: One or more certificates cannot be verified.
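
To gauge how often this symptom occurs, a small shell sketch such as the one below can count the certificate verification failures in the applmgmt log. The function name and the APPLMGMT_LOG variable are hypothetical and only illustrate the check. The search string is taken from the final line of the traceback above.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not a VMware-provided tool.

# Default log location, taken from the path in this article.
APPLMGMT_LOG="${APPLMGMT_LOG:-/var/log/vmware/applmgmt/applmgmt.log}"

# Print the number of log lines that report the certificate verification failure.
# An optional first argument overrides the log file to inspect.
count_hok_cert_failures() {
    log_file="${1:-$APPLMGMT_LOG}"
    [ -r "$log_file" ] || { echo "Log not found or unreadable: $log_file" >&2; return 2; }
    # grep -c prints the count but exits 1 when the count is 0, so normalize the status.
    grep -c -F 'AuthenticationError: One or more certificates cannot be verified' "$log_file" || true
}
```

A non-zero count on the Management vCenter is consistent with the symptom described here, although it does not by itself confirm the cause.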

Environment

VMware Cloud Foundation 4.x

Cause

  • As part of the Configuration Drift bundle, SDDC Manager cleans up some entries in the VUM DB via an API call.
  • To make that API/SDK connection, SDDC Manager uses an API on the vCenter that passes through the applmgmt service.
  • The applmgmt service on the vCenter fails because of an issue with the STS certificate in the VMDIR DB, likely due to unexpected entries in the STS certificate.

Resolution

1. Create offline snapshots of all vCenters.

2. Renew the STS certificate on the Management vCenter. (See KB: "Signing certificate is not valid" or "No healthy upstream" error in vCenter Server Appliance)

3. Reset the solution user certificates on the Management vCenter using Option 6 in the Certificate Manager. (See KB: Regenerate vSphere 6.x, 7.x, and 8.0 certificates using self-signed VMCA)
This should also restart all services on that vCenter.

4. Restart the services on all remaining vCenters within the SSO domain.
Use the command:

service-control --stop --all && service-control --start --all

5. If the services on any vCenter fail to start at this stage, reset the solution user certificates on that vCenter (see Step 3 for reference).

6. Initiate the Configuration Drift Bundle update again via the SDDC Manager UI.
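
The service restart in step 4 can also be sketched as a small wrapper that reports which phase failed and lists service status afterwards, which helps in deciding whether step 5 applies. The function name and the SERVICE_CONTROL override variable are hypothetical and exist only for this illustration. The sketch relies only on the --stop, --start, and --status options of service-control with --all.

```shell
#!/bin/sh
# Sketch only: hypothetical wrapper around the command from step 4.

# Command to invoke; can be overridden, for example with a stub for a dry run.
SERVICE_CONTROL="${SERVICE_CONTROL:-service-control}"

restart_all_services() {
    # Stop first and attempt the start only if the stop succeeded,
    # mirroring the && in the command from step 4.
    "$SERVICE_CONTROL" --stop --all || { echo "Stopping services failed." >&2; return 1; }
    "$SERVICE_CONTROL" --start --all || { echo "Starting services failed; see step 5." >&2; return 1; }
    # List the service states so that any service left stopped is visible.
    "$SERVICE_CONTROL" --status --all
}
```

Run restart_all_services on each remaining vCenter in turn. A failure in the start phase corresponds to the situation described in step 5.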

Additional Information

Impact/Risks:

MODERATE: The process involves resetting the STS certificate, which is a change in the VMDIR DB. Solution user certificates on one or more vCenters may also be reset. Offline snapshots of all vCenters in the SSO domain are required. Do not proceed without them.