This article describes how to resolve the "Failed to run clean VUM DB" error that occurs when applying the SDDC Manager Configuration Drift bundle, so that the SDDC Manager upgrade can complete.
Symptoms:
Applying the Configuration Drift bundle update for SDDC Manager on VCF 4.x fails with the error: "Failed to run clean VUM DB"
In the SDDC Manager log /var/log/vmware/vcf/lcm/lcm.log:
Note: This log records the upgrade ID, which can be used to locate the third-party/migration logs. The screenshot shows both the path to those logs and the upgrade ID.
In sddcmanager_migration_upgrade.log, located at /var/log/vmware/vcf/lcm/thirdparty/upgrades/<upgrade id>/sddcmanager-migration-app/logs/sddcmanager_migration_upgrade.log:
YYYY-MM-DDTHH:MM:SS.713+0000 ERROR [vcf_migration,0000000000000000,0000] [c.v.e.s.o.model.error.ErrorFactory,pool-5-thread-15] [6469SC] FAILED_TO_RUN_CLEAN_VUM_DB Failed to run clean VUM DB.
com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to run clean VUM DB.
at com.vmware.vcf.migration.actions.workarounds.CleanVumDBAction.execute(CleanVumDBAction.java:315)
at com.vmware.vcf.migration.actions.workarounds.CleanVumDBAction.execute(CleanVumDBAction.java:40)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionState.invoke(FsmActionState.java:62)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:159)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:144)
at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.invokeMethod(ProcessingTaskSubscriber.java:400)
at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.processTask(ProcessingTaskSubscriber.java:520)
at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.accept(ProcessingTaskSubscriber.java:124)
at sun.reflect.GeneratedMethodAccessor598.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:87)
at com.google.common.eventbus.Subscriber$1.run(Subscriber.java:72)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.vmware.vapi.std.errors.ServiceUnavailable: ServiceUnavailable (com.vmware.vapi.std.errors.service_unavailable) => {
messages = [LocalizableMessage (com.vmware.vapi.std.localizable_message) => {
id = com.vmware.vapi.endpoint.cis.ServiceUnavailable,
defaultMessage = Service unavailable.,
args = [],
params = <null>,
localized = <null>
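The log lookup described above can be sketched in shell. This is a minimal illustration, not part of the official procedure: the function names `migration_log_path` and `has_clean_vum_db_error` are hypothetical, and the directory layout is assumed to match the path quoted in this article.

```shell
#!/bin/sh
# Hypothetical helper: builds the migration log path for a given upgrade ID.
# The directory layout is an assumption taken from the path quoted in this article.
migration_log_path() {
    upgrade_id="$1"
    printf '%s\n' "/var/log/vmware/vcf/lcm/thirdparty/upgrades/${upgrade_id}/sddcmanager-migration-app/logs/sddcmanager_migration_upgrade.log"
}

# Hypothetical helper: succeeds if the given log file contains the error code
# that appears in the excerpt above.
has_clean_vum_db_error() {
    grep -q 'FAILED_TO_RUN_CLEAN_VUM_DB' "$1"
}

# Example usage (replace the placeholder with the upgrade ID from lcm.log):
#   log="$(migration_log_path "<upgrade id>")"
#   has_clean_vum_db_error "$log" && echo "Clean VUM DB failure found in $log"
```

Run on the SDDC Manager appliance, the usage lines would confirm whether the failed upgrade hit this specific error before any changes are made to the vCenters.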
On the Management vCenter Server, in /var/log/vmware/applmgmt/applmgmt.log:
YYYY-MM-DDTHH:MM:SS PM CEST [2205]ERROR:vmware.appliance.vapi.auth:Could not parse HOK Token
Traceback (most recent call last):
File "/usr/lib/applmgmt/vapi/py/vmware/appliance/vapi/auth.py", line 243, in authenticate
username = token.username
File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 486, in username
return self.get_name_id().value
File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 939, in get_name_id
'//saml2:Subject/saml2:NameID', self.reference)
File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 477, in reference
self.validate()
File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 1169, in validate
reference = super(HolderOfKeyToken, self).validate()
File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 505, in validate
signing_chain = self.validate_certificate()
File "/usr/lib/applmgmt/lib/extensions/py/vmware/appliance/extensions/authentication/authentication_sso.py", line 685, in validate_certificate
'One or more certificates cannot be verified.')
vmware.appliance.extensions.authentication.authentication_sso.AuthenticationError: One or more certificates cannot be verified.
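To gauge how often the token validation failure occurs, the matching entries in applmgmt.log can be counted. The sketch below assumes the message text shown in the excerpt above. The function name `count_hok_token_errors` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the number of lines in the given log file that
# contain the "Could not parse HOK Token" message from the excerpt above.
count_hok_token_errors() {
    # grep -c prints the count of matching lines; it exits non-zero when
    # there are none, so the exit status is normalized here.
    grep -c 'Could not parse HOK Token' "$1" || true
}

# Example usage on the Management vCenter:
#   count_hok_token_errors /var/log/vmware/applmgmt/applmgmt.log
```

A non-zero count alongside "One or more certificates cannot be verified" is consistent with the symptom described here, but it does not by itself identify which certificate is at fault.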
Resolution:
1. Create offline snapshots of all vCenters in the SSO domain.
2. Renew the STS certificate on the Management vCenter. (See: "Signing certificate is not valid" or "No healthy upstream" error in vCenter Server Appliance)
3. Reset the solution user certificates on the Management vCenter using Option 6 in the Certificate Manager. (See: Regenerate vSphere 6.x, 7.x, and 8.0 certificates using self-signed VMCA)
This should also restart all services on that vCenter.
4. Restart the services on all remaining vCenters in the SSO domain.
Use the command:
service-control --stop --all && service-control --start --all
5. If the services on any vCenter fail to start at this stage, reset the solution user certificates on that vCenter (see Step 3 for reference).
6. Initiate the Configuration Drift Bundle update again via the SDDC Manager UI.
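After the restart in step 4, it can help to check which services did not come back before deciding whether step 5 applies. The sketch below filters the output of `service-control --status --all`. It assumes that output lists each state as a heading such as `Running:` or `Stopped:`, followed by an indented, space-separated list of service names. The function name `list_stopped_services` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `service-control --status --all` output on stdin
# and prints the services listed under the "Stopped:" heading, one per line.
# Assumes each state is a "<State>:" heading followed by indented service names.
list_stopped_services() {
    awk '
        /^[A-Za-z]+:[ \t]*$/ { in_stopped = ($1 == "Stopped:"); next }
        in_stopped { for (i = 1; i <= NF; i++) print $i }
    '
}

# Example usage on a vCenter:
#   service-control --status --all | list_stopped_services
```

Some services are stopped by default on a healthy vCenter, so compare the list against the expected baseline rather than treating every entry as a failure.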
MODERATE: This procedure resets the STS certificate, which changes the VMDIR database, and may also reset the solution user certificates on one or more vCenters. Offline snapshots of all vCenters in the SSO domain are required. Do not proceed without them.