An SDDC update from 5.2.1.0 to 5.2.1.1 or 5.2.1.2 fails on "SDDC Manager Deployment Drift".
As part of the SDDC update workflow, an automatic reboot of SDDC Manager is initiated. In this example, /var/log/vmware/vcf/commonsvcs/commonsvcs.log reports a failure to start the Domain Manager service:
YYYY-MM-DD HH:MIN INFO [common,0000000000000000,0000] [com.zaxxer.hikari.HikariDataSource,SpringApplicationShutdownHook] HikariPool-1 - Shutdown initiated...
YYYY-MM-DD HH:MIN INFO [common,0000000000000000,0000] [com.zaxxer.hikari.HikariDataSource,SpringApplicationShutdownHook] HikariPool-1 - Shutdown completed.
YYYY-MM-DD HH:MIN INFO [common,0000000000000000,0000] [c.v.e.s.c.c.l.l.LogLayoutWithHeader,main] THIS LOG FILE IS MANAGED BY SDDC MANAGER
YYYY-MM-DD HH:MIN INFO [common,0000000000000000,0000] [o.h.validator.internal.util.Version,background-preinit] HV000001: Hibernate Validator 8.0.0.Final
YYYY-MM-DD HH:MIN INFO [common,0000000000000000,0000] [c.v.e.s.c.c.u.ComponentUpgradeRunner,main] Checking if a component upgrade is needed
YYYY-MM-DD HH:MIN INFO [common,0000000000000000,0000] [c.v.e.s.i.s.VcfServiceInventoryServiceImpl,main] Get all VcfServices
YYYY-MM-DD HH:MIN ERROR [common,0000000000000000,0000] [c.v.e.s.i.s.VcfServiceInventoryServiceImpl,main] Error while trying to retrieve service http://127.0.0.1/domainmanager/about status, 502 Bad Gateway: "<html><EOL><EOL><head><title>502 Bad Gateway</title></head><EOL><EOL><body><EOL><EOL><center><h1>502 Bad Gateway</h1></center><EOL><EOL><hr><center>nginx</center><EOL><EOL></body><EOL><EOL></html><EOL><EOL>"
After the reboot, /var/log/vmware/vcf/domainmanager/domainmanager.log reports the service failure with the error 'Failed to update VCF Services and Photon rpms in SDDC Manager':
YYYY-MM-DD HH:MIN: INFO: Updated /var/log/vmware/vcf/lcm/thirdparty/upgrades/a5##-##-##-##-##760/vcf-platform/upgrade/vcf_platform_upgrade.status status file with data OrderedDict([('upgradeId', 'a5##-##-##-##-##760'), ('resourceId', 'ac3##-##-##-##-##bca'), ('upgradeStatusCode', 'INPROGRESS'), ('progress', 70), (' error', OrderedDict([('errorCode', None), ('errorDescription', None)])), ('startTime', 1736376825), ('endTime', 1736377186)])
YYYY-MM-DD HH:MIN: INFO: Rebooting SDDC Manager
YYYY-MM-DD HH:MIN: INFO: Execute cmd: sh -x /var/log/vmware/vcf/lcm/thirdparty/bundles/d5##-##-##-##-##600/thirdparty/reboot_script.sh &
YYYY-MM-DD HH:HH:MIN: INFO: http://localhost/domainmanager/about is not accessible, retry after 10 seconds
YYYY-MM-DD HH:HH:MIN: INFO: URL: http://localhost/domainmanager/about
YYYY-MM-DD HH:HH:MIN: ERROR: RC: , OUT: ERR: Expecting value: line 1 column 1 (char 0)
YYYY-MM-DD HH:HH:MIN: ERROR: Failed to update VCF Services and Photon rpms in SDDC Manager
YYYY-MM-DD HH:HH:MIN: INFO:
YYYY-MM-DD HH:HH:MIN: INFO: RC: 1, OUT:
YYYY-MM-DD HH:HH:MIN: INFO: ERR: Traceback (most recent call last):
File "/var/log/vmware/vcf/lcm/thirdparty/bundles/d5##-##-##-##-##600/thirdparty/vcf-platform-upgrade/bin/vcf_platform_upgrade.py.copy", line 521, in <module>
wrapper.update_status(return_code=1, status='COMPLETED_WITH_FAILURE',
File "/var/log/vmware/vcf/lcm/thirdparty/bundles/d5##-##-##-##-##600/thirdparty/vcf-platform-upgrade/bin/../../wrapper.py", line 187, in update_status
raise Exception
Like Domain Manager, the Operations Manager service also fails to start automatically. The error appears in /var/log/vmware/vcf/operationsmanager/operationsmanager.log:
YYYY-MM-DD HH:MIN INFO [vcf_om,0000000000000000,0000] [com.zaxxer.hikari.HikariDataSource,SpringApplicationShutdownHook] HikariPool-1 - Shutdown initiated...
YYYY-MM-DD HH:MIN DEBUG [vcf_om,677ef1873a72dd65052e80371dfed741,4fa6] [c.v.v.p.v.u.ValidateCredentialsTranslationTaskExecutor,om-exec-1] Exception occurred during validate credentials translation task : Error creating bean with name 'liquibase': Singleton bean creation not allowed while singletons of this factory are in destruction (Do not request a bean from a BeanFactory in a destroy method implementation!)
org.springframework.beans.factory.BeanCreationNotAllowedException: Error creating bean with name 'liquibase': Singleton bean creation not allowed while singletons of this factory are in destruction (Do not request a bean from a BeanFactory in a destroy method implementation!)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:220)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:324)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:200)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:313)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:200)
at org.springframework.beans.factory.support.DefaultListableBeanFactory$1.orderedStream(DefaultListableBeanFactory.java:471)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.detectPersistenceExceptionTranslators(PersistenceExceptionTranslationInterceptor.java:167)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:149)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:135)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:244)
at jdk.proxy2/jdk.proxy2.$Proxy295.findByStatusOrderByCreationTimeAsc(Unknown Source)
at com.vmware.vcf.passwordmanager.validation.utils.ValidateCredentialsTranslationTaskExecutor$1.call(ValidateCredentialsTranslationTaskExecutor.java:53)
at com.vmware.vcf.passwordmanager.validation.utils.ValidateCredentialsTranslationTaskExecutor$1.call(ValidateCredentialsTranslationTaskExecutor.java:47)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at com.vmware.vcf.common.tracing.TraceRunnable.run(TraceRunnable.java:59)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Permission-related errors can be found in the <service name>.out file. One such example is Operations Manager, in /var/log/vmware/vcf/operationsmanager/operationsmanager.out:
Caused by: java.io.FileNotFoundException: /etc/vmware/vcf/operationsmanager/application.properties (Permission denied)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:111)
at java.base/sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:86)
at java.base/sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:189)
at org.springframework.core.io.UrlResource.getInputStream(UrlResource.java:231)
at org.springframework.boot.origin.OriginTrackedResource.getInputStream(OriginTrackedResource.java:61)
at org.springframework.boot.env.OriginTrackedPropertiesLoader$CharacterReader.<init>(OriginTrackedPropertiesLoader.java:205)
at org.springframework.boot.env.OriginTrackedPropertiesLoader.load(OriginTrackedPropertiesLoader.java:80)
at org.springframework.boot.env.OriginTrackedPropertiesLoader.load(OriginTrackedPropertiesLoader.java:66)
at org.springframework.boot.env.PropertiesPropertySourceLoader.loadProperties(PropertiesPropertySourceLoader.java:70)
at org.springframework.boot.env.PropertiesPropertySourceLoader.load(PropertiesPropertySourceLoader.java:49)
at org.springframework.boot.context.config.StandardConfigDataLoader.load(StandardConfigDataLoader.java:54)
at org.springframework.boot.context.config.StandardConfigDataLoader.load(StandardConfigDataLoader.java:36)
at org.springframework.boot.context.config.ConfigDataLoaders.load(ConfigDataLoaders.java:96)
at org.springframework.boot.context.config.ConfigDataImporter.load(ConfigDataImporter.java:132)
at org.springframework.boot.context.config.ConfigDataImporter.resolveAndLoad(ConfigDataImporter.java:87)
... 29 common frames omitted
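To look for the same signature across all services at once, the .out files can be searched for "Permission denied". The following is a minimal sketch, assuming every service writes its .out file one directory below /var/log/vmware/vcf as in the example above; find_permission_errors is a hypothetical helper name, not an SDDC Manager command.

```shell
#!/bin/sh
# find_permission_errors [LOG_ROOT]
# Hypothetical helper: prints every "Permission denied" line found in the
# <service name>.out files one directory below LOG_ROOT.
# LOG_ROOT defaults to the log path used in this article (an assumption
# about where all services keep their .out files).
find_permission_errors() {
    log_root="${1:-/var/log/vmware/vcf}"
    # -H prefixes each match with the file name, -n with the line number.
    # Errors (for example, no matching .out files) are suppressed.
    grep -Hn 'Permission denied' "$log_root"/*/*.out 2>/dev/null
}

# Example, run on the SDDC Manager appliance:
#   find_permission_errors
```

The function returns a non-zero exit code when nothing matches, so it can also be used in a condition.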
You can also confirm that the services fail to start by checking their status, and confirm the read/write permissions of the services by listing their files:
# systemctl status domainmanager
* domainmanager.service - VMware Cloud Foundation Domain Manager
Loaded: loaded (/etc/systemd/system/domainmanager.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since YYYY-MM-DD HH:MIN UTC; ##s ago
Main PID: ## (code=exited, status=1/FAILURE)
# systemctl status operationsmanager
* operationsmanager.service - VMware Cloud Foundation Operations Manager
Loaded: loaded (/etc/systemd/system/operationsmanager.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since YYYY-MM-DD HH:MIN UTC; ##s ago
Main PID: ## (code=exited, status=1/FAILURE)
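The ownership check can also be scripted. The following is a minimal sketch, assuming GNU stat is available on the SDDC Manager appliance; check_owner is a hypothetical helper name, and the expected owners in the example calls are taken from the chown commands in the Workaround section.

```shell
#!/bin/sh
# check_owner FILE EXPECTED_USER
# Hypothetical helper: compares the owning user of FILE with EXPECTED_USER.
# Returns 0 on match, 1 on mismatch, 2 if FILE cannot be inspected.
check_owner() {
    file="$1"
    expected="$2"
    # %U prints the owning user name (GNU coreutils stat).
    actual=$(stat -c '%U' "$file" 2>/dev/null) || {
        echo "ERROR cannot stat $file"
        return 2
    }
    if [ "$actual" = "$expected" ]; then
        echo "OK $file is owned by $actual"
    else
        echo "MISMATCH $file is owned by $actual, expected $expected"
        return 1
    fi
}

# Example, run as root on the SDDC Manager appliance:
#   check_owner /etc/vmware/vcf/domainmanager/application.properties vcf_domainmanager
#   check_owner /etc/vmware/vcf/operationsmanager/application.properties vcf_operationsmanager
```

A MISMATCH line that reports vcf_sos as the owner would be consistent with the issue described in this article.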
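Environment:
VMware Cloud Foundation 5.2.x
Cause:
The ownership of the /etc/vmware/vcf/domainmanager/application.properties and /etc/vmware/vcf/operationsmanager/application.properties files is set to vcf_sos:vcf instead of the owners applied in the workaround (vcf_domainmanager:vcf and vcf_operationsmanager:vcf).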
Resolution:
This behavior has been identified as a known issue in the current version of the product. The engineering team is aware of it and is developing a permanent fix, which will be included in a future product release. This article will be updated when the fix becomes available. In the meantime, use the documented workaround.
Workaround:
Revert to the snapshot of the SDDC Manager VM that was taken before the upgrade attempt. If no snapshot was taken before the upgrade attempt, the following steps will not work.
chown vcf_domainmanager:vcf /etc/vmware/vcf/domainmanager/application.properties
chown vcf_operationsmanager:vcf /etc/vmware/vcf/operationsmanager/application.properties
ls -lrt /etc/vmware/vcf/domainmanager
ls -lrt /etc/vmware/vcf/operationsmanager
systemctl restart domainmanager
systemctl restart operationsmanager
rm -rf /var/log/vmware/vcf/sddc-support/backup-<tab for folder name>
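After the services are restarted, it can be useful to confirm that Domain Manager responds again before retrying the update. The following is a minimal sketch of a retry loop, modeled on the 10-second retry against http://localhost/domainmanager/about seen in the log excerpt above; retry_until is a hypothetical helper name, and the attempt count in the example call is an arbitrary assumption.

```shell
#!/bin/sh
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is reached.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Wait only if another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            echo "attempt $i of $attempts failed, retrying in ${delay}s" >&2
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example, run on the SDDC Manager appliance (30 attempts is an assumption):
#   retry_until 30 10 curl -fsS -o /dev/null http://localhost/domainmanager/about
```

With curl, the -f option makes an HTTP error such as 502 Bad Gateway count as a failure, so the loop keeps retrying until the service answers successfully.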