Unable to Complete VCF Operations Upgrade from 9.0.0 to 9.0.1 via VCF Fleet Management – Fails at Stage 8 with Error Code: LCMVROPSYSTEM25052

Article ID: 413233

Products

VMware Cloud Foundation VCF Operations

Issue/Introduction

During an upgrade of VMware Cloud Foundation (VCF) Operations from version 9.0.0 to 9.0.1 using VCF Fleet Management, the upgrade process fails at Stage 8, Operations Services Check.
 
Error Message:
Error Code: LCMVROPSYSTEM25052
Message: Operations services are not up and running, please retry after some time
Operations services are not up and running for node <VCF_OPS_FQDN>
 
Log Snippet in /var/log/vrlcm/vmware_vrlcm.log:
INFO vrlcm[2065316] [pool-3-thread-10] [c.v.v.l.u.CustomTrustManager]  -- Fetching certificate chain from https://<VCF_OPS_FQDN>/casa/private/config/slice/certificates/certificate-bundle
INFO vrlcm[2065316] [pool-3-thread-10] [c.v.v.l.u.NdcHelper]  -- Successfully fetched JWT token from https://<VCF_OPS_FQDN>/casa/private/config/slice/certificates/certificate-bundle
INFO vrlcm[2065316] [pool-3-thread-10] [c.v.v.l.u.NdcHelper]  -- JWT Headers : {"alg":"RS512","kid":"<CERTIFICATE_THUMBPRINT>"}
INFO vrlcm[2065316] [pool-3-thread-10] [c.v.v.l.u.NdcHelper]  -- Check whether certificate exist with thumbprint - <CERTIFICATE_THUMBPRINT>
ERROR vrlcm[2065316] [pool-3-thread-10] [c.v.v.l.u.NdcHelper]  -- No certificate found in trust store for thumbprint <CERTIFICATE_THUMBPRINT>
ERROR vrlcm[2065316] [pool-3-thread-10] [c.v.v.l.u.CustomTrustManager]  -- Failed to trust certificate
java.lang.RuntimeException: No certificate found in trust store for kid <CERTIFICATE_THUMBPRINT>
  
ERROR vrlcm[2065316] [pool-3-thread-41] [c.v.v.l.u.CustomTrustManager]  -- Certificate not trusted for vrops, we will not be able to send notification.
INFO vrlcm[2065316] [pool-3-thread-41] [c.v.v.l.p.c.v.t.VropsServiceStatusTask]  -- VMware Aria Operations casa services are not up and running ::4
ERROR vrlcm[2065316] [pool-3-thread-41] [c.v.v.l.p.c.v.t.VropsServiceStatusTask]  -- casa services are not up and running for node ::<VCF_OPS_FQDN>
INFO vrlcm[2065316] [pool-3-thread-41] [c.v.v.l.p.a.s.Task]  -- Injecting task failure event. Error Code : 'LCMVROPSYSTEM25052', Retry : 'true', Causing Properties : '{ CAUSE :: vropsMasterNodeIP === vropsAdminPassword YXYXYXYX  }'
com.vmware.vrealize.lcm.plugin.common.vrops.exceptions.VropsServicesNotRunningException: Operations services are not up and running for node <VCF_OPS_FQDN>, retry after some time
        at com.vmware.vrealize.lcm.plugin.core.vrops.tasks.VropsServiceStatusTask.checkServicesForAllNodes(VropsServiceStatusTask.java:185) [vmlcm-vropsplugin-core-9.0.1.0-SNAPSHOT.jar!/:?]
        at com.vmware.vrealize.lcm.plugin.core.vrops.tasks.VropsServiceStatusTask.execute(VropsServiceStatusTask.java:87) [vmlcm-vropsplugin-core-9.0.1.0-SNAPSHOT.jar!/:?]
        at com.vmware.vrealize.lcm.platform.automata.service.Task.retry(Task.java:158) [vmlcm-engineservice-core-9.0.1.0-SNAPSHOT.jar!/:?]
        at com.vmware.vrealize.lcm.automata.core.TaskThread.run(TaskThread.java:60) [vmlcm-engineservice-core-9.0.1.0-SNAPSHOT.jar!/:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
        at java.base/java.lang.Thread.run(Unknown Source) [?:?]
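
To confirm that a failed upgrade matches this article, the log can be scanned for the two signatures shown above. The following is a minimal sketch, not a supported tool: the function name `summarize_trust_failures` is hypothetical, and the regular expressions are copied from the snippet above, so they may need adjusting if the log wording differs in other builds.

```python
import re
from typing import Dict, Iterable, List

# Patterns taken from the log snippet in this article (assumed stable wording).
THUMBPRINT_RE = re.compile(r"No certificate found in trust store for thumbprint (\S+)")
NODE_RE = re.compile(r"casa services are not up and running for node ::(\S+)")


def summarize_trust_failures(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect untrusted thumbprints and affected nodes, in order of first appearance."""
    thumbprints: List[str] = []
    nodes: List[str] = []
    for line in lines:
        match = THUMBPRINT_RE.search(line)
        if match and match.group(1) not in thumbprints:
            thumbprints.append(match.group(1))
        match = NODE_RE.search(line)
        if match and match.group(1) not in nodes:
            nodes.append(match.group(1))
    return {"thumbprints": thumbprints, "nodes": nodes}
```

On the Fleet Management appliance, the function could be fed the open file object for /var/log/vrlcm/vmware_vrlcm.log. A non-empty `thumbprints` list together with a non-empty `nodes` list suggests the trust failure described here rather than a genuine service outage.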
 

Environment

VCF Operations 9.0
VCF Fleet Management 9.0.1

Cause

The upgrade fails because VCF Fleet Management cannot trust the certificate of the VCF Operations Data node during validation.
While fetching and validating the JWT token and certificate chain from one of the VCF Operations nodes, the trust manager reports:
No certificate found in trust store for thumbprint <CERTIFICATE_THUMBPRINT>
Failed to trust certificate
Because the certificate is not trusted, the Operations Services Check fails with error code LCMVROPSYSTEM25052.
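
The log shows that the JWT header carries a `kid` value that is looked up as a certificate thumbprint. The sketch below illustrates that kind of lookup in general terms. It is not Fleet Management's implementation: the function names and the `trusted_thumbprints` collection are hypothetical, and the code only decodes the header without verifying the token signature.

```python
import base64
import json
from typing import Iterable


def jwt_header(token: str) -> dict:
    """Decode the (unverified) header segment of a compact JWT."""
    header_b64 = token.split(".")[0]
    # base64url segments omit padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def kid_is_trusted(token: str, trusted_thumbprints: Iterable[str]) -> bool:
    """Return True if the header's kid matches a known thumbprint (case-insensitive)."""
    kid = jwt_header(token).get("kid")
    if kid is None:
        return False
    return kid.lower() in {t.lower() for t in trusted_thumbprints}
```

When the lookup returns False, the situation corresponds to the "No certificate found in trust store" message: the token itself may be valid, but the certificate it refers to has not been imported into the trust store.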

Resolution

To resolve the issue and resume the upgrade, follow the steps below:
 
Verify Services on VCF Operations Nodes
  • SSH into each VCF Operations node as the root user.
  • Confirm that all services are up and running as expected.
  • Refer to the following KB.
Trust the Certificate via Fleet Management API
  • Log in to the VCF Operations product UI as the admin user (https://<VCF_OPS_FQDN_IP>/ui).
  • Navigate to Developer Center > API & SDKs > Fleet Management API


  • In the Swagger UI, enter the authorization key. Refer to 'How to Authorize VCF Operations Fleet Management API'.
  • Scroll to the Certificate Management Controller API.
  • Use the following POST API call: /lcm/certificate-management/api/certificate-trust


  • Click "Try it Out".


  • In the endpoint field, enter the FQDN mentioned in the error logs (i.e., <VCF_OPS_FQDN>).
  • Click Execute.


  • Ensure the HTTP response code is 200, indicating success.
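
For environments where the Swagger UI is not convenient, the same call could be scripted. The sketch below only builds the request and is offered under several assumptions that the article does not confirm: the base URL of the Fleet Management API, the format of the Authorization header, and the way the `endpoint` value is transmitted (shown here as a JSON body field). Only the path and the POST method come from the steps above, so check the request shape in the Swagger UI before relying on it. The name `build_trust_request` is hypothetical.

```python
import json
import urllib.request

# Path taken from the step above.
TRUST_PATH = "/lcm/certificate-management/api/certificate-trust"


def build_trust_request(base_url: str, endpoint_fqdn: str,
                        authorization: str) -> urllib.request.Request:
    """Build (but do not send) the certificate-trust POST request."""
    # ASSUMPTION: "endpoint" is sent as a JSON body field; confirm in the Swagger UI.
    body = json.dumps({"endpoint": endpoint_fqdn}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + TRUST_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the authorization key is passed in this header.
            "Authorization": authorization,
        },
    )
```

The request could then be sent with `urllib.request.urlopen(req)`, checking that the response `status` is 200 as in the manual procedure. Building the request separately from sending it makes it easy to inspect the URL and payload first.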
Retry the Upgrade Task
  • Log in to the VCF Operations product UI as the admin user (https://<VCF_OPS_FQDN_IP>/ui).
  • Navigate to Fleet Management > Lifecycle > Tasks


  • Select the previously failed upgrade task.
  • Click Retry.


  • The task should now complete successfully.

Additional Information

Ensure network connectivity and DNS resolution between VCF Fleet Management and VCF Operations nodes.
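
A quick way to check both points from a machine that should reach the nodes is a DNS lookup followed by a TCP connection attempt. This is a minimal sketch: the helper name `check_node` is hypothetical, and port 443 is an assumption based on the https:// URLs in the log above.

```python
import socket
from typing import List, Tuple


def check_node(host: str, port: int = 443,
               timeout: float = 5.0) -> Tuple[List[str], bool]:
    """Return (resolved addresses, whether a TCP connection succeeded)."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Name resolution failed, so no connection attempt is made.
        return [], False
    addresses = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return addresses, True
    except OSError:
        # Resolved, but the port is unreachable (refused, filtered, or timed out).
        return addresses, False
```

An empty address list points to a DNS problem, while resolved addresses with a failed connection point to a routing or firewall problem. A successful result only shows that the port is reachable and does not validate the certificate.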