SDDC Manager upgrade fails with ERROR: CAP services are not enabled in SDDC Manager



Article ID: 376799


Updated On:

Products

VMware SDDC Manager

Issue/Introduction

  • SDDC Manager upgrade fails at "Set up common appliance platform" with the error "CAP services are not enabled in SDDC Manager"

  • cap-workflow-engine.service shows as failed when checked with:

          systemctl status cap-workflow-engine.service

  • Logs under /var/log/vmware/vcf/lcm/thirdparty/upgrades/<xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx>/vcf-platform/cap-platform-setup/ show entries similar to:

NoneType: None
yyy-mm-dd hh:mm:ss INFO: http://127.0.0.1:15051/capengine/api/v1/workflows is not accessible, retry after 10 seconds
yyy-mm-dd hh:mm:ss: INFO: URL: http://127.0.0.1:15051/capengine/api/v1/workflows
yyy-mm-dd hh:mm:ss: ERROR: RC: , OUT:  ERR: HTTPConnectionPool(host='127.0.0.1', port=15051): Max retries exceeded with url: /capengine/api/v1/workflows (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at aaaaaaaaaaa>: Failed to establish a new connection: [Errno 111] Connection refused')) NoneType: None
yyy-mm-dd hh:mm:ss: INFO: http://127.0.0.1:15051/capengine/api/v1/workflows is not accessible, retry after 10 seconds
yyy-mm-dd hh:mm:ss INFO: URL: http://127.0.0.1:15051/capengine/api/v1/workflows
yyy-mm-dd hh:mm:ss: ERROR: RC: , OUT:  ERR: HTTPConnectionPool(host='127.0.0.1', port=15051): Max retries exceeded with url: /capengine/api/v1/workflows (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2d0c39cf10>: Failed to establish a new connection: [Errno 111] Connection refused')) NoneType: None
yyy-mm-dd hh:mm:ss INFO: http://127.0.0.1:15051/capengine/api/v1/workflows is not accessible, retry after 10 seconds
yyy-mm-dd hh:mm:ss: INFO: URL: http://127.0.0.1:15051/capengine/api/v1/workflows
yyy-mm-dd hh:mm:ss: ERROR: RC: , OUT:  ERR: HTTPConnectionPool(host='127.0.0.1', port=15051): Max retries exceeded with url: /capengine/api/v1/workflows (Caused by NewConnectionError('<urllib3.connect...
... with data OrderedDict([('upgradeId', '<xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx>'), ('resourceId', ''), ('upgradeStatusCode', 'COMPLETED_WITH_FAILURE'), ('progress', 0), ('error', OrderedDict([('errorCode', 2), ('errorDescription', 'http://127.0.0.1:15051/capengine/api/v1/workflows is not accessible')])), ('startTime', 1724776678), ('endTime', 1724777010)])
yyy-mm-dd hh-mm-ss,154: ERROR:
Traceback (most recent call last):
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/cap_platform_setup.py.copy", line 362, in <module> cap_upgraded, cap_header = wrapper.upgrade_cap(CAP_PLATFORM_SETUP_LIBRARY_PATH)
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/../../wrapper.py", line 574, in upgrade_cap  return self.is_cap_service_running(cap_header), cap_header
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/../../wrapper.py", line 319, in is_cap_service_running errmsg=error_message)
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/../../wrapper.py", line 187, in update_status  raise Exception
Exception
yyy-mm-dd hh-mm-ss,155: INFO: URL: http://localhost/lcm/about

yyy-mm-dd hh-mm-ss: INFO: Updated /var/log/vmware/vcf/lcm/thirdparty/upgrades/<xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx>/vcf-platform/cap-platform-setup/cap_platform_setup.status status file with data OrderedDict([('upgradeId', '<xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx>'), ('resourceId', '<yyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyy>'), ('upgradeStatusCode', 'COMPLETED_WITH_FAILURE'), ('progress', 0), ('error', OrderedDict([('errorCode', 2), ('errorDescription', 'http://127.0.0.1:15051/capengine/api/v1/workflows is not accessible')])), ('startTime', 1724776678), ('endTime', 1724777010)])
yyy-mm-dd hh-mm-ss: ERROR: CAP services are not enabled in SDDC Manager
yyy-mm-dd hh-mm-ss: INFO:
yyy-mm-dd hh-mm-ss: INFO: RC: 1, OUT:
yyy-mm-dd hh-mm-ss: INFO: ERR: Traceback (most recent call last):
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/cap_platform_setup.py.copy", line 362, in <module>   cap_upgraded, cap_header = wrapper.upgrade_cap(CAP_PLATFORM_SETUP_LIBRARY_PATH)
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/../../wrapper.py", line 574, in upgrade_cap  return self.is_cap_service_running(cap_header), cap_header
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/../../wrapper.py", line 319, in is_cap_service_running  errmsg=error_message)
  File "/var/log/vmware/vcf/lcm/thirdparty/bundles/<zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzz>/thirdparty/cap-platform-setup/bin/../../wrapper.py", line 187, in update_status  raise Exception

 

  • The CAP engine's core-engine log shows:

 

yyy-mm-dd hh:mm:ss workflowconfig.go:286: Workflow JSON Object: &{Name:cap-update-revert ExecOrder:[snapshot-check lvm-revert revert-update-changes file-sync-service-setup] TaskExtensionPath: RebootRequired:false TaskList:[{Name:snapshot-check PluginPath:/usr/lib/vmware-capengine/coreplugins/snapshotcheck.so ScriptPath: ScriptArgs:[] IsResumable:false MaxRetryCount:3 ErrorHandlerName: IsReserved:true IsExtensionTask:false} {Name:lvm-revert PluginPath:/usr/lib/vmware-capengine/coreplugins/lvmrevert.so ScriptPath: ScriptArgs:[] IsResumable:false MaxRetryCount:3 ErrorHandlerName: IsReserved:true IsExtensionTask:false} {Name:revert-update-changes PluginPath:/usr/lib/vmware-capengine/coreplugins/revertupdate.so ScriptPath: ScriptArgs:[] IsResumable:false MaxRetryCount:3 ErrorHandlerName: IsReserved:true IsExtensionTask:false} {Name:file-sync-service-setup PluginPath:/usr/lib/vmware-capengine/coreplugins/root_file_sync.so ScriptPath: ScriptArgs:[] IsResumable:false MaxRetryCount:3 ErrorHandlerName: IsReserved:true IsExtensionTask:false}] ErrorHandlers:map[]}
yyy-mm-dd hh:mm:ss database.go:464: Executing Query: [SELECT _id, workflowId, name, taskOrder, createdTime, lastModifiedTime, isTaskDeleted FROM ERROR_HANDLER WHERE workflowId = ?] with parameters [%!s(int=10)]
yyy-mm-dd hh:mm:ss main.go:24: Invalid workflows found.
yyy-mm-dd hh:mm:ss main.go:25: Failed to start Common Appliance Platform Workflow Engine
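
The symptoms above can be checked from a shell on the appliance. The following is a minimal sketch rather than an official tool: the function name cap_failure_signatures is hypothetical, the signature strings are copied from the excerpts above, and the log files are passed as arguments because their exact file names are not given in this article.

```shell
#!/usr/bin/env bash
# Sketch only: print which known failure signatures occur in the given log files.
# Signature strings are copied from the log excerpts in this article; the
# function name and the argument convention are illustrative assumptions.
cap_failure_signatures() {
    # Refuse to run without files, otherwise grep would wait on stdin.
    [ "$#" -gt 0 ] || return 2
    local sig rc=1
    for sig in \
        "CAP services are not enabled in SDDC Manager" \
        "capengine/api/v1/workflows is not accessible" \
        "Invalid workflows found." \
        "Failed to start Common Appliance Platform Workflow Engine"
    do
        # -F fixed string, -q quiet, -s ignore unreadable or missing files
        if grep -Fqs -- "$sig" "$@"; then
            printf 'FOUND: %s\n' "$sig"
            rc=0
        fi
    done
    # 0 if at least one signature was found, 1 if none, 2 if no files were given
    return "$rc"
}
```

Call it with the upgrade log and the core-engine log as arguments. A return code of 0 together with one or more FOUND lines suggests the issue described here.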

Environment

VMware Cloud Foundation 4.5.1

Cause

When the workflow is altered, the new binary path(s) are not consistent with the path referenced in the altered workflow. As a result, the CAP workflow engine fails to start.
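
To see such a mismatch, the plugin paths a workflow definition refers to can be extracted from the core-engine log and compared with what exists on disk. This is a minimal sketch, assuming the "PluginPath:<path>" format shown in the core-engine log excerpt; the function names list_plugin_paths and report_missing_plugins are hypothetical, and the log file is passed as an argument because its location is not given in this article.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are illustrative, not part of any VMware tool.

# Print each unique, non-empty PluginPath value found in the given log file(s).
list_plugin_paths() {
    # -o print only the match, -h omit file names, -s ignore unreadable files
    grep -ohs 'PluginPath:[^ ]*' -- "$@" \
        | sed -n 's/^PluginPath:\(..*\)$/\1/p' \
        | sort -u
}

# Print a MISSING line for every referenced plugin path that is absent on disk.
# Returns 0 if all paths exist, 1 if any is missing, 2 if no log file was given.
report_missing_plugins() {
    [ "$#" -gt 0 ] || return 2
    local missing
    missing=$(list_plugin_paths "$@" | while IFS= read -r path; do
        [ -e "$path" ] || printf 'MISSING: %s\n' "$path"
    done)
    if [ -z "$missing" ]; then
        return 0
    fi
    printf '%s\n' "$missing"
    return 1
}
```

A MISSING line for a path under /usr/lib/vmware-capengine/ would be consistent with the cause described above, although the engine may reject a workflow for other reasons as well.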

Resolution

Take a snapshot of the SDDC Manager VM before making any changes.

1. SSH into the SDDC Manager VM as the vcf user and switch to root

2. Remove the workflow database

rm -rf /storage/lifecycle/capengine/workflow.db

3. Check the files installed by the CAP services on SDDC Manager

rpm -ql Vmware-capengine

4. Uninstall the Vmware-capengine and Vmware-capupdate RPMs to remove the corrupted workflow definition

rm -rf /usr/lib/vmware-capengine/

rm -rf /usr/lib/vmware-capupdate/

tdnf --disablerepo='*' remove Vmware-capengine Vmware-capupdate

rm -rf /etc/vmware/cap/

rm -rf /usr/lib/vmware-capengine/

5. Reinstall the same Vmware-capengine and Vmware-capupdate RPMs

  • Location of the RPMs: /var/log/vmware/vcf/lcm/thirdparty/bundles/<bundle_id>/thirdparty/cap-platform-setup/lib
    rpm -i Vmware-capengine-1.0.0.3-10001489.x86_64.rpm
    rpm -i Vmware-capupdate-1.0.0.3-10001489.x86_64.rpm

 

6. Start the CAP services

systemctl start cap-workflow-engine.service

7. Confirm the CAP services are running

systemctl status cap-workflow-engine.service

8. Retry the upgrade
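
Because the upgrade failed when http://127.0.0.1:15051/capengine/api/v1/workflows refused connections, it may be worth confirming that the endpoint answers once the service is running. This is a minimal sketch: retry_until is a hypothetical helper, the URL and the 10-second interval mirror the upgrade log above, and the availability of curl on the appliance is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: retry_until is an illustrative helper, not part of any VMware tool.
# Usage: retry_until <attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or the attempts run out.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
    local attempts="$1" delay="$2" n=1
    shift 2
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Example (assumes curl is installed; not executed here):
#   retry_until 30 10 curl -s -o /dev/null http://127.0.0.1:15051/capengine/api/v1/workflows
# Without -f, curl exits 0 on any HTTP response, so this only shows that the
# engine is listening again. The API may require headers not shown in this article.
```

If the helper returns 1, the engine is still not accepting connections and the upgrade retry is likely to fail in the same way.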