tkgi update-cluster compute profile failure

Article ID: 342914

Products

VMware vSphere with Tanzu

Issue/Introduction

Symptoms:
Compute profiles allow you to change parameters such as the node_pools name, as documented in https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid-Integrated-Edition/1.15/tkgi/GUID-compute-profiles-manage.html
There are some caveats to be aware of when a tkgi update-cluster compute profile operation fails.

Environment

Tanzu Kubernetes Grid Integrated Edition 1.1.14.1
VMware Tanzu Kubernetes Grid Integrated Edition 1.x

Cause

With the current behaviour of the compute profile feature, changing the node_pools name renames all Bosh VM instances at the beginning of the update task, and Bosh then updates each Bosh VM.

Resolution

Compute profile enhancement:
TKGi 1.15.6+
https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid-Integrated-Edition/1.15/tkgi/GUID-release-notes.html#features-and-enhancements-14
TKGi 1.16.2+
https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid-Integrated-Edition/1.16/tkgi/GUID-release-notes.html#features-and-enhancements-16
TKGi 1.17.0+
https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid-Integrated-Edition/1.17/tkgi/GUID-release-notes.html#features-and-enhancements-16



Workaround:
To resume the tkgi update-cluster compute profile operation after a failure, delete the VM instance that is failing, then re-run the same update-cluster command with the same compute profile. Repeat until the update completes successfully.
bosh update-resurrection off
bosh -d service-instance_ID delete-vm VM_CID
tkgi update-cluster CLUSTER-NAME --compute-profile COMPUTE-PROFILE-NAME
bosh update-resurrection on
Once the update has completed successfully, you may revert to the old compute profile if needed.
If it keeps failing, contact VMware Tanzu support for help troubleshooting the failure.
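
The workaround sequence above could be wrapped in a small helper so that resurrection is always switched back on, even if an intermediate step fails. This is only a minimal sketch: the function names run and resume_update and the DRY_RUN variable are hypothetical (not part of bosh or tkgi), and the four arguments are the same placeholders used in the commands above. By default it only prints the commands, so the sequence can be reviewed before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch only: run, resume_update and DRY_RUN are illustrative names,
# not part of the bosh or tkgi CLIs.

# Print the command when DRY_RUN=1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: resume_update service-instance_ID VM_CID CLUSTER-NAME COMPUTE-PROFILE-NAME
resume_update() {
  local deployment="$1" vm_cid="$2" cluster="$3" profile="$4" rc=0

  # Stop Bosh from resurrecting the VM that is about to be deleted.
  run bosh update-resurrection off || return 1

  # Delete the failing VM, then re-trigger the same compute profile update.
  run bosh -d "$deployment" delete-vm "$vm_cid" || rc=$?
  if [ "$rc" -eq 0 ]; then
    run tkgi update-cluster "$cluster" --compute-profile "$profile" || rc=$?
  fi

  # Re-enable resurrection whether or not the steps above succeeded.
  run bosh update-resurrection on
  return "$rc"
}
```

For a dry run, source the file and call resume_update service-instance_ID VM_CID CLUSTER-NAME COMPUTE-PROFILE-NAME with real values; setting DRY_RUN=0 executes the commands instead of printing them. The helper does not replace the need to re-trigger the update until it succeeds.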



Additional Information

Impact/Risks:
If the command tkgi update-cluster CLUSTER-NAME --compute-profile COMPUTE-PROFILE-NAME fails while Bosh is updating the service-instance deployment (because of a manual Bosh cancellation or any other failure), do not revert straight away to the previous compute profile, and do not use commands such as bosh deploy or bosh recreate to fix the service-instance deployment manually. Do NOT run any of the following commands, which would lead Bosh to delete all Bosh VMs in one go before recreating them:
tkgi update-cluster CLUSTER-NAME --compute-profile PREVIOUS-COMPUTE-PROFILE-NAME 
bosh -d service-instance_ID deploy service-instance-MANIFEST.yaml
bosh -d service-instance_ID recreate