This workaround addresses an issue where the control plane becomes stuck during an upgrade from TCA version 2.2.0 to 2.3.0.
When the kubectl get node command is run from the master node of the workload cluster, a newly created control plane node is listed with the old Kubernetes version, as shown below (workload-control-plane-stm1a, created 5 minutes ago, still reporting v1.23.16+vmware.1):

capv@workload-control-plane-blg5r [ ~ ]$ kubectl get node
NAME                           STATUS   ROLES           AGE   VERSION
workload-np1-#####-rvhq8       Ready    <none>          27h   v1.23.16+vmware.1
workload-np2-#####-7qglk       Ready    <none>          27h   v1.23.16+vmware.1
workload-control-plane-4tvtz   Ready    control-plane   27h   v1.23.16+vmware.1
workload-control-plane-98z9f   Ready    control-plane   27h   v1.23.16+vmware.1
workload-control-plane-blg5r   Ready    control-plane   27h   v1.23.16+vmware.1
workload-control-plane-stm1a   Ready    control-plane   5m    v1.23.16+vmware.1
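The stuck node is backed by a Cluster API Machine object, which is the object deleted in the workaround below. To confirm its name first, the Machine resources can be listed. This is a hedged sketch, assuming the Machine objects live in a namespace named after the workload cluster (matching the delete command below) and that kubectl is pointed at the cluster that hosts the Cluster API objects:

kubectl get machines -n <WORKLOAD-CLUSTER-NAME>

The stuck Machine corresponds to the recently created control plane node that still reports the old Kubernetes version (workload-control-plane-stm1a in the output above).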
Since TCA 2.1, due to a known upstream issue, control plane upgrades must be performed in two steps.
If the first upgrade step fails, the upgrade targeting Kubernetes version v1.24.10+vmware.1 becomes blocked.
This issue is addressed in TCA 3.0.
In TCA 2.3, the following workaround permanently addresses the issue:
Run the following command to delete the stuck control plane Machine identified above:
kubectl delete machine -n <WORKLOAD-CLUSTER-NAME> <MACHINE-NAME>
Sample:
kubectl delete machine -n workload workload-control-plane-stm1a
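After the Machine is deleted, Cluster API should recreate the control plane Machine at the target Kubernetes version, allowing the upgrade to proceed. A hedged way to verify this, reusing the sample names above (the replacement Machine and node names will differ):

kubectl get machines -n workload -w
kubectl get node

The first command watches for the replacement control plane Machine; the second, run against the workload cluster, should eventually show the new control plane node reporting v1.24.10+vmware.1.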