After upgrading a cluster's Kubernetes version from v1.27 to v1.28, the upgrade is stuck and no new nodes are created with the new version.
The kcp, md, vm, machine, and related objects all still reference v1.27.
Describing the cluster shows the cluster's version as v1.28, and the following message is observed:
TopologyReconciled: error computing the desired state of the Cluster topology: failed to apply patches: failed to generate patches for patch "default": failed to call extension handler "generate-patches.runtime-extension": got failure response

vCenter Version: 8.0.3 Build: 24322831
ESX Version: 8.0.3 Build: 2441450
AVI LB Version: 22.1.7 Build: 9093
The upgrade failed because the TKR_DATA entry for v1.27.11 is missing.
Also, with builtin-generic-v3.1.0, do not use 'kubectl apply' to upgrade; only 'kubectl edit' should be used.
To work around this issue, follow KB 383750.
Note that you must use 'kubectl edit' to edit the cluster.
When editing, append the snippet below to the cluster's YAML, under the value field of the spec.topology.variables entry where name=TKR_DATA.
variables:
- name: TKR_DATA
  value:
    # append below
    v1.27.11+vmware.1-fips.1:
      kubernetesSpec:
        coredns:
          imageRepository: localhost:5000/vmware.io
          imageTag: v1.10.1_vmware.16-fips.1
        etcd:
          imageRepository: localhost:5000/vmware.io
          imageTag: v3.5.11_vmware.6-fips.1
        imageRepository: localhost:5000/vmware.io
        pause:
          imageRepository: localhost:5000/vmware.io
          imageTag: "3.9"
        version: v1.27.11+vmware.1-fips.1
      labels:
        image-type: vmi
        os-arch: amd64
        os-name: ubuntu
        os-type: linux
        os-version: "22.04"
        run.tanzu.vmware.com/os-image: vmi-##################
        run.tanzu.vmware.com/tkr: v1.27.11---vmware.1-fips.1-tkg.2
        vmi-name: vmi-##################
      osImageRef:
        name: vmi-##################
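
To check whether the TKR_DATA variable already has an entry for a given version, before or after the edit, a small script can inspect the cluster object exported as JSON. The following is a minimal sketch, not part of KB 383750. It assumes the cluster was saved with 'kubectl get cluster <cluster-name> -n <namespace> -o json', and that the TKR_DATA value is a mapping keyed by the full version string, as in the snippet above. The function names and the file name cluster.json are hypothetical.

```python
import json


def missing_tkr_data(cluster: dict, versions: list[str]) -> list[str]:
    """Return the versions that have no key in the TKR_DATA variable.

    Assumes TKR_DATA lives under spec.topology.variables and that its
    value is a mapping keyed by the full version string.
    """
    variables = cluster.get("spec", {}).get("topology", {}).get("variables", [])
    tkr_data = {}
    for var in variables:
        if var.get("name") == "TKR_DATA":
            # The value may be absent or null; treat that as "no entries".
            tkr_data = var.get("value") or {}
            break
    return [v for v in versions if v not in tkr_data]


def check_file(path: str, versions: list[str]) -> list[str]:
    """Load a cluster object saved as JSON and report missing versions."""
    with open(path, encoding="utf-8") as handle:
        return missing_tkr_data(json.load(handle), versions)
```

For example, check_file("cluster.json", ["v1.27.11+vmware.1-fips.1"]) returns a list containing that version while the entry is missing, and an empty list once it has been appended. The script only reads the exported object; the change itself should still be made with 'kubectl edit'.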