A vSphere Kubernetes Service (VKS) cluster upgrade is stuck with a Cluster topology reconcile error.
While connected to the Supervisor cluster context, the following symptoms are observed:
kubectl get cluster,clusterbootstrap -n <affected cluster namespace> <affected cluster name>
conditions:
- lastTransitionTime: "YYYY-MM-DDTHH:MM:SSZ"
  message: '* TopologyReconciled: error reconciling the Cluster topology: failed
    to create patch helper for KubeadmControlPlane <namespace>/<cluster name>-<kcp id>:
    server side apply dry-run failed for modified object: KubeadmControlPlane.controlplane.cluster.x-k8s.io
    "<cluster name>-<kcp id>" is invalid: [spec.kubeadmConfigSpec.initConfiguration.nodeRegistration.kubeletExtraArgs:
    Invalid value: "array": kubeletExtraArgs name must be unique, spec.kubeadmConfigSpec.joinConfiguration.nodeRegistration.kubeletExtraArgs:
    Invalid value: "array": kubeletExtraArgs name must be unique]'
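The symptom check above can also be scripted. The following is a minimal sketch, not part of the official procedure. It assumes the Cluster object has been exported with `-o json`, and that conditions may appear under either `status.conditions` or `status.v1beta2.conditions` (an assumption about the Cluster API version in use). The function name `find_duplicate_args_conditions` is hypothetical.

```python
import json

# Error text taken from the condition message shown in the symptoms.
DUPLICATE_ARGS_MARKER = "kubeletExtraArgs name must be unique"


def find_duplicate_args_conditions(cluster_json):
    """Return every condition whose message contains the duplicate-args error."""
    cluster = json.loads(cluster_json)
    status = cluster.get("status", {})
    # Assumption: conditions may live in either location depending on the
    # Cluster API version, so both are checked.
    conditions = list(status.get("conditions", []))
    conditions += status.get("v1beta2", {}).get("conditions", [])
    return [c for c in conditions if DUPLICATE_ARGS_MARKER in c.get("message", "")]
```

A non-empty result suggests the cluster is affected by the issue described here. An empty result does not rule out other topology reconcile errors.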
kubectl get kcp,md,machines -n <affected VKS cluster namespace>
vSphere Supervisor
VKS Cluster
VKS 3.5 or 3.6
A Kubernetes object in the environment has managedFields entries for both the v1beta1 and v1beta2 API versions, but the conversion webhook was unable to convert them, for example because of network unavailability. If a write operation on the object (such as a VKR upgrade) was interrupted while in progress, kube-apiserver can drop all managedFields from that object.
This is a known upstream Kubernetes issue: https://github.com/kubernetes/kubernetes/issues/136919
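To illustrate the mixed-version state described above, here is a small sketch that groups field managers by the `apiVersion` recorded on each managedFields entry. It assumes the entries have already been parsed into a list of dictionaries, for example from the object's `metadata.managedFields` in JSON output. Both function names are hypothetical.

```python
def managed_field_api_versions(managed_fields):
    """Map each apiVersion found in managedFields to its list of managers."""
    versions = {}
    for entry in managed_fields or []:
        api_version = entry.get("apiVersion", "<none>")
        versions.setdefault(api_version, []).append(entry.get("manager", "<unknown>"))
    return versions


def has_mixed_versions(managed_fields):
    """True when managedFields entries reference more than one apiVersion."""
    return len(managed_field_api_versions(managed_fields)) > 1
```

An object whose entries span both `controlplane.cluster.x-k8s.io/v1beta1` and `controlplane.cluster.x-k8s.io/v1beta2` would make `has_mixed_versions` return `True`. This only describes the state of the object and does not by itself confirm that managedFields were dropped.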
kubectl get cluster -n <affected VKS cluster namespace>
kubectl describe cluster -n <affected VKS cluster namespace> <affected VKS cluster name>
kubectl get kcp -n <affected vks cluster namespace>
NAME                                   CLUSTER
<affected vks cluster name>-<kcp id>   <affected vks cluster name>
kubectl get kcp --show-managed-fields -o jsonpath='{.metadata.managedFields}' -n <affected vks cluster namespace> <affected vks cluster name>-<kcp id>
kubectl get kcp -o yaml --show-managed-fields -n <affected vks cluster namespace> <affected vks cluster name>-<kcp id> | grep "before-first-apply"
manager: before-first-apply
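As an alternative to the grep above, the same check can be done on structured output. This is a minimal sketch, assuming the KubeadmControlPlane object was exported with `-o json --show-managed-fields`. The function name `before_first_apply_entries` is hypothetical.

```python
import json

# Manager name that the grep in the procedure looks for.
BEFORE_FIRST_APPLY = "before-first-apply"


def before_first_apply_entries(kcp_json):
    """Return managedFields entries owned by the before-first-apply manager."""
    obj = json.loads(kcp_json)
    managed_fields = obj.get("metadata", {}).get("managedFields") or []
    return [e for e in managed_fields if e.get("manager") == BEFORE_FIRST_APPLY]
```

A non-empty list corresponds to the `manager: before-first-apply` line in the grep output. Matching on the parsed `manager` field avoids false positives from the string appearing elsewhere in the YAML.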
python mitigate-managed-fields.py -n <affected vks cluster namespace> --cluster <affected vks cluster name> --dry-run
python mitigate-managed-fields.py -n <affected vks cluster namespace> --cluster <affected vks cluster name>
kubectl describe cluster -n <affected vks cluster namespace> <affected vks cluster name>