This problem is confirmed as a bug in vCenter 8 U1x and is resolved in 8 U2, where the same behaviour is no longer observed.
Some additional analysis was completed to confirm the issue.
Verified with the commands below:
With Machine/cluster-update-test-01-f4w75-847jc in Pending state, we could trace from the Cluster back to the KubeadmConfig.
The KubeadmConfig cluster-update-test-01-node-pool-1-bootstrap-8gtgc-fg575 in namespace vxvcftanzu01 had a missing secret reference defined in it, which we added manually (copy-pasting the secret reference from another KubeadmConfig).
This triggered creation of the node and added it to the cluster; the system then continued with the next machine but got stuck on the same step.
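For illustration, the manual edit replaced the unresolved placeholders in the affected files entry of the KubeadmConfig spec; the key and secret name below are hypothetical and stand in for values copied from a healthy sibling KubeadmConfig:
files:
- contentFrom:
    secret:
      key: tls.crt                                  # hypothetical key, copied from a working KubeadmConfig
      name: cluster-update-test-01-extensions-tls   # hypothetical secret name, copied from a working KubeadmConfig
  owner: root:root
  path: /etc/ssl/certs/extensions-tls.crt
  permissions: "0644"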
The output of kubectl get KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-8gtgc-tzlhj -oyaml is provided below, after the command history.
kubectl get cluster -A
kubectl describe cluster -n vxvcftanzu01 cluster-update-test-01
kubectl get kubeadmcontrolplane -A
kubectl get kubeadmcontrolplane -n vxvcftanzu01 cluster-update-test-01-f4w75
kubectl get kubeadmcontrolplane -n vxvcftanzu01 cluster-update-test-01-f4w75 -oyaml
kubectl get -n vxvcftanzu01 Machine/cluster-update-test-01-f4w75-847jc
kubectl get KubeadmConfig -A
kubectl get KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-8gtgc-fg575 -oyaml
kubectl get KubeadmConfig -A
kubectl get KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-vfnpt-22wl6 -oyaml
kubectl edit KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-8gtgc-fg575
kubectl get KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-8gtgc-fg575 -oyaml
kubectl get vspheremachines -A
kubectl get vspheremachines,machines -A
kubectl get KubeadmConfig -A
kubectl get KubeadmConfig,machines -A | grep cluster-update-test-01
kubectl get KubeadmConfig -A
kubectl get KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-8gtgc-tzlhj
kubectl get KubeadmConfig -n vxvcftanzu01 cluster-update-test-01-node-pool-1-bootstrap-8gtgc-tzlhj -oyaml
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfig
metadata:
annotations:
cluster.x-k8s.io/cloned-from-groupkind: KubeadmConfigTemplate.bootstrap.cluster.x-k8s.io
cluster.x-k8s.io/cloned-from-name: cluster-update-test-01-node-pool-1-bootstrap-8gtgc
run.tanzu.vmware.com/resolve-os-image: os-name=photon
creationTimestamp: "2023-09-29T12:57:08Z"
generation: 2
labels:
cluster.x-k8s.io/cluster-name: cluster-update-test-01
cluster.x-k8s.io/deployment-name: cluster-update-test-01-node-pool-1-v8gnm
cluster.x-k8s.io/set-name: cluster-update-test-01-node-pool-1-v8gnm-5d48844f9d
machine-template-hash: "1804400958"
topology.cluster.x-k8s.io/deployment-name: node-pool-1
topology.cluster.x-k8s.io/owned: ""
name: cluster-update-test-01-node-pool-1-bootstrap-8gtgc-tzlhj
namespace: vxvcftanzu01
ownerReferences:
- apiVersion: cluster.x-k8s.io/v1beta1
blockOwnerDeletion: true
controller: true
kind: Machine
name: cluster-update-test-01-node-pool-1-v8gnm-5d48844f9d-qnvrk
uid: d6200ed5-3547-48bd-86a1-3728159b3b4a
resourceVersion: "324274822"
uid: 3307a9d4-64a2-44bc-957f-770aec58dd62
spec:
diskSetup: {}
files:
- content: |
{{ ds.meta_data.hostname.split('.') | first }}
owner: root:root
path: /etc/hostname
permissions: "0644"
- content: |
::1 ipv6-localhost ipv6-loopback
127.0.0.1 localhost {{ ds.meta_data.hostname.split('.') | first }}
owner: root:root
path: /etc/hosts
permissions: "0644"
- contentFrom:
secret:
key: <no value>
name: <no value>
owner: root:root
path: /etc/ssl/certs/extensions-tls.crt
permissions: "0644"
format: cloud-config
joinConfiguration:
discovery:
bootstrapToken:
apiServerEndpoint: x.x.x.x:6443
caCertHashes:
- sha256:xxxxx
token: vaq720.xige91v55pvrg8fx
nodeRegistration:
ignorePreflightErrors:
- ImagePull
kubeletExtraArgs:
cloud-provider: external
event-qps: "0"
node-labels: run.tanzu.vmware.com/tkr=v1.24.9---vmware.1-tkg.4,run.tanzu.vmware.com/kubernetesDistributionVersion=v1.24.9---vmware.1-tkg.4,
protect-kernel-defaults: "true"
read-only-port: "0"
register-with-taints: ""
resolv-conf: /run/systemd/resolve/resolv.conf
tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
ntp:
enabled: true
servers:
- <no value>
postKubeadmCommands:
- touch /root/kubeadm-complete
- vmware-rpctool 'info-set guestinfo.kubeadm.phase complete'
- vmware-rpctool 'info-set guestinfo.kubeadm.error ---'
preKubeadmCommands:
- set -xe
- cloud-init single --name write-files --frequency always
- cloud-init single --name users-groups --frequency always
- vmware-rpctool 'info-set guestinfo.userdata ---'
- hostname "{{ ds.meta_data.hostname.split('.') | first }}"
- 'sed -i -e "s/^preserve_hostname: .*/preserve_hostname: true/" /etc/cloud/cloud.cfg'
- echo -e 'kernel.panic_on_oops=1\nkernel.panic=10\nvm.overcommit_memory=1' >> /etc/sysctl.d/kubelet.conf
&& sysctl -p /etc/sysctl.d/kubelet.conf
- uname -a | grep photon && /usr/bin/rehash_ca_certificates.sh
- uname -a | grep ubuntu && cp /etc/ssl/certs/extensions-tls.crt /usr/local/share/ca-certificates/
- uname -a | grep ubuntu && /usr/sbin/update-ca-certificates
- systemctl set-property docker.service TasksMax=infinity
- systemctl daemon-reload
- systemctl enable containerd
- systemctl is-enabled --quiet containerd.service && systemctl restart containerd.service
- 'if systemctl is-enabled --quiet containerd.service ; then running=false; for
_ in {1..15}; do crictl ps > /dev/null 2>&1 && running=true && break; sleep 1s;
done; if [[ "${running}" != true ]]; then echo ''WARNING: containerd may not be
running''; exit 1; fi; fi'
- uname -a | grep photon && systemctl start docker.service
- uname -a | grep ubuntu && systemctl enable kubelet
- uname -a | grep ubuntu && systemctl start kubelet
- if [ -f /root/kubeadm-complete ]; then echo "Kubeadm already completed - terminating
early"; exit 0; fi
useExperimentalRetryJoin: true
verbosity: 2
status:
conditions:
- lastTransitionTime: "2023-09-29T12:57:09Z"
message: 'failed to resolve file source: secret not found: vxvcftanzu01/<no value>:
secrets "<no value>" not found'
reason: DataSecretGenerationFailed
severity: Warning
status: "False"
type: Ready
- lastTransitionTime: "2023-09-29T12:57:09Z"
status: "True"
type: CertificatesAvailable
- lastTransitionTime: "2023-09-29T12:57:09Z"
message: 'failed to resolve file source: secret not found: vxvcftanzu01/<no value>:
secrets "<no value>" not found'
reason: DataSecretGenerationFailed
severity: Warning
status: "False"
type: DataSecretAvailable
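As a suggestion, other KubeadmConfigs hit by the same unresolved-variable problem can be found by searching the rendered objects for the literal <no value> placeholder, e.g.:
kubectl get kubeadmconfig -A -oyaml | grep -B 5 '<no value>'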
Symptoms:
A new classy cluster is created following the YAML example from here: https://docs.vmware.com/en/VMware-vSphere/8.0/vsphere-with-tanzu-tkg/GUID-607BA980-E3E3-4167-ABC8-B9FCDCF44746.html
Only the corresponding name, namespace, storage, etc. are updated to reflect the current deployment.
After creating the cluster with kubectl apply -f ..., we waited until the cluster was completely created.
Then we updated the existing cluster to change the vmClass from guaranteed-small to guaranteed-medium.
The status of the created Cluster object appears to show the machine rollover in its status conditions, but nothing happens.
status:
conditions:
- lastTransitionTime: "2023-09-15T09:37:21Z"
message: Rolling 3 replicas with outdated spec (1 replicas up to date)
reason: RollingUpdateInProgress
severity: Warning
status: "False"
type: Ready
- lastTransitionTime: "2023-09-15T09:29:08Z"
status: "True"
type: ControlPlaneInitialized
- lastTransitionTime: "2023-09-15T09:37:21Z"
message: Rolling 3 replicas with outdated spec (1 replicas up to date)
reason: RollingUpdateInProgress
severity: Warning
status: "False"
type: ControlPlaneReady
- lastTransitionTime: "2023-09-15T09:27:22Z"
status: "True"
type: InfrastructureReady
- lastTransitionTime: "2023-09-15T09:27:18Z"
status: "True"
type: TopologyReconciled
- lastTransitionTime: "2023-09-15T09:27:05Z"
message: '[v1.24.11+vmware.1-fips.1-tkg.1]'
status: "True"
type: UpdatesAvailable
controlPlaneReady: true
failureDomains:
vmware-system-legacy:
controlPlane: true
infrastructureReady: true
observedGeneration: 5
phase: Provisioned
No new machine is created when we check the vCenter inventory.
We checked the events on the Supervisor cluster; the controller appears to be stuck retrying preflight checks on the machines.
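The events below were retrieved with a command along these lines (the exact invocation was not recorded in this case):
kubectl get events -n vxvcftanzu01 --sort-by=.lastTimestamp | grep cluster-update-test-01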
17m Normal TopologyCreate machinehealthcheck/cluster-update-test-01-qhmwr Created "MachineHealthCheck/cluster-update-test-01-qhmwr"
17m Warning ReconcileError machinehealthcheck/cluster-update-test-01-qhmwr failed to create cluster accessor: error fetching REST client config for remote cluster "vxvcftanzu01/cluster-update-test-01": failed to retrieve kubeconfig secret for Cluster vxvcftanzu01/cluster-update-test-01: secrets "cluster-update-test-01-kubeconfig" not found
16m Warning ReconcileError machinehealthcheck/cluster-update-test-01-qhmwr failed to create cluster accessor: error creating dynamic rest mapper for remote cluster "vxvcftanzu01/cluster-update-test-01": Get "https://10.50.0.2:6443/api?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
15m Warning ControlPlaneUnhealthy kubeadmcontrolplane/cluster-update-test-01-qhmwr Waiting for control plane to pass preflight checks to continue reconciliation: [machine cluster-update-test-01-qhmwr-5p4j2 does not have APIServerPodHealthy condition, machine cluster-update-test-01-qhmwr-5p4j2 does not have ControllerManagerPodHealthy condition, machine cluster-update-test-01-qhmwr-5p4j2 does not have SchedulerPodHealthy condition, machine cluster-update-test-01-qhmwr-5p4j2 does not have EtcdPodHealthy condition, machine cluster-update-test-01-qhmwr-5p4j2 does not have EtcdMemberHealthy condition]
13m Warning ControlPlaneUnhealthy kubeadmcontrolplane/cluster-update-test-01-qhmwr Waiting for control plane to pass preflight checks to continue reconciliation: [machine cluster-update-test-01-qhmwr-qgjgj does not have APIServerPodHealthy condition, machine cluster-update-test-01-qhmwr-qgjgj does not have ControllerManagerPodHealthy condition, machine cluster-update-test-01-qhmwr-qgjgj does not have SchedulerPodHealthy condition, machine cluster-update-test-01-qhmwr-qgjgj does not have EtcdPodHealthy condition, machine cluster-update-test-01-qhmwr-qgjgj does not have EtcdMemberHealthy condition]
12m Warning ControlPlaneUnhealthy kubeadmcontrolplane/cluster-update-test-01-qhmwr Waiting for control plane to pass preflight checks to continue reconciliation: machine cluster-update-test-01-qhmwr-qgjgj reports ControllerManagerPodHealthy condition is false (Error, Pod kube-controller-manager-cluster-update-test-01-qhmwr-qgjgj is missing)
2m22s Warning ControlPlaneUnhealthy kubeadmcontrolplane/cluster-update-test-01-qhmwr Waiting for control plane to pass preflight checks to continue reconciliation: [machine cluster-update-test-01-qhmwr-5wv6h does not have APIServerPodHealthy condition, machine cluster-update-test-01-qhmwr-5wv6h does not have ControllerManagerPodHealthy condition, machine cluster-update-test-01-qhmwr-5wv6h does not have SchedulerPodHealthy condition, machine cluster-update-test-01-qhmwr-5wv6h does not have EtcdPodHealthy condition, machine cluster-update-test-01-qhmwr-5wv6h does not have EtcdMemberHealthy condition]
Steps to reproduce:
- create a completely new, empty cluster with testcluster-small.yaml
- update the cluster with testcluster-medium.yaml (a minimal sketch of such a manifest follows below)
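For reference, a minimal sketch of what such a manifest might look like, based on the classy cluster example from the documentation linked above; all names, versions, and values are illustrative, and the only intended difference between testcluster-small.yaml and testcluster-medium.yaml is the vmClass value:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: cluster-update-test-01
  namespace: vxvcftanzu01
spec:
  # clusterNetwork omitted for brevity
  topology:
    class: tanzukubernetescluster
    version: v1.24.9+vmware.1-tkg.4
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
      - class: node-pool
        name: node-pool-1
        replicas: 3
    variables:
    - name: vmClass
      value: guaranteed-small   # testcluster-medium.yaml changes this to guaranteed-medium
    - name: storageClass
      value: wcpglobal-storage-profile   # illustrative storage policy name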
Environment: VMware vSphere 7.0 with Tanzu
Variables that are not initially defined are dropped when the second apply operation is executed; as a result, those variables are missing from the respective KubeadmConfig.
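One way to confirm this, as a suggestion (resource and namespace names taken from this case), is to compare the topology variables on the Cluster before and after the second apply:
kubectl get cluster -n vxvcftanzu01 cluster-update-test-01 -o jsonpath='{.spec.topology.variables}'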
This violates Kubernetes' #1 design rule, the declarative approach, and it makes an infrastructure-as-code approach completely impossible.
When an object is applied and then re-applied with changes, the API endpoint and the controller behind it have to handle the changes and reconcile, as long as kind, metadata.name, and namespace stay the same.
Things the controller added to the original manifest to keep track of state (.status, various labels, and in this case things like the TKR_DATA block) should not need to be copied back into the original manifest; at the very least, the controller should merge them on a re-apply.
If the controller needs additional information to keep things running, that information should be required at creation time.
The recommendation is to avoid re-applying the same YAML with changes while this bug is present, and to upgrade to vCenter 8 U2.
Workaround:
If a customer is already in this state, they can attempt the following to restore the cluster (equivalent commands are sketched after this list):
- kubectl edit the Cluster: set spec.paused to true and add the annotation run.tanzu.vmware.com/pause: ""
- This signals the controller to repopulate the variables; the customer can confirm by checking the Cluster resource.
- Machine deployments will automatically roll out with properly set variables.
- Some control plane machines may still be stuck in Pending with the old, broken configuration. Deleting those machines will result in new machines being created with the new configuration.
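For reference, a non-interactive sketch of these steps, using the cluster from this case; the stuck machine name is a placeholder:
kubectl patch cluster -n vxvcftanzu01 cluster-update-test-01 --type merge -p '{"spec":{"paused":true}}'
kubectl annotate cluster -n vxvcftanzu01 cluster-update-test-01 run.tanzu.vmware.com/pause=""
# confirm the variables were repopulated
kubectl get cluster -n vxvcftanzu01 cluster-update-test-01 -oyaml
# only for control plane machines still stuck in Pending
kubectl delete machine -n vxvcftanzu01 <stuck-machine-name>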