Workload Cluster Upgrade from KR 1.32.x to 1.33.x Stalls on First Control Plane Node Provisioned

Article ID: 410481


Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

In a vSphere Supervisor environment, when upgrading an existing workload cluster from KR 1.32.x to 1.33.x, the upgrade may stall because the first upgraded control plane VM fails to initialize. In this situation, the commands below show that one control plane node is stuck in the Provisioned state.

While connected to the Supervisor cluster context, the following symptoms are observed:

  • In a single control plane node workload cluster, the second, new control plane node created on the desired 1.33.x version is stuck in Provisioned state:
    kubectl get machine -n <affected cluster namespace> | grep <affected workload cluster-name>
    
    cluster-namespace     <control plane node 1>   affected workload cluster-name   Running        #d#h   v1.32.3+vmware.1-fips
    cluster-namespace     <control plane node 2>   affected workload cluster-name   Provisioned    #m     v1.33.1+vmware.1-fips

     

  • In a three control plane node workload cluster, there is a new, fourth control plane node created on the desired 1.33.x version but it is stuck in Provisioned state:
    kubectl get machine -n <affected cluster namespace> | grep <affected workload cluster-name>
    
    cluster-namespace     <control plane node A>   affected workload cluster-name   Running        #d#h   v1.32.3+vmware.1-fips
    cluster-namespace     <control plane node B>   affected workload cluster-name   Running        #d#h   v1.32.3+vmware.1-fips
    cluster-namespace     <control plane node C>   affected workload cluster-name   Running        #d#h   v1.32.3+vmware.1-fips
    cluster-namespace     <control plane node D>   affected workload cluster-name   Provisioned    #m     v1.33.1+vmware.1-fips

Note: Workload cluster upgrades always begin with control plane nodes and will not proceed to the worker nodes until all control plane nodes have successfully upgraded.

 

While SSH'd into the stuck new control plane node on the desired KR 1.33.x version:

The kube-apiserver container is in an Exited state, and the following error is observed in the container's logs:

crictl ps -a --name kube-apiserver
CONTAINER                 IMAGE        CREATED       STATE       NAME
<api-server-container-id> <IMAGE ID>  <creation time>  Exited     kube-apiserver

crictl logs <api-server-container-id>
Error: unknown flag: --cloud-provider
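As an additional check on the node (a diagnostic sketch; /etc/kubernetes/manifests is the standard kubeadm static pod manifest location), you can confirm whether the rejected flag is still being rendered into the kube-apiserver manifest:

```shell
# On the stuck control plane node: look for the removed flag in the
# kubeadm-generated static pod manifest.
grep -n "cloud-provider" /etc/kubernetes/manifests/kube-apiserver.yaml
```

If the flag appears here, the node will keep restarting the kube-apiserver container with the unsupported argument until the cluster configuration is corrected.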

In some cases, crictl ps -a shows no kube-apiserver container at all, and the kubelet service fails to start:

kubelet.service - kubelet: machine-agent override
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf, 11-resource-sizing.conf
             /etc/systemd/system/kubelet.service.d
             └─99-machine-agent.conf
     Active: inactive (dead) since DAY  YYYY-MM-DD HH:MM:SS UTC; s/min ago
       Docs: https://kubernetes.io/docs/home/
             https://github-vcf.devops.broadcom.net/vcf/vks-machine-agent
    Process: 2449 ExecStartPre=/usr/libexec/kubernetes/kubelet-resource-sizing.sh (code=exited, status=0/SUCCESS)
    Process: 2465 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --system-reserved=cpu=70m,memory=1>
   Main PID: 2465 (code=exited, status=0/SUCCESS)
        CPU: 3.210s

Environment

vSphere Supervisor

VKS 3.4 or above

Cause

The --cloud-provider API server flag was removed starting with Kubernetes v1.33.0.

If this flag is still present in a cluster’s configuration, the API server fails to initialize because it does not recognize the flag, and stale managed-field entries on the KubeadmControlPlane resource can prevent the flag from being removed. A detailed explanation is given in KB: Steps to Perform Before Upgrading to ClusterClass 3.3.0 or KR 1.32.X to 1.33.X

Resolution

This issue is fixed in vSphere Kubernetes Service 3.5.0; the procedure below is not needed if the service is upgraded to v3.5.0 before the cluster upgrade.

To resolve this issue, remove the deprecated --cloud-provider=external flag from the cluster configuration after the upgrade has stalled.

All commands below are executed from the supervisor cluster context. 

Procedure

    1. Connect to the Supervisor cluster context

    2. Identify the cluster that is stuck and note its name:

      kubectl get cluster <affected cluster name> -n <affected cluster namespace>

       

    3. Locate and patch the KubeadmControlPlane (KCP) object for the affected cluster:

      1.  Retrieve the name of the KCP for the affected workload cluster:

        kubectl get kcp -n <affected workload cluster namespace>

         

      2. Review the KCP YAML and verify that there are entries for "f:cloud-provider":

        kubectl get kcp <kcp-name> -n <namespace> -o yaml --show-managed-fields | grep "f:cloud-provider" -A4

        or

        kubectl get kcp <kcp-name> -n <namespace> --show-managed-fields -o jsonpath='{.metadata.managedFields}' | less



        (If only two entries are returned and neither contains ‘f:encryption-provider-config’, skip to step 4.)

        The following is an example; the order may vary in your environment:

                        f:cloud-provider: {}
                        f:enable-admission-plugins: {}
                        f:encryption-provider-config: {}
                        f:profiling: {}
                        f:runtime-config: {}
                        f:tls-cipher-suites: {}
        --
                        f:cloud-provider: {}
                        f:event-qps: {}
                        f:node-ip: {}
                        f:node-labels: {}
                        f:protect-kernel-defaults: {}
                        f:read-only-port: {}
        --
                        f:cloud-provider: {}
                        f:event-qps: {}
                        f:node-ip: {}
                        f:node-labels: {}
                        f:protect-kernel-defaults: {}
                        f:read-only-port: {}

         

      3. From the previous step's output, note which entry has "f:cloud-provider" listed under "fieldsV1"."f:spec"."f:kubeadmConfigSpec"."f:clusterConfiguration"."f:apiServer"."f:extraArgs", and whether that entry appears first, second, or third.
        For instance, in the example output above, the first entry contains this path.

      4. Once you have identified the managed-field entry that fulfills the criteria in step 3.3, determine its zero-based index in the list of managed-field entries. Use this value for REPLACEME in the next steps.
        Note that it is possible that no managed-field entry has "fieldsV1"."f:spec"."f:kubeadmConfigSpec"."f:clusterConfiguration"."f:apiServer"."f:extraArgs"."f:cloud-provider". In such a case, skip to step 4.

        To determine the entry's zero-based index:
        1. If the entry was the first entry, the index is 0.
        2. If the entry was the second entry, the index is 1.
        3. If the entry was the third entry, the index is 2.
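The index lookup can also be done mechanically. The sketch below assumes jq is available on the jump host; it prints the zero-based index of every managedFields entry whose fieldsV1 contains the apiServer extraArgs "f:cloud-provider" path. The here-doc is a reduced sample standing in for the live kubectl output:

```shell
# jq filter that prints the zero-based index of each managedFields entry
# whose fieldsV1 contains "f:spec"..."f:apiServer"."f:extraArgs"."f:cloud-provider".
# Against a live cluster, replace the here-doc sample with:
#   kubectl get kcp <kcp-name> -n <namespace> --show-managed-fields \
#     -o jsonpath='{.metadata.managedFields}'
cat <<'EOF' | jq -c 'to_entries
  | map(select(.value.fieldsV1
      | getpath(["f:spec","f:kubeadmConfigSpec","f:clusterConfiguration",
                 "f:apiServer","f:extraArgs","f:cloud-provider"]) != null)
    | .key)'
[
  {"manager":"manager-a","fieldsV1":{"f:spec":{"f:kubeadmConfigSpec":{"f:clusterConfiguration":{"f:apiServer":{"f:extraArgs":{"f:cloud-provider":{}}}}}}}},
  {"manager":"manager-b","fieldsV1":{"f:spec":{"f:kubeadmConfigSpec":{"f:joinConfiguration":{"f:nodeRegistration":{"f:kubeletExtraArgs":{"f:cloud-provider":{}}}}}}}}
]
EOF
# prints: [0]  (the first entry is the one to use for REPLACEME)
```

An empty result (`[]`) corresponds to the "no matching entry" case above, in which case skip to step 4.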


      5. Run kubectl patch to remove the stale managed-field entry:

        1. Create a backup of the KCP object: 

          kubectl get kcp <kcp-name> -n <affected workload cluster namespace> -o yaml --show-managed-fields > kcp_backup.yaml

           

        2. Perform the patch command with --dry-run:

          IMPORTANT:
          REPLACEME should be replaced with the number determined from the previous step:
          kubectl patch kcp <kcp-name> -n <namespace> --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/REPLACEME"}]' --dry-run=server

          If successful, the following message will be returned:

          kubeadmcontrolplane.controlplane.cluster.x-k8s.io/<kcp-name> patched

           

        3. If the above patch dry-run is successful, perform the below command without the dry-run flag:

          kubectl patch kcp <kcp-name> -n <namespace> --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/REPLACEME"}]'
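For example, if the zero-based index determined in step 3.4 was 0, the dry-run and the actual patch would look like this (names in angle brackets are placeholders):

```shell
kubectl patch kcp <kcp-name> -n <namespace> --type=json \
  -p='[{"op": "remove", "path": "/metadata/managedFields/0"}]' --dry-run=server

kubectl patch kcp <kcp-name> -n <namespace> --type=json \
  -p='[{"op": "remove", "path": "/metadata/managedFields/0"}]'
```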

           

    4. Perform an edit on the KCP to remove the cloud-provider: external line:
      kubectl edit kcp <kcp-name> -n <affected workload cluster namespace>
      
      spec:
        kubeadmConfigSpec:
          clusterConfiguration:
            apiServer:
              extraArgs:
                admission-control-config-file: /etc/kubernetes/extra-config/admission-control-config.yaml
                allow-privileged: "true"
                audit-log-maxage: "30"
                audit-log-maxbackup: "10"
                audit-log-maxsize: "100"
                audit-log-path: /var/log/kubernetes/kube-apiserver.log
                audit-policy-file: /etc/kubernetes/extra-config/audit-policy.yaml
                authentication-token-webhook-config-file: /webhook/webhook-config.yaml
                authentication-token-webhook-version: v1
                client-ca-file: /etc/ssl/certs/extensions-tls.crt
                cloud-provider: external
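If a non-interactive alternative to kubectl edit is preferred, the same line can be removed with a JSON patch (a sketch, assuming the extraArgs map shown above; each map key is one JSON-Pointer path segment):

```shell
# Remove only the cloud-provider key from the apiServer extraArgs map.
kubectl patch kcp <kcp-name> -n <namespace> --type=json \
  -p='[{"op": "remove", "path": "/spec/kubeadmConfigSpec/clusterConfiguration/apiServer/extraArgs/cloud-provider"}]'
```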


    5. Recreate the control plane machine that is stuck on the desired version in Provisioned state:
      1. First, identify the machine:
        kubectl get machine -n <workload cluster namespace> | grep Provisioned
        
        cluster-namespace     <stuck provisioned machine>       affected workload cluster-name   Provisioned    #m    v1.33.1+vmware.1-fips

         

      2. Then, annotate the machine to instruct the system to recreate it:
        kubectl annotate machine -n <workload cluster namespace> <stuck provisioned machine> cluster.x-k8s.io/remediate-machine=""

         

      3. Confirm that the control plane node recreates and reaches Running state:
        kubectl get machine -n <workload cluster namespace>

        (In some instances, the stuck Machine resource is deleted automatically shortly after the KCP is edited, before there is time to annotate the machine for remediation.)

Once all of the above steps are complete, the upgrade will progress as expected.

Preventing the issue on other clusters on vSphere Kubernetes Service v3.4.0 and below

Follow the steps in KB: Steps to Perform Before Upgrading to ClusterClass 3.3.0 or KR 1.32.X to 1.33.X