Workload Cluster Upgrade Stuck on builtin-generic-v3.3.0 (or higher) clusterClass due to Volume Mount Conflicts


Article ID: 415075


Products

Tanzu Kubernetes Runtime
VMware vSphere Kubernetes Service

Issue/Introduction

A workload cluster upgrade is stuck: a new control plane machine on the desired KR version remains in the Provisioning or Provisioned state.

Upgrades begin with upgrading all control plane machines to the desired KR version.

Note: Upgrades do not progress to upgrading the worker machines until all control plane machines are healthy on the desired KR version.

 

While connected over SSH directly to the stuck Provisioning/Provisioned new control plane machine, error messages similar to the following are present:

cat /var/log/cloud-init-output.log

error applying: error applying task .mount: unit with key .mount could not be enabled 
Unit /run/systemd/generator/.mount is transient or generated. 
Failed to run module scripts_user (scripts in /var/lib/cloud/instance/scripts)
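
To inspect the conflicting generated mount unit directly on the node, commands similar to the following can be used (the exact unit name is derived from the mount path and will vary by environment):

ls /run/systemd/generator/*.mount
systemctl cat <mount unit name>.mount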

See the following documentation: SSH to TKG Service Cluster Nodes as the System User Using a Password

 

While connected to the Supervisor cluster context, the following symptoms are observed:

  • The cluster is on the desired KR version:
    kubectl get cluster -n <affected workload cluster namespace>
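
    Example output (names and versions are placeholders; exact columns vary by Cluster API version):
    NAME                      CLUSTERCLASS             PHASE         AGE   VERSION
    <workload cluster name>   builtin-generic-v3.3.0   Provisioned   30d   <desired KR version>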

     

  • One or more machines on the desired KR version are stuck in the Provisioned or Provisioning state:
    kubectl get machines -n <affected workload cluster namespace>
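
    Example output (placeholders shown; note the stuck machine has not yet reported a NODENAME or PROVIDERID):
    NAME                            CLUSTER                   NODENAME      PROVIDERID         PHASE         AGE   VERSION
    <new control plane machine>     <workload cluster name>                                    Provisioned   2h    <desired KR version>
    <existing control plane node>   <workload cluster name>   <node name>   vsphere://<uuid>   Running       30d   <previous KR version>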

     

  • The only machines on the desired KR version are stuck in the Provisioning or Provisioned state.
    • These machines are recreated in a loop while remaining in the Provisioning or Provisioned state (by default, every 120 minutes)


  • Describing a machine stuck in Provisioning shows messages similar to the following:
    kubectl describe machine -n <workload cluster namespace> <stuck Provisioning node name>
    
    * VirtualMachineProvisioned: Provisioning
    * NodeHealthy: Waiting for VSphereMachine to report spec.providerID
    
    machinehealthcheck-controller  Machine <workload cluster namespace>/<stuck Provisioning node name> has unhealthy Node
    

     

  • Describing the KubeadmControlPlane (KCP) for the affected workload cluster shows messages similar to the following:
    Note: This symptom is present when the Provisioning machine is a control plane machine.
    kubectl get kcp -n <affected workload cluster namespace>
    
    kubectl describe kcp -n <affected workload cluster namespace> <workload cluster kcp name>
    
    
    Message:               Rolling 3 replicas with outdated spec (1 replicas up to date)
    
     Message:               * Machine <stuck Provisioning control plane node>
      * EtcdMemberHealthy: Waiting for a Node with spec.providerID vsphere://<provider ID> to exist
    * Etcd member <stuck Provisioning control plane node> does not have a corresponding Machine
    
    Message:               Scaling down control plane to 3 replicas (actual 4)

Environment

vSphere Supervisor

Originally deployed on vSphere 7 or 8.0 GA, then later upgraded directly to vSphere 8.0u2 (or higher), skipping vSphere 8.0u1

Workload Cluster using or upgrading to builtin-generic-v3.3.0 (or higher) clusterClass

Cause

This issue is related to legacy field ownership conflicts from Server-Side Apply. Further details can be found in the KB article Workload Cluster Upgrade Stuck on New Node Stuck Provisioning due to Legacy Fields.

In this scenario, the workload cluster's KubeadmControlPlane resource contains:

  • `spec.kubeadmConfigSpec.mounts`, which defines volume mounts and is processed by cloud-init
  • References to `machineadm` in `spec.kubeadmConfigSpec.preKubeadmCommands`, which attempt to configure the same volume mounts and are processed by machineadm


This dual configuration occurs because of the following factors:

  • There are legacy managed fields from deprecated API versions (v1beta1) on the resource
  • The topology reconciler cannot clean up the conflicting `mounts` field due to field ownership boundaries as explained in the KB noted above.
  • Both cloud-init (processing `mounts`) and machineadm (processing `preKubeadmCommands`) attempt to configure the same volume mounts which causes systemd unit conflicts on the node.


The systemd error occurs because the mount unit is defined in two different ways, which creates a conflict (see the illustrative snippet after this list):

  • Defined once by cloud-init
  • Defined once by machineadm
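
As an illustration only (the volume label, mount path, and command below are hypothetical placeholders, not values from a real cluster), the two competing definitions can appear in the KubeadmControlPlane spec as follows:

spec:
  kubeadmConfigSpec:
    mounts:                                    # rendered into cloud-init's mounts module
    - - LABEL=<volume label>                   # hypothetical volume label
      - <mount path>
    preKubeadmCommands:
    - <machineadm command configuring the same mount>   # processed by machineadm

Removing the `mounts` entry (as described in the Resolution below) leaves machineadm as the single owner of the mount configuration.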

Resolution

Note: This issue occurs specifically on vSphere 8.0u2 or higher environments where the upgrade to vSphere 8.0u1 was skipped.

 


This issue is fixed in vSphere Kubernetes Service v3.5.0.

Once the environment is on VKS v3.5.0, the workaround below does not need to be applied to workload clusters that have not yet been upgraded.

 

For Future Workload Cluster Upgrades

See the Additional Information section from the below KB article:

Workload Cluster Upgrade Stuck on New Node Stuck Provisioning due to Legacy Fields

 

For Workload Clusters Stuck Upgrading Affected By This Issue - Workaround

  1. Connect to the Supervisor cluster context

  2. Temporarily disable the machinehealthcheck (mhc) for the affected workload cluster:
    • The system uses the MHC to determine when to recreate a failing workload cluster node; checks run roughly every 10 to 15 minutes.
      kubectl patch cluster.v1beta1.cluster.x-k8s.io <workload cluster name> -n <workload cluster namespace> --type=merge -p='{"spec":{"topology":{"controlPlane":{"machineHealthCheck":{"enable":false}}}}}'

       

  3. Confirm the machineHealthCheck enable flag was successfully set to false on the affected workload cluster:
    kubectl get cluster.v1beta1.cluster.x-k8s.io <workload cluster name> -n <workload cluster namespace> -o yaml | grep -A2 machineHealthCheck
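
    Example output when the patch has taken effect:
        machineHealthCheck:
          enable: false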

     

  4. Locate any conflicting fields in the KubeadmControlPlane (kcp) for the affected workload cluster:
    kubectl get kcp -n <workload cluster namespace>
    
    kubectl get kcp <kcp name> -n <workload cluster namespace> -o jsonpath='{.spec.kubeadmConfigSpec.mounts}'
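
    If the conflicting field is present, the second command prints a non-empty JSON array similar to the following (values are illustrative):
    [["LABEL=<volume label>","<mount path>"]]

    If the command returns no output, the `mounts` field is not set and the cleanup in steps 5 and 6 is not needed.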

     

  5. If any conflicting fields are present, take a backup of the KubeadmControlPlane (kcp):
    kubectl get kcp <kcp name> -n <workload cluster namespace> -o yaml > kcp_backup.yaml

     

  6. Clean up the conflicting fields on the KubeadmControlPlane (kcp) for the affected workload cluster:
    kubectl patch kcp <kcp name> -n <workload cluster namespace> --type=json -p='[{"op": "remove", "path": "/spec/kubeadmConfigSpec/mounts"}]'
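
    To confirm the removal, re-run the check from step 4; it should now return no output:
    kubectl get kcp <kcp name> -n <workload cluster namespace> -o jsonpath='{.spec.kubeadmConfigSpec.mounts}'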

     

  7. Follow the steps under Additional Information in the below KB article: Workload Cluster Upgrade Stuck on New Node Stuck Provisioning due to Legacy Fields
  8. IMPORTANT: Re-enable machineHealthChecks on the affected workload cluster:
    kubectl patch cluster <workload cluster name> -n <workload cluster namespace> --type=merge -p='{"spec":{"topology":{"controlPlane":{"machineHealthCheck":{"enable":true}}}}}' 

     

  9. Instruct the system to recreate the stuck Provisioned control plane machine:
    kubectl annotate machine -n <workload cluster namespace> <stuck provisioned machine name> cluster.x-k8s.io/remediate-machine=""

    Note: The machine may be recreated automatically before the annotation needs to be applied.

  10. Monitor to confirm that the newly created node reaches Running state:
    watch kubectl get machine,vm -o wide -n <workload cluster namespace>
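
    A healthy end state shows the new control plane machine in Running phase with a populated NODENAME and PROVIDERID, similar to the following (placeholders shown):
    NAME                          CLUSTER                   NODENAME      PROVIDERID         PHASE     AGE   VERSION
    <new control plane machine>   <workload cluster name>   <node name>   vsphere://<uuid>   Running   10m   <desired KR version>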