TKGm Management Cluster upgrade fails due to Providers versions mismatch

Article ID: 374425

Updated On:

Products

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

The Management Cluster upgrade fails with the following error messages:

kubeadm provider's upgrade version vX.X.X is less than current version vX.X.X, so skipping it for upgrade.
cluster-api provider's upgrade version vX.X.X is less than current version vX.X.X, so skipping it for upgrade.
vsphere provider's upgrade version vX.X.X is less than current version vX.X.X, so skipping it for upgrade.
Failed waiting for provider <provider-name> after <>s
Error: error waiting for provider components to be up and running after upgrading them: Error getting provider component: failed to read "<bootstrap-components>.yaml": no such file or directory
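
The "so skipping it for upgrade" messages already contain what the Resolution needs: the provider name, the version the upgrade expects, and the version currently recorded. As a minimal sketch, assuming the messages appear in the upgrade output exactly as worded above, they can be extracted with sed. The function name parse_skipped_providers and the file name upgrade.log are hypothetical and not part of any TKG tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TKG): reads upgrade output on standard input
# and prints "<provider> <expected-version> <current-version>" for every
# "... so skipping it for upgrade." line. All other lines are ignored.
parse_skipped_providers() {
  sed -n -E "s/^(.*[[:space:]])?([^[:space:]]+) provider's upgrade version (v[^[:space:]]+) is less than current version (v[^[:space:],]+),.*/\2 \3 \4/p"
}

# Example (upgrade.log is an assumed file holding the saved upgrade output):
#   parse_skipped_providers < upgrade.log
```

The second field of each output line is the version to set on the matching Provider object(s).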

Cause

This issue has been seen after failed TCA upgrades in which the Management Cluster (MC) was partially upgraded but the upgrade did not complete successfully.
The Providers' versions had already been updated to the target MC version.
Manually triggering the upgrade again then fails because the Providers are already at the updated versions.

Resolution

To resolve the issue, "downgrade" the versions recorded on the Provider objects to the original versions that the upgrade script expects.

  1. From the Management Cluster context, list the Provider objects and their versions:
    # kubectl get provider -A

    For example:

    $ kubectl get provider -A
    NAMESPACE                           NAME                     AGE    TYPE                     PROVIDER      VERSION
    caip-in-cluster-system              ipam-in-cluster          3d5h   IPAMProvider             in-cluster    v0.1.0
    capi-kubeadm-bootstrap-system       bootstrap-kubeadm        3d5h   BootstrapProvider        kubeadm       v1.4.2
    capi-kubeadm-control-plane-system   control-plane-kubeadm    3d5h   ControlPlaneProvider     kubeadm       v1.4.2
    capi-system                         cluster-api              3d5h   CoreProvider             cluster-api   v1.4.2
    capv-system                         infrastructure-vsphere   3d5h   InfrastructureProvider   vsphere       v1.7.0

  2. For each Provider whose version doesn't match the expected original version, edit the object and change the version manually.
    # kubectl edit provider <provider-name> -n <namespace-name>

    For example, if the MC upgrade output contains the message "vsphere provider's upgrade version v1.5.3 is less than current version v1.7.0, so skipping it for upgrade.", downgrade the "infrastructure-vsphere" Provider to v1.5.3.

    # kubectl edit provider infrastructure-vsphere -n capv-system

    After editing the version to v1.5.3, it should look as follows:

    apiVersion: clusterctl.cluster.x-k8s.io/v1alpha3
    kind: Provider
    metadata:
      creationTimestamp: "2024-08-09T05:59:39Z"
      generation: 1
      labels:
        cluster.x-k8s.io/provider: infrastructure-vsphere
        clusterctl.cluster.x-k8s.io: ""
        clusterctl.cluster.x-k8s.io/core: inventory
      name: infrastructure-vsphere
      namespace: capv-system
      resourceVersion: "2544"
      uid: 0b616810-b6b5-4907-ada7-0e09832ca778
    providerName: vsphere
    type: InfrastructureProvider
    version: v1.5.3
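
When several Providers need the same change, the interactive edit can be tedious. The sketch below is a hedged, non-interactive alternative: it reads the output of "kubectl get provider -A" and only prints one "kubectl patch" command for each Provider object whose PROVIDER column matches and whose VERSION differs from the expected one. It assumes the six-column layout shown in step 1 and that a merge patch of the top-level "version" field is equivalent to the manual edit. The function name print_downgrade_commands is hypothetical. Review the printed commands before running any of them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TKG).
# Usage: kubectl get provider -A | print_downgrade_commands <provider> <expected-version>
# Assumes the columns NAMESPACE NAME AGE TYPE PROVIDER VERSION, as in step 1.
# It prints commands only; it does not change anything in the cluster.
print_downgrade_commands() {
  local provider="$1" expected="$2"
  awk -v p="$provider" -v v="$expected" '
    $1 == "NAMESPACE" { next }   # skip the header row
    $5 == p && $6 != v {         # matching provider with an unexpected version
      # \047 is a single quote, used to wrap the JSON patch for the shell
      printf "kubectl patch provider %s -n %s --type merge -p \047{\"version\":\"%s\"}\047\n", $2, $1, v
    }'
}

# Example, matching the vsphere case above:
#   kubectl get provider -A | print_downgrade_commands vsphere v1.5.3
```

With the sample listing from step 1, the example prints a single patch command for infrastructure-vsphere in the capv-system namespace. For the kubeadm provider it would print one command each for bootstrap-kubeadm and control-plane-kubeadm, since both rows carry the same PROVIDER value.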