Unable to upgrade TKG Clusters due to error "version cannot be decreased from"

Article ID: 381348

Products

Tanzu Kubernetes Grid

Issue/Introduction

  • When upgrading a TKG cluster, the upgrade stalls with no progress and eventually fails.
  • The cluster is sometimes also shown in a paused state. Attempts to unpause it or resume the upgrade return an error:

The Cluster "xxxxxxxxx" is invalid: spec.topology.version: Invalid value: "v1.25.7+vmware.2": version cannot be decreased from "1.25.7+vmware.2-tkg.1" to "1.25.7+vmware.2"

Environment

TKG 2.2

Cause

Due to a defect in TKG 2.2, the cluster's spec.topology.version can become malformed, with "-tkg.1" appended to the end of the version string.
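
To confirm that a cluster is affected, it can help to check whether its version string carries the suffix. The following is a minimal sketch, not part of the original procedure: has_tkg_suffix is a hypothetical helper name, and the namespace and cluster name in the commented usage are placeholders.

```shell
# Hypothetical helper: succeeds (exit 0) if the version string
# contains a "-tkg.N" suffix, fails (exit 1) otherwise.
has_tkg_suffix() {
  case "$1" in
    *-tkg.[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed usage against a live cluster (placeholders for namespace and name):
# version=$(kubectl get cluster -n xxxxxxx xxxxxxxx -o jsonpath='{.spec.topology.version}')
# if has_tkg_suffix "$version"; then echo "malformed version: $version"; fi
```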

Resolution

This is resolved in TKG 2.3.

To recover from this while on 2.2:

  1. Pause the cluster

    kubectl patch cluster -n xxxxxxx xxxxxxxx --type merge -p '{"spec":{"paused": true}}'

  2. Temporarily disable the validatingwebhookconfiguration capi-validating-webhook-configuration: back up the existing configuration, then delete it

    kubectl get validatingwebhookconfiguration capi-validating-webhook-configuration -oyaml > capi-webhook-backup.yaml
    kubectl delete validatingwebhookconfiguration capi-validating-webhook-configuration

  3. Correct the cluster's spec.topology.version by removing the "-tkg.1" suffix. For example, change "1.25.7+vmware.2-tkg.1" to "v1.25.7+vmware.2".

    kubectl edit cluster -n xxxxxxx xxxxxxxx

  4. Re-enable the webhook by recreating it from the backup file saved in step 2

    kubectl create -f capi-webhook-backup.yaml

  5. Unpause the cluster

    kubectl patch cluster -n xxxxxxx xxxxxxxx --type merge -p '{"spec":{"paused": false}}'
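
The correction in step 3 amounts to dropping the "-tkg.N" suffix from the version string. As a rough sketch, assuming the value follows the pattern shown in the error message, the corrected string can be derived in the shell before it is typed into the editor. strip_tkg_suffix is a hypothetical helper name, and the commented kubectl lines are an assumed way to read the value back afterwards, not part of the documented steps.

```shell
# Hypothetical helper: remove a trailing "-tkg.N" suffix from a version string.
# Strings without the suffix are returned unchanged.
strip_tkg_suffix() {
  printf '%s\n' "${1%-tkg.*}"
}

strip_tkg_suffix "v1.25.7+vmware.2-tkg.1"   # prints v1.25.7+vmware.2

# Assumed follow-up check after step 5 (placeholders for namespace and name):
# kubectl get cluster -n xxxxxxx xxxxxxxx -o jsonpath='{.spec.topology.version}'
# kubectl get cluster -n xxxxxxx xxxxxxxx -o jsonpath='{.spec.paused}'
```

The read-back commands only display the stored values; they do not modify the cluster.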