When a cluster upgrade is interrupted by an infrastructure issue or other factor, the upgrade can stall. If you then examine the cluster object, you will see that it is labeled with the upgraded Tanzu Kubernetes Release (TKR) version, while the nodes in the cluster are still running the previous release.
tanzu cluster list -o yaml
- name: clustertest
  namespace: default
  status: running
  plan: prod
  controlplane: 1/1
  workers: 3/3
  kubernetes: v1.28.7+vmware.1
  roles: []
  labels:
    tanzuKubernetesRelease: v1.28.7---vmware.1
    tkg.tanzu.vmware.com/cluster-name: clustertest
kubectl get nodes
NAME                          STATUS   ROLES           AGE   VERSION
clustertest-control-plane-1   Ready    control-plane   10d   v1.28.3+vmware.1
clustertest-worker-1          Ready    <none>          10d   v1.28.3+vmware.1
clustertest-worker-2          Ready    <none>          10d   v1.28.3+vmware.1
clustertest-worker-3          Ready    <none>          10d   v1.28.3+vmware.1
This issue can occur when a Tanzu Kubernetes Grid (TKGm) upgrade is interrupted partway through. When this happens, the cluster may also report an 'upgradeStalled' status.
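To see how far the rollout progressed before it stalled, you can also inspect the Cluster API machine objects from the management cluster context. This is a generic Cluster API check rather than a TKG-specific one, and the output columns vary by Cluster API version; the namespace below matches the example cluster:
# Shows whether replacement machines were created for the upgrade and their current phase
kubectl get machines -n default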
Once the root cause of the interruption has been resolved, a potential workaround involves updating the cluster object to reflect the current version that the nodes are running. This should allow you to trigger the upgrade process again.
1). From the management cluster context, list cluster objects:
kubectl get cluster
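If the workload cluster was created in a namespace other than default, list across all namespaces instead (a standard kubectl flag):
# List Cluster API cluster objects in every namespace
kubectl get cluster -A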
2). Take a backup of the cluster object before making any changes:
kubectl get cluster clustertest -o yaml > clusterObjectBackup.yaml
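To confirm the backup captured the label you are about to change, you can search the file for the TKR label (the filename comes from the previous command):
# Show the current tanzuKubernetesRelease label recorded in the backup
grep -n tanzuKubernetesRelease clusterObjectBackup.yaml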
3). Edit the cluster object and update the labels.tanzuKubernetesRelease value to the TKR version the nodes are currently running (that is, the version they were using before the attempted upgrade):
kubectl edit cluster clustertest
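Using the example above, where the nodes are still on v1.28.3, the labels section of the cluster object would be edited to read as follows (values are taken from the sample output; substitute your own pre-upgrade TKR):
labels:
  tanzuKubernetesRelease: v1.28.3---vmware.1
  tkg.tanzu.vmware.com/cluster-name: clustertest
Note that label values use --- in place of + because + is not a valid character in Kubernetes label values. If you prefer a non-interactive change, the same edit can be made with kubectl label (--overwrite replaces the existing value):
# Set the TKR label back to the version the nodes are running
kubectl label cluster clustertest tanzuKubernetesRelease=v1.28.3---vmware.1 --overwrite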
4). After updating the cluster object, you should be able to trigger the upgrade process again.
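For example, retry the upgrade from the machine where the tanzu CLI is configured, then confirm that the node versions converge on the target release (the exact upgrade invocation can vary by TKG version; tanzu cluster upgrade is the usual entry point):
# Retry the interrupted upgrade
tanzu cluster upgrade clustertest
# After the rollout completes, all nodes should report the target version
kubectl get nodes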