TKR 1.28.8 Upgrade Fails: Photon OSImage Not Resolved

Article ID: 404403

Updated On:

Products

Tanzu Kubernetes Runtime

Issue/Introduction

An attempt to upgrade a Tanzu Kubernetes Cluster (TKC) to Tanzu Kubernetes Release (TKR) version 1.28.8 failed. Although the cluster was edited successfully to reference the new version, neither the control plane nodes nor the worker nodes rolled out, and the upgrade stalled.

The following log message from the tkr-status-resolver pod on the supervisor confirmed that the issue was related to OSImage resolution:

could not resolve TKR/OSImage for controlPlane, machineDeployments: (workers), query: (controlPlane: (k8sVersionPrefix: 'v1.28.8+vmware.1-fips.1-tkg.2', tkrSelector: '!run.tanzu.vmware.com/legacy-tkx,tkr.tanzu.vmware.com/standard', osImageSelector: 'os-name-photon')
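
To confirm the same symptom in another environment, the pod log can be filtered for this message. The following is a minimal sketch, not a supported tool: the function name filter_osimage_errors is hypothetical, and the match string is copied from the log line quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: filter_osimage_errors is a hypothetical helper name.
# Reads pod log text on stdin and prints only the lines that report an
# OSImage resolution failure. The match string comes from the log message
# quoted in this article and is treated as a fixed string, not a regex.
filter_osimage_errors() {
  grep -F -- "could not resolve TKR/OSImage"
}
```

One way to use it is to pipe the output of kubectl logs <tkr-status-resolver pod> -n <namespace> into filter_osimage_errors. The pod name and namespace are placeholders to fill in for the supervisor in question.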

Environment

Tanzu Kubernetes Runtime

Cause

TKR 1.28.8 introduced a new structure allowing a single TKR to reference multiple OSImages (e.g., Photon and Ubuntu). In this particular case, the TKR object for version 1.28.8 was missing a reference to the Photon OSImage.

As a result, the cluster, which expected a Photon-based OSImage (the default), could not resolve a suitable image for creating new nodes. The missing reference blocked the rollout.
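
A rough way to check whether a TKR object mentions a Photon image is to search its manifest for the OS name. The sketch below is a plain text match and not a parse of the TKR schema. It assumes that the referenced OSImage names contain the OS name (for example "photon"), which may not hold in every release. The function name manifest_mentions_os is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: manifest_mentions_os is a hypothetical helper name.
# Reads a TKR manifest (YAML or JSON) on stdin and succeeds if any line
# mentions the given OS name, ignoring case.
# Assumption: OSImage names referenced by the TKR contain the OS name.
manifest_mentions_os() {
  grep -i -F -- "$1" >/dev/null
}
```

For example, pipe the output of kubectl get tkr <tkr name> -o yaml into manifest_mentions_os photon. A non-zero exit status suggests that the manifest contains no Photon reference and should be inspected by hand.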

Resolution

To resolve the issue:

  1. Locate the desired TKR:
    • kubectl get tkr | grep <tkr version>
  2. Delete the desired TKR:
    • kubectl delete tkr <tkr version>
  3. Confirm that the TKR is recreated after a few minutes:
    • kubectl get tkr | grep <tkr version>
  4. If the TKR does not sync back after it is deleted, restart the vmware-system-tkg-controller-manager pod:
    • kubectl delete pod vmware-system-tkg-controller-manager-<guid> -n <namespace>
  5. A few minutes after the TKR is recreated, check that an OSImage exists for each operating system of that TKR version:
    • kubectl get osimage | grep <tkr version>
  6. Verify that the cluster begins rolling out new control plane and worker nodes and that the upgrade completes.
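
The checks in steps 1, 3 and 5 can be sketched as small shell helpers around the same kubectl commands. This is an illustration under stated assumptions, not a supported script: the function names, the default of 30 attempts, and the 10-second delay are all hypothetical choices, and the version string is whatever was used for <tkr version> above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, attempt counts and delays are hypothetical.
# The kubectl commands are the ones listed in the steps above.

# Steps 1 and 3: succeed if a TKR whose listing contains the version exists.
# grep -F treats the version as a fixed string, so dots are not wildcards.
tkr_exists() {
  kubectl get tkr 2>/dev/null | grep -F -- "$1" >/dev/null
}

# Step 3: poll until the TKR reappears or the attempts run out.
# Args: version string, max attempts (default 30), delay in seconds (default 10).
wait_for_tkr() {
  local version="$1" attempts="${2:-30}" delay="${3:-10}" i=0
  while [ "$i" -lt "$attempts" ]; do
    if tkr_exists "$version"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Step 5: print the OSImage objects whose listing contains the version.
list_osimages() {
  kubectl get osimage 2>/dev/null | grep -F -- "$1"
}
```

After deleting the TKR as in step 2, running wait_for_tkr <tkr version> returns success once the TKR is listed again. If it returns failure, continue with step 4 and then run it again. list_osimages <tkr version> should then show one entry per operating system.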