According to the release notes, upgrading a VKS cluster from VKr v1.28.15 to VKr v1.29.4 is not supported. Such an upgrade is effectively a “back-in-time upgrade” for some packages, and the fixes included in the v1.28.15 patch are not available in v1.29.4.
If you have already initiated an upgrade from v1.28.15 to v1.29.4 but have not manually unpaused the cluster, follow the KB article "vSphere 8.0 Supervisor Workload Cluster Upgrade Stuck with No Nodes on Desired Upgraded Version" to revert the cluster to v1.28.15.
This KB covers the scenario in which a user manually unpaused the cluster, which triggers a v1.29.4 control plane rollout that gets stuck in NotReady status.
On the Supervisor, both the API server and the tanzu-addons-controller-manager pod log messages similar to the following, indicating a kapp-controller downgrade error.
ClusterBootstrap.run.tanzu.vmware.com <cluster name> is invalid: spec.kapp.refName: Invalid value: \"kapp-controller.tanzu.vmware.com.0.50.0+vmware.1-tkg.1-vmware\": package downgrade is not allowed, original version: 0.50.0+vmware.2-tkg.1-vmware, updated version 0.50.0+vmware.1-tkg.1-vmware
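When triaging collected logs, it can help to pull the two versions out of this message programmatically. The following is a minimal sketch, not an official tool: the function name `find_downgrade` and the regular expression are illustrative assumptions based only on the message format quoted above, and the pattern may need adjusting if your log lines differ.

```python
import re

# Pattern for the downgrade rejection message quoted above.
# Assumption: the message always carries "original version: X, updated version Y".
DOWNGRADE_RE = re.compile(
    r"package downgrade is not allowed, "
    r"original version: (?P<original>[^,\s]+), "
    r"updated version:? (?P<updated>\S+)"
)


def find_downgrade(line):
    """Return (original, updated) if the line reports a blocked downgrade, else None."""
    match = DOWNGRADE_RE.search(line)
    if match is None:
        return None
    return match.group("original"), match.group("updated")


# Sample text copied from the error message shown in this article.
sample = (
    "spec.kapp.refName: Invalid value: "
    '"kapp-controller.tanzu.vmware.com.0.50.0+vmware.1-tkg.1-vmware": '
    "package downgrade is not allowed, "
    "original version: 0.50.0+vmware.2-tkg.1-vmware, "
    "updated version 0.50.0+vmware.1-tkg.1-vmware"
)

result = find_downgrade(sample)
if result is not None:
    print("installed:", result[0])
    print("requested:", result[1])
```

Lines that do not contain the rejection message return `None`, so the helper can be applied to every line of a log export without extra filtering.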
However, if you followed "vSphere 8.0 Supervisor Workload Cluster Upgrade Stuck with No Nodes on Desired Upgraded Version" to revert the cluster, or took any other action that involves deleting the validatingwebhookconfiguration clusterbootstrap-validating-webhook-configuration, the error may no longer appear in the tanzu-addons-controller-manager pod logs. Instead, the kapp-controller PackageInstall <cluster name>-kapp-controller may start reporting the following error.
status:
  conditions:
  - message: Error (see .status.usefulErrorMessage for details)
    status: "True"
    type: ReconcileFailed
  friendlyDescription: 'Reconcile failed: Error (see .status.usefulErrorMessage for
    details)'
  lastAttemptedVersion: 0.50.0+vmware.2-tkg.1-vmware
  observedGeneration: 2
  usefulErrorMessage: |-
    Stopped installing matched version '0.50.0+vmware.1-tkg.1-vmware' since last attempted version '0.50.0+vmware.2-tkg.1-vmware' is higher.
    hint: Add annotation packaging.carvel.dev/downgradable: "" to PackageInstall to proceed with downgrade
  version: 0.50.0+vmware.1-tkg.1-vmware
VKS v3.3.2 and below
Upgrading a VKS cluster from v1.28.15 to v1.29.4 effectively downgrades the kapp-controller package, because v1.28.15 was released after v1.29.4. During the upgrade to v1.29.4, the addon-manager pauses the cluster to upgrade the addons but fails after detecting the version downgrade. As a result, the cluster remains in a paused state.
If the cluster is manually unpaused, the upgrade proceeds to v1.29.4 without upgrading the corresponding kapp-controller addon. The rollout then fails because the non-upgraded kapp-controller addon is incompatible with the upgraded node(s).
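To illustrate why this counts as a downgrade, the sketch below compares the two kapp-controller version strings from the error messages. It is an assumption-laden illustration only: the helper names `version_key` and `is_downgrade` are hypothetical, and comparing the numeric fields left to right is a simplification, not the comparison logic that kapp-controller or the addon-manager actually implements.

```python
import re


def version_key(version):
    """Turn '0.50.0+vmware.2-tkg.1-vmware' into a sortable tuple of its numeric fields.

    Assumption: only the numeric fields matter and they are compared
    left to right. This is a simplification for illustration.
    """
    return tuple(int(field) for field in re.findall(r"\d+", version))


def is_downgrade(installed, requested):
    """Return True when the requested version sorts below the installed one."""
    return version_key(requested) < version_key(installed)


# Version strings taken from the error messages in this article.
installed = "0.50.0+vmware.2-tkg.1-vmware"  # currently installed (original version)
requested = "0.50.0+vmware.1-tkg.1-vmware"  # requested by the v1.29.4 upgrade

print(version_key(installed))
print(version_key(requested))
print("downgrade:", is_downgrade(installed, requested))
```

The two strings differ only in the `vmware.N` build field, which is why the newer Kubernetes release ends up requesting the older kapp-controller package.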
Please contact Broadcom support for assistance.
This type of “back-in-time upgrade” should not occur in VKS 3.3.3 or later, as a webhook constraint was introduced to prevent it.