Kapp package fails to update/install with: Stopped installing matched version
search cancel

Kapp package fails to update/install with: Stopped installing matched version

book

Article ID: 439376

calendar_today

Updated On:

Products

Tanzu Kubernetes Runtime

Issue/Introduction

When attempting to revert or downgrade core components in a vSphere Kubernetes Service (VKS) Guest Cluster, the PackageInstall (e.g., tkgs-npr-main-kapp-controller) fails to reconcile with the following error:

Stopped installing matched version '0.50.0+vmware.1-tkg.3-vmware' since last attempted version '0.55.0+vmware.1-fips-tkg.1' is higher.
hint: Add annotation packaging.carvel.dev/downgradable: "" to PackageInstall to proceed with downgrade

After applying the required packaging.carvel.dev/downgradable annotation to bypass the version check, the reconciliation remains stuck, eventually timing out with:

kapp: Error: Timed out waiting after 30s for resources: apiservice/v1alpha1.data.packaging.carvel.dev (apiregistration.k8s.io/v1) cluster
deployment/kapp-controller (apps/v1) namespace: tkg-system


In this state, the kapp-controller deployment in the Guest Cluster shows 0/1 replicas available, and no pods are present in the tkg-system namespace.

Environment

VMware vSphere Kubernetes Service

Cause

The reconciliation deadlock is caused by a two-stage failure:

  1. Version Protection: Carvel kapp-controller prevents downgrades by default to avoid accidental state corruption, requiring a manual annotation.

  2. Admission Controller Blockade: Once the downgrade is forced, the kapp-controller deployment attempts to rotate pods. However, a Kyverno ClusterPolicy (e.g., check-image-registry) intended to restrict image pulls to corporate registries contains a syntax error in its exclusion list. The pattern tkg-system-* fails to match the literal tkg-system namespace.

Consequently, the Kyverno webhook denies the creation of the new kapp-controller pods because they attempt to pull images from the internal local registry (localhost:5000). Without running pods, the v1alpha1.data.packaging.carvel.dev APIService cannot initialize, leading to the 30-second timeout.

Resolution

1. Enable Downgrade (Supervisor Context)
Annotate the PackageInstall on the Supervisor Cluster to permit the version rollback

2. Correct Kyverno Policy on the guest cluster
kubectl edit clusterpolicy <policy-name>
For webhooks that prevent image pulls and pod creation based on given namespaces, allow the following namespaces that are integral to VKS cluster lifecycle events:
kube-system
vmware-system-antrea
vmware-system-auth
vmware-system-cloud-provider
vmware-system-csi
tkg-system
secretgen-controller
vmware-system-supervisor-services
The namespace that houses VKS components which is unique to each environment and can be retrieved with:

 kubectl get ns | grep svc-tkg

3. Reset Deployment State (Guest Cluster)
kubectl rollout restart deployment <deployment-name> -n <namespace>