In air-gapped VMware Cloud Foundation or vSphere environments, upgrading the vSphere Kubernetes Service may result in a permanent reconciliation deadlock. This occurs if the service bundles were relocated to a private registry using docker pull/push or imgpkg pull/push instead of the recursive imgpkg copy method.
Symptoms include:
The svc-tkg Supervisor Service remains in a "Reconciling" or "Degraded" state indefinitely.
Sub-packages fail to pull images, referencing public registry URLs (e.g., projects.registry.vmware.com) despite the environment being isolated.
The kapp-controller fails to process corrected configuration changes even after the image registry paths are remediated.
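The second symptom can be checked by searching the child resources for the public registry host. The following is a minimal sketch, assuming the child packages appear as PackageInstall and App resources in the domain namespace. The function name and the KUBECTL override variable are illustrative and are not part of any VMware tooling.

```shell
# Hypothetical helper: print every line of the pkgi/app objects in a namespace
# that still mentions a public registry host.
# KUBECTL is an illustrative override hook; it defaults to kubectl.
find_stale_registry_refs() {
  ns="$1"                                      # e.g. svc-tkg-domain-c####
  host="${2:-projects.registry.vmware.com}"    # public host named in the symptoms
  "${KUBECTL:-kubectl}" get pkgi,app -n "$ns" -o yaml | grep -n -F -- "$host"
}
```

Calling find_stale_registry_refs with the domain namespace prints the matching lines with their line numbers. It returns a non-zero status when no stale reference is found.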
VMware Cloud Foundation
vSphere Kubernetes Service
Air-gapped / Disconnected deployments
This issue is caused by the way kapp-controller prioritizes reconciliation. When a PackageInstall specification is not modified (e.g., the version remains the same), the controller treats its reconciliation as a "no-op". In an air-gapped environment where sub-packages contain stale public registry references, these "no-op" reconciliation attempts fail. Because the controller schedules these tasks in its first reconciliation batch, the updated Package data (containing the fix) is never processed, and the service stays in the failed state indefinitely.
To resolve this deadlock, force kapp-controller to re-evaluate the child PackageInstalls, either by modifying their metadata with kubectl or by running the package override script on the Supervisor control plane. Both methods are described below, starting with the kubectl method.
Execute the following command to stop the current reconciliation attempts:
kubectl patch pkgi/svc-tkg.vsphere.vmware.com \
--namespace vmware-system-supervisor-services \
--type='json' \
--patch='[{"op": "add", "path": "/spec/paused", "value":true}]'
Identify the <Domain Namespace> (e.g., svc-tkg-domain-c####) and apply a dummy label to all child packages to force a state change:
kubectl patch pkgi -n <Domain Namespace> --type='merge' --patch '{"metadata":{"labels":{"fix-trigger":"true"}}}' tanzu-addons-manager
kubectl patch pkgi -n <Domain Namespace> --type='merge' --patch '{"metadata":{"labels":{"fix-trigger":"true"}}}' tkg-controller
# Repeat for all child packages: tanzu-cliplugins, tanzu-cluster-api, runtime-extension, tkr-service, etc.
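Instead of repeating the command for each package, the same merge patch can be applied in a loop. This is a sketch only. It assumes the domain namespace contains nothing but the service's child PackageInstalls, because it patches every pkgi it finds there. The function name and the KUBECTL override variable are illustrative.

```shell
# Sketch: apply the dummy label to every PackageInstall in the domain namespace.
# KUBECTL is an illustrative override hook; it defaults to kubectl.
label_child_packages() {
  ns="$1"   # the <Domain Namespace>, e.g. svc-tkg-domain-c####
  "${KUBECTL:-kubectl}" get pkgi -n "$ns" -o name | while read -r pkgi; do
    # $pkgi is in type/name form, which kubectl patch accepts directly
    "${KUBECTL:-kubectl}" patch "$pkgi" -n "$ns" --type='merge' \
      --patch '{"metadata":{"labels":{"fix-trigger":"true"}}}' </dev/null
  done
}
```

Run it as label_child_packages "<Domain Namespace>" while the parent PackageInstall is still paused.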
Unpause the parent PackageInstall:
kubectl patch pkgi/svc-tkg.vsphere.vmware.com \
--namespace vmware-system-supervisor-services \
--type='json' \
--patch='[{"op": "remove", "path": "/spec/paused"}]'
Toggle the paused status of the App resource to trigger an immediate update:
kubectl patch apps/svc-tkg.vsphere.vmware.com -n vmware-system-supervisor-services --type='json' -p='[{"op": "add", "path": "/spec/paused", "value":true}]'
kubectl patch apps/svc-tkg.vsphere.vmware.com -n vmware-system-supervisor-services --type='json' -p='[{"op": "remove", "path": "/spec/paused"}]'
Monitor the status until the DESCRIPTION field no longer shows "Reconciling":
kubectl get app svc-tkg.vsphere.vmware.com -n vmware-system-supervisor-services
Alternatively, log in to the Supervisor control plane console via an SSH session.
Run the built-in package override script to point the package at the relocated image in the private registry:
/usr/lib/vmware-wcp/override-package-image.sh \
-p "tkg.vsphere.vmware.com.3.5.1-embedded+v1.34" \
-i "my-registry.example.com/tkg-svs/package/tkg-service:3.5.1"
The script accepts the following options:
-p, --package-name: Name of the package to override (e.g., tkg.vsphere.vmware.com.3.5.1-embedded+v1.34)
-i, --image: The corrected image URL in the target private registry
-n, --namespace: The namespace of the package (defaults to vmware-system-supervisor-services)
--dry-run: Validates the input and displays the resulting changes without writing them
-h, --help: Displays usage information
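Because the script supports --dry-run, one cautious pattern is to preview the change and apply it only after confirmation. The wrapper below is a sketch. The function name and the OVERRIDE_SCRIPT variable are illustrative, and it assumes the script exits with a non-zero status when the dry run fails, which this article does not confirm.

```shell
# Sketch: preview the override with --dry-run, then apply after confirmation.
# OVERRIDE_SCRIPT is an illustrative variable; it defaults to the path used above.
override_with_preview() {
  pkg="$1"     # value passed to -p
  image="$2"   # value passed to -i
  script="${OVERRIDE_SCRIPT:-/usr/lib/vmware-wcp/override-package-image.sh}"
  # Assumption: a failed dry run returns non-zero.
  "$script" -p "$pkg" -i "$image" --dry-run || return 1
  printf 'Apply this change? [y/N] '
  read -r answer
  if [ "$answer" = "y" ]; then
    "$script" -p "$pkg" -i "$image"
  fi
}
```

For example: override_with_preview "tkg.vsphere.vmware.com.3.5.1-embedded+v1.34" "my-registry.example.com/tkg-svs/package/tkg-service:3.5.1"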
Always use imgpkg copy for air-gapped bundle relocation to ensure all nested image references are correctly updated to the target private registry.
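A typical relocation with imgpkg copy exports the bundle to a tar file on a connected host and imports it into the private registry from inside the isolated network. The sketch below wraps those two steps. The function names, the IMGPKG override variable, and the argument values in the usage note are placeholders; substitute the bundle reference and repository for your environment.

```shell
# Sketch: relocate a bundle across an air gap with imgpkg copy.
# IMGPKG is an illustrative override hook; it defaults to imgpkg.

# Run on a host that can reach the source registry.
# $1 = source bundle reference, $2 = output tar path
export_bundle() {
  "${IMGPKG:-imgpkg}" copy -b "$1" --to-tar "$2"
}

# Run inside the air-gapped network after transferring the tar file.
# $1 = tar path, $2 = destination repository in the private registry
import_bundle() {
  "${IMGPKG:-imgpkg}" copy --tar "$1" --to-repo "$2"
}
```

For example, export_bundle "<source bundle>" /tmp/bundle.tar followed by import_bundle /tmp/bundle.tar "my-registry.example.com/<repository>". Because imgpkg copy relocates the nested images together with the bundle, the references resolve against the private registry after import.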