After upgrading from TMC Self-Managed version 1.4.0 to 1.4.1, guest clusters may fail to reconcile TMC components and extensions, resulting in pods entering an ImagePullBackOff state in the vmware-system-tmc namespace.
This issue can affect multiple TMC extensions deployed to the guest cluster.
The problem arises when the upgraded system attempts to pull container images for these extensions but fails due to incorrect or outdated image registry paths.
Symptoms:
- kubectl get pods -n vmware-system-tmc shows ImagePullBackOff status for multiple pods
- kubectl describe pod <name> reveals failed image pulls from harbor.<domain>:8443/tmc/498533941640.dkr.ecr.us-west-2.amazonaws.com/...
- The new registry path (tap-tmc-docker-virtual.usw1.packages.broadcom.com) is not being used consistently
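To see at a glance which registry each pod is pulling from, a one-liner along these lines can be used (container image paths will vary by extension):

kubectl -n vmware-system-tmc get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'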
TMC Self-Managed 1.4.1 introduced a change to the default registry path for extension images, migrating from:
498533941640.dkr.ecr.us-west-2.amazonaws.com
to:
tap-tmc-docker-virtual.usw1.packages.broadcom.com
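In a Harbor-fronted deployment such as the one above, the effective image references change shape roughly as follows (trailing paths and tags depend on the extension):

Old: harbor.<domain>:8443/tmc/498533941640.dkr.ecr.us-west-2.amazonaws.com/...
New: harbor.<domain>:8443/tmc/tap-tmc-docker-virtual.usw1.packages.broadcom.com/...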
However, during the upgrade process from version 1.4.0 to 1.4.1, not all components and extension metadata are updated to reflect the new registry path. As a result, some extension records retain stale references to the old registry while others point at the new one. This manifests as ImagePullBackOff for the affected pods and partial failures across extension lifecycle management.
This issue is resolved in TMC Self-Managed 1.4.2. Customers are encouraged to upgrade to 1.4.2 to permanently avoid this behavior.
For environments still running 1.4.1 and experiencing this issue, the following workaround may be applied:
Restart the cluster-agent-service-server deployment in the tmc-local namespace to trigger extension metadata reconciliation and correct the registry references.
Run:
kubectl -n tmc-local rollout restart deploy cluster-agent-service-server
This forces the cluster-agent-service to refresh the extension image registry paths and resync with the correct container image references. Within 5-10 minutes, guest cluster components should begin pulling images from the correct registry, and the failed pods should recover.
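Progress can be verified with standard kubectl checks, for example:

kubectl -n tmc-local rollout status deploy cluster-agent-service-server
kubectl -n vmware-system-tmc get pods -w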
Note:
This workaround does not delete or reset cluster state. It only triggers reconciliation logic that corrects stale registry path entries in the service’s internal metadata.
Querying the extension registry mappings in the cluster-agent-service database reveals mixed image registry references across extensions within the same cluster.
To confirm, run the following SQL query against the cluster-agent-service database:
Connect to the database:
psql $(kubectl -n tmc-local get secrets cluster-agent-postgres-creds -o json | jq -r '.data.PGURL | @base64d | sub("postgres-postgresql"; "127.0.0.1") | sub("5432"; "15432")')
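The substitutions in this command rewrite the stored connection string to point at 127.0.0.1:15432, which presumes a local port-forward to the Postgres service. Assuming the service is named postgres-postgresql (as the sub() call implies), the following command, run in a separate terminal, sets that up:

kubectl -n tmc-local port-forward svc/postgres-postgresql 15432:5432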
Run the following query:
select name, cluster_name, image_registry from extension;
You may observe a mix of:
harbor.<domain>:8443/tmc/498533941640.dkr.ecr.us-west-2.amazonaws.com
harbor.<domain>:8443/tmc/tap-tmc-docker-virtual.usw1.packages.broadcom.com
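To gauge how many extensions still reference the stale registry, a grouped count against the same table can help (illustrative query; assumes the extension table and image_registry column shown above):

select image_registry, count(*) from extension group by image_registry;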