1. The Supervisor cluster status shows "configuring" in the vSphere UI under Workload Management.
2. All pods and services inside the Supervisor are up and running.
3. Running the command "kubectl get ns" shows the following namespaces stuck in the "terminating" state: vmware-system-capw, vmware-system-pkgs, vmware-system-tkg and vmware-system-ucs.
4. No finalizers are set on any API resources in these namespaces, and no resource is stuck in "deleting".
5. No HTTP or HTTPS proxy is configured for the Supervisor or for the vCenter Server itself.
6. Describing a namespace stuck in "terminating" shows that the webhooks cannot be called, as below:
NamespaceDeletionContentFailure True ContentDeletionFailed Failed to delete all resource types, 4 remaining: Internal error occurred: failed calling webhook "capi.validating.tanzukubernetescluster.run.tanzu.vmware.com": failed to call webhook: Post "https://vmware-system-tkg-webhook-service.vmware-system-tkg.svc:443/capi-validate?timeout=10s": service "vmware-system-tkg-webhook-service" not found, Internal error occurred: failed calling webhook "capi.validating.tanzukubernetescluster.run.tanzu.vmware.com": failed to call webhook: Post "https://vmware-system-tkg-webhook-service.vmware-system-tkg.svc:443/capi-validate?timeout=10s": service "vmware-system-tkg-webhook-service" not found, Internal error occurred: failed calling webhook "capi.validating.tanzukubernetescluster.run.tanzu.vmware.com": failed to call webhook: Post "https://vmware-system-tkg-webhook-service.vmware-system-tkg.svc:443/capi-validate?timeout=10s": service "vmware-system-tkg-webhook-service" not found, Internal error occurred: failed calling webhook "utkg.clusterclass.validating.clusterclass.run.tanzu.vmware.com": failed to call webhook: Post "https://vmware-system-tkg-webhook-service.vmware-system-tkg.svc:443/utkg-clusterclass-validate-cluster-x-k8s-io-v1beta1-clusterclass?timeout=10s": service "vmware-system-tkg-webhook-service" not found.
7. According to the wcp log, the service cannot find the Supervisor Service package in packageInstalls, which stalls the upgrade and leaves the Supervisor stuck in "configuring":
error wcp [controller/core_service_controller.go:585] [opID=CoreServiceController] error registering core services: error creating spec for registering core service 'sample-pkg.test.carvel.dev': open /etc/vmware/wcp/supervisorservices/packages/sample-pkg.test.carvel.dev-1.0.0.yaml: no such file or directory
info wcp [controller/core_service_controller.go:141] [opID=CoreServiceController] Reconciling core services on all Supervisors
debug wcp [kubelifecycle/kube_instance_grouped_conditions.go:351] [opID=CoreServiceController] No Core Services found to set condition.
debug wcp [controller/image_registry_controller.go:80] [opID=ContainerImageRegistryController] synchronizing Container Image Registries to all Supervisors
debug wcp [logger/trace.go:77] [opID=ContainerImageRegistryController] [BEGIN] [supervisor/controller.(*ContainerImageRegistryController).syncImageRegistriesToSupervisors:130] synchronizing Container Image Registries onto Supervisor clusters
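The two error signatures above can be pulled out of raw text mechanically when triaging several Supervisors. The following is a minimal sketch, not an official tool: it assumes the namespace condition message and the wcp log lines have already been collected as plain strings, and the function names are hypothetical.

```python
import re

# Matches one 'failed calling webhook' entry in a namespace condition message.
WEBHOOK_RE = re.compile(
    r'failed calling webhook "(?P<webhook>[^"]+)"'
    r'.*?service "(?P<service>[^"]+)" not found'
)

# Matches the wcp error raised when a core service package file is missing.
CORE_SERVICE_RE = re.compile(
    r"error creating spec for registering core service '(?P<pkg>[^']+)': "
    r"open (?P<path>\S+): no such file or directory"
)


def missing_webhook_services(condition_message):
    """Return {webhook name: missing service name} found in the message."""
    return {
        m.group("webhook"): m.group("service")
        for m in WEBHOOK_RE.finditer(condition_message)
    }


def missing_core_service_packages(log_lines):
    """Return sorted unique (package ID, missing file path) pairs."""
    found = set()
    for line in log_lines:
        m = CORE_SERVICE_RE.search(line)
        if m:
            found.add((m.group("pkg"), m.group("path")))
    return sorted(found)
```

If `missing_core_service_packages` reports the placeholder ID "sample-pkg.test.carvel.dev", the log matches the symptom described in item 7.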
vSphere with Tanzu 8.0
VMware vCenter Server 8.0.x
During a Reduced Downtime Upgrade (RDU) of vCenter, the configuration files on the VCSA of the source vCenter are copied onto the new target vCenter, overwriting the correct values in core-services.json and the Supervisor Services allow list (supervisor-services-allow-list.txt). As a result, wcpsvc repeatedly searches for "sample-pkg.test.carvel.dev", an unreleased ID that was present as a placeholder in VC releases prior to the Core Services feature being enabled.
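One way to confirm that stale configuration was carried over is to search the affected files for the placeholder ID. The sketch below is only illustrative: the helper name is hypothetical, the files are treated as plain text without assuming any schema, and the caller supplies the paths to check.

```python
from pathlib import Path

# Placeholder ID reported in the wcp error message.
PLACEHOLDER_ID = "sample-pkg.test.carvel.dev"


def files_with_placeholder(paths, needle=PLACEHOLDER_ID):
    """Return the paths of existing files whose text contains the needle."""
    hits = []
    for p in map(Path, paths):
        # Skip missing paths and directories; read text leniently.
        if p.is_file() and needle in p.read_text(errors="replace"):
            hits.append(str(p))
    return hits
```

A non-empty result for the allow list or core-services.json suggests the file still holds the older, pre-Core Services content.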
There is currently no permanent fix. In the meantime, the following workaround is available.
1. Overwrite all the files and folders inside the /etc/vmware/wcp/supervisorservices folder on the existing vCenter Server with the files and folders from the target VC environment.
2. Overwrite the file /etc/vmware/wcp/supervisor-services-allow-list.txt on the existing vCenter Server with the one from the target VC environment.
After this, restart the wcp service on the affected vCenter Server.
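The overwrite steps above can be sketched as a small helper that takes a backup before copying. This is a hedged illustration rather than a supported procedure: it assumes the known-good files have already been staged somewhere on the affected appliance, and the function name, the staging location, and the backup location are hypothetical. Restarting the wcp service is left as a separate manual step.

```python
import shutil
import time
from pathlib import Path


def overwrite_with_backup(good_src, affected_dst, backup_root):
    """Back up affected_dst (if present), then overwrite it from good_src.

    Works for a directory tree or a single file. Returns the backup path,
    or None when there was nothing to back up.
    """
    good_src, affected_dst = Path(good_src), Path(affected_dst)
    backup = None
    if affected_dst.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = Path(backup_root) / f"{affected_dst.name}.{stamp}"
        backup.parent.mkdir(parents=True, exist_ok=True)
        if affected_dst.is_dir():
            shutil.copytree(affected_dst, backup)
        else:
            shutil.copy2(affected_dst, backup)
    if good_src.is_dir():
        # Overwrites files that exist in both trees; files present only in
        # the destination are left in place (requires Python 3.8+).
        shutil.copytree(good_src, affected_dst, dirs_exist_ok=True)
    else:
        affected_dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(good_src, affected_dst)
    return backup
```

It would be called once for /etc/vmware/wcp/supervisorservices and once for /etc/vmware/wcp/supervisor-services-allow-list.txt. Keeping a timestamped backup makes it possible to roll back if the staged files turn out to be the wrong ones.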