Supervisor Cluster Stuck in "Configuring" State and SupervisorControlPlaneVM in "to be determined" State

Products

VMware vSphere Kubernetes Service
VMware vCenter Server

Issue/Introduction

A vSphere with Tanzu Supervisor Cluster remains in "Configuring" state in the vCenter UI, and one or more SupervisorControlPlaneVMs appear as "to be determined."
In this condition, the Tanzu control-plane services cannot complete initialization, which puts Supervisor and workload management operations at risk.

/var/log/vmware/wcp/wcpsvc.log on vCenter reports repeated errors because WCP fails to call the vCenter identity endpoint through the local REST proxy:

error wcp [...] Error fulfilling request ... GET http://localhost:1080/rest/vcenter/identity/vc-identity: 503 Service Unavailable

Error validating if cluster is running on cloud: GET http://localhost:1080/rest/vcenter/identity/vc-identity: 503 Service Unavailable
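To gauge how often the error occurs, the matching lines in wcpsvc.log can be counted. The following is a minimal sketch, not a VMware-provided tool: the helper name count_vc_identity_503 is hypothetical, and the search string is taken from the log excerpts above.

```shell
#!/bin/sh
# count_vc_identity_503 is a hypothetical helper name used for illustration.
# It prints how many lines of the given log record a 503 from vc-identity.
count_vc_identity_503() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # -F treats the pattern as a fixed string, -c prints the match count.
    # grep exits non-zero when the count is 0, so "|| true" keeps the
    # function's exit status clean in that case.
    grep -cF 'vcenter/identity/vc-identity: 503 Service Unavailable' "$log_file" || true
}
```

On the vCenter Server Appliance this could be called as: count_vc_identity_503 /var/log/vmware/wcp/wcpsvc.log. A non-zero count matches the symptom described in this article.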

Administrators attempting to log in to a vSphere with Tanzu Kubernetes guest cluster using kubectl vsphere login receive the error:

Error while getting list of workloads: bad gateway
Please contact your vSphere server administrator for assistance.

The login command targets a VKS guest cluster using a vCenter SSO account, but the client cannot retrieve the list of workloads from the Supervisor, so the request ends in a 502 Bad Gateway response.

wcp-authproxy logs on Supervisor:
- The WCP authproxy successfully authenticates the user, but fails when calling the vSphere Namespaces VAPI.

INFO:vclib.sso:[...] Got bearer token for [email protected].
INFO:vclib.sso:[...] Got hok token for /etc/vmware/wcp/tls/wcpusr.cert.
INFO:auth.filters:[...] User authenticated using basic token.

DEBUG:vmware.vapi.bindings.stub:opId: wcp-authproxy-... invoke:
interface_id: com.vmware.vcenter.namespaces.user.instances, operation_name: list

ERROR:vclib.wcp:[...] WCP request failed.
com.vmware.vapi.std.errors_client.InternalServerError: {... error_type : INTERNAL_SERVER_ERROR}
ERROR:wcp.resources:[...] VAPI request failed.
INFO:server:[...] "GET /wcp/workloads HTTP/1.0" 502 46 "-" "kube-plugin-vsphere ..."

/var/log/vmware/vapi/endpoint.log on vCenter:
- The vAPI layer reports no healthy upstream for the vc_identity service:

ERROR | SessionFacade | ... com.vmware.vcenter.identity.vc_identity.get
com.vmware.vapi.client.exception.TransportProtocolException:
HTTP response with status code 503: no healthy upstream

Environment

VMware vSphere Kubernetes Service
VMware vCenter Server
VCF 9.0.1

Cause

The "vmware-trustmanagement" service on the vCenter Server backs the identity and trust endpoints used by vAPI and WCP, including the /rest/vcenter/identity/vc-identity API. When it is not running, calls to vc-identity through the local proxy (localhost:1080) return HTTP 503 Service Unavailable with “no healthy upstream.”

As WCP relies on this endpoint to determine vCenter identity and environment, the failure bubbles up as internal errors to the Namespaces API. The WCP authproxy then returns a 502 Bad Gateway to "kubectl vsphere", resulting in the user-facing “bad gateway” error during guest cluster login.
These errors indicate that WCP cannot retrieve vCenter identity information through the internal REST proxy.
In this case, the "vmware-trustmanagement" service on vCenter Server is found in a stopped state.
- This service exposes identity and trust-related APIs that WCP relies on to complete its compatibility and environment checks.
- With vmware-trustmanagement down, calls to /rest/vcenter/identity/vc-identity return HTTP 503, causing WCP reconciliation to fail and leaving the Supervisor in the "Configuring" state.
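The failure chain described above can be checked from the vCenter shell by probing the same local endpoint that WCP calls. The following is a rough sketch under assumptions: the helper names are hypothetical, curl is assumed to be available on the appliance, and the status returned to an unauthenticated request when the service is healthy is not documented here, so only the 503 case is treated as meaningful.

```shell
#!/bin/sh
# interpret_vc_identity_status is a hypothetical helper that maps an HTTP
# status code from the vc-identity endpoint to a short diagnosis.
interpret_vc_identity_status() {
    case "$1" in
        503) echo "503: no healthy upstream, check vmware-trustmanagement" ;;
        000) echo "no HTTP response: local proxy on port 1080 unreachable" ;;
        *)   echo "upstream answered with HTTP $1" ;;
    esac
}

# probe_vc_identity (also hypothetical) requests the URL seen in wcpsvc.log.
probe_vc_identity() {
    # -s hides progress output, -o /dev/null discards the body,
    # -w '%{http_code}' prints only the status code (000 if no response).
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
        http://localhost:1080/rest/vcenter/identity/vc-identity)
    interpret_vc_identity_status "$code"
}
```

Running probe_vc_identity before and after starting the service gives a quick comparison; any answer other than 503 suggests that the proxy found an upstream for the request.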

Resolution

Verify vCenter Services
- On the vCenter Server Appliance, list service status:

  service-control --status --all
- Confirm that "vmware-trustmanagement" and other core services (such as vpxd, vapi-endpoint, and sts) are running.
- If "vmware-trustmanagement" is stopped, start it:

  service-control --start vmware-trustmanagement
- In environments where multiple services show as degraded, start all services:

  service-control --start --all
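The check-and-start steps above can be combined into a small script. This is a sketch only: the helper names service_state and ensure_trustmanagement are hypothetical, and the parser assumes that "service-control --status --all" prints a state header such as "Running:" or "Stopped:" followed by lines of space-separated service names. Verify that layout on the appliance before relying on it.

```shell
#!/bin/sh
# service_state (hypothetical helper) reads service-control status output on
# stdin and prints the state header under which the named service is listed.
# ASSUMPTION: headers look like "Running:" / "Stopped:" on their own line.
service_state() {
    awk -v svc="$1" '
        /^[A-Za-z]+:[ \t]*$/ { sub(/:.*/, "", $1); state = $1; next }
        { for (i = 1; i <= NF; i++) if ($i == svc) { print state; found = 1; exit } }
        END { if (!found) print "Unknown" }
    '
}

# ensure_trustmanagement (hypothetical helper) starts the service only when
# it is reported as stopped, then prints the state that was observed.
ensure_trustmanagement() {
    state=$(service-control --status --all | service_state vmware-trustmanagement)
    if [ "$state" = "Stopped" ]; then
        service-control --start vmware-trustmanagement
    fi
    echo "$state"
}
```

Calling ensure_trustmanagement on the appliance prints the state found before any start attempt. An "Unknown" result means the service name was not present in the output and the status should be reviewed manually.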

Validate WCP and Supervisor Status
- Monitor /var/log/vmware/wcp/wcpsvc.log and confirm that "503 Service Unavailable" errors against "vc-identity" no longer appear.
- In the vCenter UI, check that the Supervisor Cluster transitions from "Configuring" to "Running" and that SupervisorControlPlaneVMs show a healthy state.
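To confirm that the errors have stopped, only the most recent part of wcpsvc.log needs to be inspected, since older entries from before the restart remain in the file. A minimal sketch, assuming a hypothetical helper name recent_vc_identity_503 and an arbitrary default window of 200 lines:

```shell
#!/bin/sh
# recent_vc_identity_503 (hypothetical helper) prints how many of the last
# N lines of a log still contain the vc-identity 503 error.
# $1 = log file, $2 = number of trailing lines to inspect (default 200,
# an arbitrary value chosen for illustration).
recent_vc_identity_503() {
    log_file="$1"
    lines="${2:-200}"
    # grep -c exits non-zero on a count of 0; "|| true" masks that status.
    tail -n "$lines" "$log_file" \
        | grep -cF 'vc-identity: 503 Service Unavailable' || true
}
```

For example, recent_vc_identity_503 /var/log/vmware/wcp/wcpsvc.log 500 should print 0 once WCP has been writing new entries for a while after the service start. A count that keeps growing between runs indicates the endpoint is still unavailable.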

Additional Information

Similar 503 “no healthy upstream” or “bad gateway” errors for Supervisor or VKS logins often trace back to core vCenter services being stopped or unhealthy, especially vAPI and trust/identity services.
Always check vCenter services with "service-control --status --all" when encountering widespread 503/502 errors from WCP or "kubectl vsphere".
If 503 errors persist after services are restarted, verify that vCenter certificates and SSO configuration are healthy.
Outages of services that back internal REST endpoints can also leave Supervisors in "Configuring" or "Not Ready" states, so verify vCenter service health in those cases as well.
If the issue recurs, review recent changes (patches, restarts, certificate or SSO updates) that might impact trust or identity services.