vSphere Tanzu Supervisor Cluster showing Health and Status Unknown in Tanzu Mission Control UI (Administration - Management cluster menu)

Article ID: 399088

Products

VMware Tanzu Mission Control

Issue/Introduction

In the TMC UI, when viewing the Administration - Management cluster menu, the registered Supervisor Cluster shows a Health and Status of Unknown.

The component health for controller-manager and scheduler shows as down.

In vCenter, the Supervisor - TMC menu shows the registration as successful.

From the Supervisor Cluster context, running kubectl get pods -n <svc-tmc-cX> (where <svc-tmc-cX> is the TMC namespace) shows only the agent-updater, agent-installer, extension-updater, extension-installer and tmc-agent-installer pods, and no others.
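
As a rough way to confirm this symptom, the pod listing can be filtered so that only pods other than the five installer and updater pods remain. This is a minimal sketch: list_non_installer_pods is a hypothetical helper name, and the prefix match assumes the pod names begin with the names listed above.

```shell
# Hypothetical helper: reads `kubectl get pods` output (with its header line)
# on stdin and prints the name of every pod that is NOT one of the
# installer/updater pods named in this article.
list_non_installer_pods() {
  awk 'NR > 1 && $1 !~ /^(agent-updater|agent-installer|extension-updater|extension-installer|tmc-agent-installer)/ { print $1 }'
}

# Usage (replace <svc-tmc-cX> with the TMC namespace):
#   kubectl get pods -n <svc-tmc-cX> | list_non_installer_pods
```

If the helper prints nothing, only the installer and updater pods are present, which matches the symptom described here.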

Environment

Tanzu Mission Control

vSphere with Tanzu

Cause

The Supervisor registration with TMC, while reported as complete, did not complete successfully.
The registration URL had expired, and a new registration URL was required to reregister the Supervisor cluster with TMC.

Resolution

Note: do not alter or delete any resources associated with TMC, i.e. anything under the svc-tmc-cX namespace. For example, do not delete the svc-tmc-cX namespace from the Supervisor cluster.

Please reach out to support for help with reregistering the Supervisor Cluster.

Support will take you through the procedure below.

 - Remove the AgentInstall for TMC
 - Uninstall the AgentInstall for TMC
 - Check the AgentConfig yaml has correct certificate and pull / update AgentConfig yaml if an issue
 - Sync with Engineering to obtain a new registration URL for Supervisor reregistration in TMC and update the AgentInstall yaml
 - Reapply the AgentConfig
 - Reregister the Supervisor with TMC
 - Reapply the AgentInstall
 - Monitor TMC pods and UI until Health and Status change from Unknown to Healthy.

After this procedure is complete, the Supervisor cluster should show a Status of Healthy.




Additional Information

Remove the AgentInstall for TMC

 - delete the agentinstall tmc-agent-installer-config
 kubectl delete -n svc-tmc-c5589 agentinstall tmc-agent-installer-config

Uninstall the AgentInstall for TMC

 - uninstall the agentinstall using the following yaml
 apiVersion: installers.tmc.cloud.vmware.com/v1alpha1
 kind: AgentInstall
 metadata:
   name: tmc-agent-installer-config
   namespace: svc-tmc-c5589
 spec:
   operation: UNINSTALL


Check the AgentConfig yaml has correct certificate and pull / update AgentConfig yaml if an issue

 - pull the certificate for the agentconfig yaml and update the yaml if needed
 openssl s_client -connect $DOMAIN_NAME:443 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM
 -- where $DOMAIN_NAME is the FQDN of the TMC
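
Before updating the agentconfig yaml, it can help to look at what the pulled certificate actually contains. The sketch below uses only standard openssl x509 options; show_cert_summary is a hypothetical helper name, not part of any TMC tooling.

```shell
# Hypothetical helper: reads a PEM certificate on stdin and prints its
# subject, issuer, and expiry date (notAfter).
show_cert_summary() {
  openssl x509 -noout -subject -issuer -enddate
}

# Usage ($DOMAIN_NAME is the FQDN of the TMC, as above):
#   openssl s_client -connect "$DOMAIN_NAME:443" -showcerts </dev/null 2>/dev/null | show_cert_summary
```

openssl x509 reads the first certificate in its input, so this summarizes the same certificate that the command above prints in PEM form.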

Sync with Engineering to obtain a new registration URL for Supervisor reregistration in TMC and update the AgentInstall yaml

 - obtain a new registration url for Supervisor registration via Engineering and update the agentinstall yaml
 apiVersion: installers.tmc.cloud.vmware.com/v1alpha1
 kind: AgentInstall
 metadata:
   name: tmc-agent-installer-config
   namespace: svc-tmc-c5589
 spec:
   operation: INSTALL
   registrationLink: $REGISTRATION_URL
 -- where $REGISTRATION_URL is the new registration url
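
One way to avoid hand-editing the yaml is to generate it from the namespace and the new URL. This is only a sketch: render_agentinstall is a hypothetical helper, the fields are copied from the yaml above, and quoting the registrationLink value is an added precaution for URLs that contain special characters.

```shell
# Hypothetical helper: prints the INSTALL AgentInstall yaml shown above for a
# given namespace ($1) and registration URL ($2).
render_agentinstall() {
  ns=$1
  url=$2
  cat <<EOF
apiVersion: installers.tmc.cloud.vmware.com/v1alpha1
kind: AgentInstall
metadata:
  name: tmc-agent-installer-config
  namespace: $ns
spec:
  operation: INSTALL
  registrationLink: "$url"
EOF
}

# Usage:
#   render_agentinstall svc-tmc-c5589 "$REGISTRATION_URL" > agentinstall.yaml
```

Review the generated file before applying it later in the procedure.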

Reapply the AgentConfig

 - reapply the agentconfig with kubectl apply -f agentconfig.yaml

Reregister the Supervisor with TMC

 - tanzu tmc mc reregister <supervisor-name> --kubeconfig <KUBECONFIG>
 -- where <supervisor-name> is replaced with the Supervisor's name and <KUBECONFIG> is replaced with the kubeconfig's location.

Reapply the AgentInstall

 - reapply the agentinstall with kubectl apply -f agentinstall.yaml

Monitor TMC pods and UI until Health and Status change from Unknown to Healthy.

 - Finally, monitor the pods in the svc-tmc-c5589 namespace and the TMC UI (Administration - Management cluster, then click into the Supervisor cluster) until the Supervisor status changes from Unknown to Healthy. In this case, that change fixed the status of the Supervisor cluster in the TMC UI.
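
The pod monitoring can be scripted as a simple polling loop. The sketch below is an assumption about how one might do this, not a documented procedure: all_pods_settled and wait_for_pods are hypothetical helpers, and treating Running or Completed as the settled states is a simplification. The TMC UI should still be checked for the Healthy status.

```shell
# Hypothetical helper: reads `kubectl get pods` output (with header) on stdin.
# Succeeds only if at least one pod is listed and every pod's STATUS column
# is Running or Completed.
all_pods_settled() {
  awk 'NR > 1 { n++; if ($3 != "Running" && $3 != "Completed") bad = 1 }
       END { exit (n > 0 && bad == 0) ? 0 : 1 }'
}

# Hypothetical helper: polls the namespace ($1) up to $2 times (default 20),
# sleeping $3 seconds (default 30) between attempts.
wait_for_pods() {
  ns=$1
  tries=${2:-20}
  delay=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if kubectl get pods -n "$ns" | all_pods_settled; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_pods svc-tmc-c5589 && echo "TMC pods settled; check the TMC UI"
```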