After successfully authenticating to a TKGS cluster, kubectl commands fail with the error, "You must be logged in to the server (Unauthorized)"

Article ID: 372806

Updated On:

Products

vSphere with Tanzu
VMware vSphere Kubernetes Service

Issue/Introduction

  • Even after successfully authenticating to a TKGS cluster with 'kubectl vsphere login', kubectl commands fail with an Unauthorized error:
    $ kubectl get pod -A
    error: You must be logged in to the server (Unauthorized)
  • The guest-cluster-auth-svc pod's log on the TKC reports the error message below:
    E0607 15:43:21.807301       1 token_review_endpoint.go:94] Invalid token: failed to validate JWT

Environment

vSphere with Tanzu

Cause

The Unauthorized error occurs due to a known issue where a guest cluster and its Supervisor cluster have not synced changes to the vCenter Server public keys.

This issue is known to occur after vCenter certificates are renewed or whenever the vCenter public keys change.

Resolution

Fix:
The fix is included in TKr 1.31.1 and higher versions. 

Workaround:

The vCenter Server public keys are stored in the configmap “vc-public-keys” on the Supervisor cluster. These keys are synced to the configmap “guest-cluster-auth-svc-public-keys” on the TKGS cluster (TKC).
The guest-cluster-auth-svc pods must be restarted to pick up any change to the “guest-cluster-auth-svc-public-keys” configmap (i.e. the vCenter public keys, the TLS server certificate, or the TLS private key).
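
To confirm the current state of the keys, you can dump both configmaps and compare their contents. This is a minimal sketch; the <namespace> placeholders are assumptions to be filled in from the grep output, and the exact namespaces can vary by release.

    # On the Supervisor cluster: locate and inspect the vCenter public keys
    $ kubectl get configmap -A | grep vc-public-keys
    $ kubectl get configmap vc-public-keys -n <namespace> -o yaml

    # On the TKC (for example, via the admin kubeconfig on a control plane node): inspect the synced copy
    $ kubectl get configmap -A | grep guest-cluster-auth-svc-public-keys
    $ kubectl get configmap guest-cluster-auth-svc-public-keys -n <namespace> -o yaml

If the key material in both configmaps matches, the sync has already completed and only the pod restart below is required.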

Therefore, restart all of the guest-cluster-auth-svc pods on the TKGS cluster.
You can also delete the existing guest-cluster-auth-svc pods by following the steps below; the deleted pods will be re-created with the new keys.

  • SSH to the control plane node of the guest cluster (see KB: Accessing vSphere with Tanzu workload clusters using SSH) and export the admin kubeconfig by running this command: export KUBECONFIG=/etc/kubernetes/admin.conf
  • Run this command to list the auth pods: kubectl get pods -A | grep cluster-auth -w
  • Delete each pod with this command, substituting the actual pod name: kubectl delete pod -n vmware-system-auth guest-cluster-auth-svc-xxxx
  • Wait a few moments for the pods to be re-created, then verify the list of auth pods: kubectl get pods -A | grep cluster-auth -w
  • You should now be able to run kubectl commands against the cluster. A consolidated example session is sketched below.
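
For reference, a consolidated example of the workaround session is sketched below. The pod name guest-cluster-auth-svc-xxxx is a placeholder; substitute the names returned in your environment, and repeat the delete for every guest-cluster-auth-svc pod listed.

    # On the guest cluster control plane node, use the admin kubeconfig
    $ export KUBECONFIG=/etc/kubernetes/admin.conf

    # List the auth pods (there may be more than one)
    $ kubectl get pods -A | grep cluster-auth -w

    # Delete each pod so it is re-created with the new keys
    $ kubectl delete pod -n vmware-system-auth guest-cluster-auth-svc-xxxx

    # Verify the replacement pods are Running
    $ kubectl get pods -A | grep cluster-auth -w

Once the pods are Running again, retry the kubectl commands from the session authenticated with 'kubectl vsphere login'.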

Additional Information

The following condition is also known to show the same symptom:

  • The guest cluster's auth certificate (guest-cluster-auth-svc-key) is renewed by cert-manager. This happens, for example, if a user manually deletes the issuer resource '<cluster-name>-extensions-ca-issuer' or the secret resource '<cluster-name>-auth-svc-cert' in the Supervisor, in which case tkg-controller will re-create them (see the check sketched below).
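
As a hedged sketch of the check referenced above, run the following from the Supervisor cluster to confirm whether these resources are present in the vSphere Namespace that owns the TKC. The <cluster-name> and <namespace> placeholders must be replaced with your values.

    # On the Supervisor cluster
    $ kubectl get issuer <cluster-name>-extensions-ca-issuer -n <namespace>
    $ kubectl get secret <cluster-name>-auth-svc-cert -n <namespace>

If these resources were recently deleted and re-created, the guest-cluster-auth-svc pods on the TKC need the same restart described in the Workaround above.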

How to access the guest-cluster-auth-svc pods on the TKC:

  • SSH to the TKC's control plane node, or
  • Retrieve the TKC's kubeconfig from the secret "<cluster-name>-kubeconfig" on the Supervisor Cluster, as sketched below.
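
For the second option, a minimal sketch is shown below. It assumes the kubeconfig is stored under the 'value' key of the secret, which is the usual layout for TKGS clusters; adjust the key and the placeholders if your environment differs.

    # On the Supervisor cluster, in the vSphere Namespace that owns the TKC
    $ kubectl get secret <cluster-name>-kubeconfig -n <namespace> -o jsonpath='{.data.value}' | base64 -d > tkc-admin.kubeconfig

    # Use the extracted kubeconfig to reach the TKC and check the auth pods and their logs
    $ kubectl --kubeconfig tkc-admin.kubeconfig get pods -n vmware-system-auth
    $ kubectl --kubeconfig tkc-admin.kubeconfig logs -n vmware-system-auth guest-cluster-auth-svc-xxxx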