This article explains how the TKG plugin works and how to renew its TLS certificate when the Download Plug-in task fails with a 502 error.
Symptoms:
- Download Plug-in tasks keep failing in vCenter with a 502 error.
- Viewing any of the options in vCenter UI > Cluster > Configure > TKG Service returns the error "502 Bad Gateway".
The TKGS plugin provides visibility and configuration in the vCenter UI under Hosts and Clusters > Cluster > Configure > TKG Service. It is used to set the default Tanzu Kubernetes cluster CNI plugin and to register clusters in Tanzu Mission Control. The plugin communicates with the tkgs-plugin-server pod in the backend through a masterproxy-tkgs-plugin pod, which acts as a reverse proxy so that calls from the TKG interface in the vSphere Client are routed to the tkgs-plugin-server.
When the TLS certificate in tkgs-plugin-tls-secret expires, this communication fails with status code 502 Bad Gateway.
The logs of one or more masterproxy-tkgs-plugin pods should report the following error:
kubectl get pods -A | grep "tkgs-plugin"
kubectl logs -n <tkgs-plugin-namespace> <masterproxy-tkgs-plugin-pod-name>
YYYY-MM-DDTHH:MM:SS.ssssssZ stderr F YYYY/MM/DD HH:MM:SS [error] 8#0: *5167 upstream SSL certificate verify error: (10: certificate has expired) while SSL handshaking to upstream, client: 127.0.0.1, server: localhost, request: "GET /plugin.json HTTP/1.0", upstream: "https://<tkgs-plugin-service-IP-address>:<tkgs-plugin-port>/plugin.json"
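The log check above can be repeated across every masterproxy-tkgs-plugin pod with a small helper. The following is only a sketch: the function names find_expired_cert_errors and scan_masterproxy_pods are hypothetical, and it assumes kubectl is already authenticated against the Supervisor cluster and that the error text matches the sample log line above.

```shell
# Reads log text on stdin, prints lines that contain the expired-certificate
# message, and returns 0 only if at least one such line is found.
find_expired_cert_errors() {
  grep -F "certificate has expired"
}

# Hypothetical wrapper: lists every masterproxy-tkgs-plugin pod in all
# namespaces and reports the ones whose logs contain the error.
scan_masterproxy_pods() {
  kubectl get pods -A --no-headers |
    awk '/masterproxy-tkgs-plugin/ { print $1, $2 }' |
    while read -r ns pod; do
      if kubectl logs -n "$ns" "$pod" </dev/null | find_expired_cert_errors >/dev/null; then
        echo "expired-certificate error found in $ns/$pod"
      fi
    done
}
```

Running scan_masterproxy_pods prints one line per affected pod. No output suggests the 502 error may have a different cause.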
Verify that the tkgs-plugin-tls-secret shows an expired TLS certificate:
If you do not have permissions to view secrets in the Supervisor cluster, connect directly to a Supervisor control plane VM as described in the following KB:
Troubleshooting vSphere with Tanzu (TKGS) Supervisor Control Plane VMs
kubectl get secret -A | grep "tkgs-plugin-tls-secret"
kubectl get secret -n <tkgs-plugin-namespace> tkgs-plugin-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep After
Not After: MON DD HH:MM:SS YYYY GMT
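Instead of comparing the Not After value against the current date by eye, the comparison can be scripted. This is a minimal sketch, assuming GNU date (for the -d option) is available where the command runs. The function names extract_not_after and cert_is_expired are illustrative and not part of any VMware tooling.

```shell
# Reads "openssl x509 -text" output on stdin and prints only the date part
# of the first "Not After" line, e.g. "Jan  1 00:00:00 2020 GMT".
extract_not_after() {
  sed -n 's/^[[:space:]]*Not After[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Returns 0 if the given Not After date is in the past, 1 if it is still in
# the future, and 2 if the date could not be parsed (GNU date assumed).
cert_is_expired() {
  _cert_end=$(date -u -d "$1" +%s 2>/dev/null) || return 2
  _cert_now=$(date -u +%s)
  [ "$_cert_end" -le "$_cert_now" ]
}
```

For example, the output of the openssl command above (without the trailing grep) can be piped into extract_not_after and the result passed to cert_is_expired. An exit status of 0 indicates that the certificate has expired.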
To renew the certificate, run the following command from a Supervisor control plane VM:
kubectl replace -f /usr/lib/vmware-wcp/objects/PodVM-GuestCluster/13-tkgs-plugin/tkgs-plugin-deployment.yaml --force
Note: If tkgs-plugin-deployment.yaml is not found in the above path, run find / -type f -name "*.yaml" 2>/dev/null | grep tkgs-plugin to find the absolute path of the file, then perform the renew operation with that path. In some cases, you need to run this find command on every Supervisor control plane VM (SPVM) to locate the file.
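The lookup described in the note can be wrapped in a helper so that the renew command always receives a path that exists. This is a sketch only. The function name find_tkgs_plugin_manifest is hypothetical, and it searches for the exact file name rather than filtering all *.yaml files through grep as the note does.

```shell
# Prints the first tkgs-plugin-deployment.yaml found under the given
# directory (defaults to /). Prints nothing if the file is not present.
find_tkgs_plugin_manifest() {
  find "${1:-/}" -type f -name "tkgs-plugin-deployment.yaml" 2>/dev/null | head -n 1
}

# Example usage on a Supervisor control plane VM (not executed here):
#   manifest=$(find_tkgs_plugin_manifest /usr/lib/vmware-wcp)
#   [ -n "$manifest" ] && kubectl replace -f "$manifest" --force
```

If the function prints nothing on one Supervisor control plane VM, repeat it on the others, as the note suggests.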
After the renewal, confirm that the certificate shows a new expiration date:
kubectl get secret -n <tkgs-plugin-namespace> tkgs-plugin-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep After
Not After: MON DD HH:MM:SS YYYY GMT
Confirm that the tkgs-plugin pods are running:
kubectl get pods -A | grep "tkgs-plugin"
masterproxy-tkgs-plugin-<A> 1/1 Running
masterproxy-tkgs-plugin-<B> 1/1 Running
masterproxy-tkgs-plugin-<C> 1/1 Running
tkgs-plugin-server<A-A> 1/1 Running
tkgs-plugin-server<B-B> 1/1 Running
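The pod status check above can also be automated. The sketch below assumes the usual kubectl get pods column layout, in which a READY value such as 1/1 is immediately followed by STATUS. The function name tkgs_plugin_pods_healthy is illustrative.

```shell
# Reads "kubectl get pods" output on stdin. Returns 0 only if at least one
# tkgs-plugin pod is listed and every such pod is fully ready and Running.
tkgs_plugin_pods_healthy() {
  awk '
    /tkgs-plugin/ {
      seen = 1
      for (i = 1; i < NF; i++) {
        if ($i ~ /^[0-9]+\/[0-9]+$/) {        # READY column, e.g. 1/1
          split($i, ready, "/")
          if (ready[1] != ready[2] || $(i + 1) != "Running") bad = 1
          next
        }
      }
      bad = 1                                 # no READY column on this line
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Example usage (not executed here):
#   kubectl get pods -A | tkgs_plugin_pods_healthy && echo "tkgs-plugin pods healthy"
```

A non-zero exit status means that at least one pod is not ready, or that no tkgs-plugin pods were listed at all.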
Confirm that the tkgs-plugin-server deployment is fully available:
kubectl get deploy -A | grep "tkgs-plugin-server"
NAME READY UP-TO-DATE AVAILABLE
tkgs-plugin-server 2/2 2 2