vCenter task "Download plug-in" keeps failing for 'VMware TKG plugin' with status code 502
Article ID: 323416
Products
VMware vSphere ESXi
VMware vSphere with Tanzu
Issue/Introduction
This article explains how the TKG plugin works and how to renew its TLS certificate.
Symptoms:
- "Download plug-in" tasks keep failing in vCenter with this error:
Error downloading plug-in. URL is unreachable. org.apache.http.client.HttpResponseException: status code: 502, reason phrase: Bad Gateway org.apache.http.impl.client.AbstractResponseHandler.handleResponse(AbstractResponseHandler.java:70)
- Opening any of the options under vCenter UI > Cluster > Configure > TKG Service fails with the error "Bad Gateway 502".
Environment
VMware vSphere 7.0 with Tanzu
Cause
The TKG plugin exposes some settings in the vCenter UI under Hosts and Clusters > Cluster > Configure > TKG Service, such as setting the default Tanzu Kubernetes cluster CNI plugin and registering clusters in Tanzu Mission Control. It communicates with the tkgs-plugin-server pod in the backend through a masterproxy-tkgs-plugin pod, which acts as a reverse proxy so that calls from the TKG interface in the vSphere Client are routed to tkgs-plugin-server.
When the TLS certificate in tkgs-plugin-tls-secret expires, this communication fails with status code 502 Bad Gateway. The masterproxy-tkgs-plugin logs should report this error:
2023-03-01T11:45:30.846010245Z stderr F 2023/02/08 11:45:30 [error] 8#0: *5167 upstream SSL certificate verify error: (10: certificate has expired) while SSL handshaking to upstream, client: 127.0.0.1, server: localhost, request: "GET /plugin.json HTTP/1.0", upstream: "https://<ip-address>:8099/plugin.json", host: "127.0.0.1:9900"
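As a rough illustration of spotting this error, the sketch below filters log lines for the expired-certificate message. The function name find_expired_upstream_errors is hypothetical, and how the masterproxy-tkgs-plugin logs are obtained (pod name, namespace, or log file) is an assumption left as a placeholder in the usage comment.

```shell
# Hypothetical helper: reads proxy log lines on stdin and prints only those
# that report an expired upstream certificate.
# Exit status is 0 if at least one line matched, 1 otherwise.
find_expired_upstream_errors() {
  grep -F 'upstream SSL certificate verify error: (10: certificate has expired)'
}

# Assumed usage (pod name and namespace are placeholders, not taken from this article):
#   kubectl logs -n <namespace> <masterproxy-tkgs-plugin-pod> | find_expired_upstream_errors
```

If the filter prints nothing, the 502 may have a different cause than an expired certificate.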
The tkgs-plugin-tls-secret should contain an expired TLS certificate. To confirm:
1- Access the Supervisor Cluster.
2- Check the expiry date of the TLS certificate for the TKG plugin:
# kubectl get secret -n vmware-system-tkg tkgs-plugin-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep After
Not After : Dec 20 11:12:12 2022 GMT <---- expired
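The expiry check can also be scripted rather than read by eye. This is a minimal sketch, assuming GNU date is available for parsing. The function name cert_end_date_passed and its optional second argument are hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if the certificate end date is in the past.
# $1: end date as printed by openssl, with or without the "notAfter=" prefix
# $2: optional reference time as epoch seconds (defaults to the current time)
# Assumes GNU date, whose -d option parses dates such as "Dec 20 11:12:12 2022 GMT".
cert_end_date_passed() {
  local end_date="${1#notAfter=}"
  local now_epoch="${2:-$(date -u +%s)}"
  local end_epoch
  end_epoch=$(date -u -d "$end_date" +%s) || return 2
  [ "$end_epoch" -le "$now_epoch" ]
}

# Assumed usage, reusing the secret lookup from the step above:
#   end=$(kubectl get secret -n vmware-system-tkg tkgs-plugin-tls-secret \
#           -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -enddate)
#   cert_end_date_passed "$end" && echo "certificate expired: $end"
```

The usage comment relies on openssl x509 -noout -enddate, which prints a single notAfter= line instead of the full certificate text.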
Resolution
The certificate is generated as part of the TKG plugin deployment, so force-replacing that deployment should regenerate the certificate:
1- Access the Supervisor Cluster as instructed above.
2- Go to the directory in which the TKG plugin deployment definition is stored:
# cd /usr/lib/vmware-wcp/objects/PodVM-GuestCluster/13-tkgs-plugin
3- Force replace the TKG plugin deployment using the definition file:
# kubectl replace -f tkgs-plugin-deployment.yaml --force
4- Ensure the certificate is now rotated:
# kubectl get secret -n vmware-system-tkg tkgs-plugin-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep After
Not After : Mar 2 08:46:17 2024 GMT
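To put a number on the rotation check, the sketch below converts a "Not After" date into whole days of remaining validity. It assumes GNU date. The function name days_until_expiry and its optional reference-time argument are hypothetical.

```shell
# Hypothetical helper: prints the whole number of days between a reference
# time and the certificate end date (0 or negative means expired or expiring
# within a day).
# $1: end date as printed by openssl, with or without the "notAfter=" prefix
# $2: optional reference time in any form GNU date -d accepts (defaults to "now")
days_until_expiry() {
  local end_date="${1#notAfter=}"
  local now_epoch end_epoch
  now_epoch=$(date -u -d "${2:-now}" +%s) || return 2
  end_epoch=$(date -u -d "$end_date" +%s) || return 2
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Example with the rotated certificate shown above and a fixed reference date:
#   days_until_expiry "Mar 2 08:46:17 2024 GMT" "Mar 2 08:46:17 2023 GMT"
```

A scheduled check could compare this value against a chosen threshold to flag the certificate before it expires again.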
Additional Information
Impact/Risks: While the certificate is expired, the TKG service configurations exposed to vCenter through the plugin cannot be set.