Supervisor upgrade is stuck with Component Configuration error
search cancel

Supervisor upgrade is stuck with Component Configuration error

book

Article ID: 401674

calendar_today

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

In the vSphere GUI there is a Supervisor Cluster upgrade in progress that is stuck with a "Component Configuration error:"

failed to verify certificate: x509: certificate signed by unknown authority Component upgrade failed.

 

While connected to the Supervisor cluster context, the following symptoms are observed:

  • New Control Plane nodes and Worker nodes have been deployed at the upgraded version
kubectl get nodes -A
  • All pods are up and running on the Supervisor cluster
kubectl get pods -A | grep -v Run
  • The certificate called out in the error message is valid
kubectl get certificates -A | grep <certificate called out in error message>
kubectl describe certificate -n <namespace> <certificate name>
  • kube-apiserver logs show  similar bad certificate error messages:
kubectl get pods -A | grep kube-apiserver
kubectl logs -n kube-system <kube-apiserver pod name>
kubectl logs -n kube-system <kube-apiserver pod name>
           "x509: certificate signed by unknown authority"

Environment

vSphere with Tanzu 7.0

vSphere with Tanzu 8.0

Cause

The cert-manager pod in the Supervisor Cluster ensures secure and automated TLS certificate lifecycle management for various internal services, webhooks, and APIs. This enables zero-touch secure communication within the cluster and its extensions.

If cert-manager certificate validation stops working, a restart might be needed after confirming the certificates are valid.

 

 

Resolution

Perform a rolling restart of the cert-manager deployments

  • List the cert-manager deployments
kubectl get deployment -A | grep cert
             vmware-system-cert-manager                  cert-manager
             vmware-system-cert-manager                  cert-manager-cainjector
             vmware-system-cert-manager                  cert-manager-webhook
  • Perform a rolling reboot of the cert-manager deployments
kubectl rollout restart deployment -n vmware-system-cert-manager cert-manager
kubectl rollout restart deployment -n vmware-system-cert-manager cert-manager-cainjector
kubectl rollout restart deployment -n vmware-system-cert-manager cert-manager-webhook