Update the Certificate Trust Chain for VCFA Serving Certificates

Article ID: 420053

Products

VMware vSphere Kubernetes Service
VCF Automation

Issue/Introduction

If the issuer of the VCFA serving certificate changes, VKS clusters that are in the process of being brought under VKSM management may become stuck without VKSM agents being deployed on them. The symptom is that the pod running the tmc-bootstrapper job in the vmware-system-vksm namespace is in an Error state:

$ kubectl -n vmware-system-vksm get po --kubeconfig ./kubernetes-cluster-qo7v-kubeconfig.yaml
NAME                     READY   STATUS    RESTARTS   AGE
tmc-bootstrapper-mzwh5   0/1     Error     0          8m3s

The logs from the pod contain errors similar to the following:

devops@cli-vm:~$ kubectl -n vmware-system-vksm logs  tmc-bootstrapper-mzwh5
{"error":"x509: certificate signed by unknown authority","level":"error","msg":"verify TLS connection","sub-component":"tmc-context","time":""}
{"error":"x509: certificate signed by unknown authority","level":"error","msg":"verify TLS connection","sub-component":"tmc-context","time":""}
failed to build bootstrapper context failed to initialize gRPC connection: could not connect to VKSM endpoint: context deadline exceeded

 

Environment

VMware Cloud Foundation Automation (VCFA)
vSphere Kubernetes Service

Cause

This issue occurs when the root Certificate Authority (CA) certificate used to sign the VCFA serving certificate is missing from the platform's trusted certificate bundle. It is more likely with private or self-signed certificate authorities, and less likely with publicly trusted, well-known certificate authorities such as Let's Encrypt or DigiCert.

The VCFA platform constructs its trust bundle (stored in the platform-trust ConfigMap) dynamically using cert-manager. This bundle is composed of:

  1. Default public CAs.
  2. Custom CAs derived from Kubernetes Secrets in the vmsp-platform namespace that bear a specific label.

If the Root CA for your custom serving certificate is not present as a labeled Secret in the vmsp-platform namespace, it will not be included in the platform-trust bundle. Consequently, services like tmc-bootstrapper will reject the serving certificate as untrusted.
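
To see which custom CAs currently feed the bundle, one option is to list the Secrets that carry the label. This is a minimal sketch, assuming the label named in the Resolution section (trust.vmsp.vmware.com/bundle=platform-trust) and kubectl access to the VMSP cluster; list_trust_sources is a hypothetical helper name, not part of any VMware tooling.

```shell
# Hypothetical helper: list the Secrets in vmsp-platform that are labeled
# for inclusion in the platform-trust bundle.
list_trust_sources() {
  kubectl get secrets \
    --namespace=vmsp-platform \
    --selector=trust.vmsp.vmware.com/bundle=platform-trust \
    --show-labels
}
```

If the Secret holding your root CA is not in the output of list_trust_sources, the CA is not a source for the bundle.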

Resolution

To resolve this issue, manually create a Kubernetes Secret containing the CA certificate in the vmsp-platform namespace of the VMSP Kubernetes cluster. The Secret must also carry the label trust.vmsp.vmware.com/bundle=platform-trust.

Procedure

Step 1: Prepare the Root CA Certificate

Ensure you have the Root CA certificate file available on your jump host or machine where kubectl is installed.

  • If you have the certificate content, save it to a file named custom-root-ca.crt.
  • Ensure the file contains the -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- blocks.
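
Before loading the file into the cluster, it can help to confirm that it really is a parseable, self-issued (root) certificate. The following is a sketch under stated assumptions: openssl is installed on the jump host, the file holds a single certificate (openssl x509 reads only the first one), and check_root_ca is a hypothetical helper name.

```shell
# Hypothetical helper: sanity-check a candidate root CA file.
# Returns 0 and prints the subject and expiry date if the file looks like a root CA.
check_root_ca() {
  ca_file=$1

  # The file must contain a PEM certificate block.
  grep -q -- '-----BEGIN CERTIFICATE-----' "$ca_file" 2>/dev/null || {
    echo "no PEM certificate block found in $ca_file" >&2
    return 1
  }

  # openssl must be able to parse it; the hashes identify the subject and issuer names.
  subject_hash=$(openssl x509 -noout -subject_hash -in "$ca_file" 2>/dev/null) || {
    echo "openssl could not parse $ca_file" >&2
    return 1
  }
  issuer_hash=$(openssl x509 -noout -issuer_hash -in "$ca_file" 2>/dev/null) || return 1

  # A root CA is self-issued: its subject and issuer names are identical.
  if [ "$subject_hash" != "$issuer_hash" ]; then
    echo "$ca_file is not self-issued; it may be an intermediate or leaf certificate" >&2
    return 1
  fi

  openssl x509 -noout -subject -enddate -in "$ca_file"
}
```

For example, check_root_ca ./custom-root-ca.crt should print the subject and notAfter date of the CA. A failure here usually means the wrong certificate from the chain was saved.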

Step 2: Create the Kubernetes Secret

You must create a generic secret in the vmsp-platform namespace containing your CA certificate.

Run the following command:

kubectl create secret generic custom-self-signed-ca-cert \
  --from-file=ca.crt=./custom-root-ca.crt \
  --namespace=vmsp-platform

Note: You can name the secret (custom-self-signed-ca-cert) whatever you prefer, but keep the key inside the secret as ca.crt. The --from-file=ca.crt=<path> form used in the command sets that key name explicitly, regardless of the name of the source file.

Step 3: Make it part of the platform-trust bundle

The cert-manager process watches for secrets with a specific label to include them in the trust bundle. Apply the label trust.vmsp.vmware.com/bundle=platform-trust to your newly created secret.

kubectl label secret custom-self-signed-ca-cert \
  trust.vmsp.vmware.com/bundle=platform-trust \
  --namespace=vmsp-platform
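
If you prefer a declarative approach, Steps 2 and 3 can be combined into a single manifest. The sketch below prints a Secret manifest with the ca.crt key and the trust label already set; render_ca_secret is a hypothetical helper name, and the manifest assumes the same namespace, key, and label used in the commands in Steps 2 and 3.

```shell
# Hypothetical helper: print a Secret manifest that carries the CA certificate
# under the ca.crt key and is already labeled for the platform-trust bundle.
render_ca_secret() {
  secret_name=$1
  ca_file=$2
  # Encode the PEM file as a single base64 line (works with GNU and BSD base64).
  ca_b64=$(base64 < "$ca_file" | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${secret_name}
  namespace: vmsp-platform
  labels:
    trust.vmsp.vmware.com/bundle: platform-trust
type: Opaque
data:
  ca.crt: ${ca_b64}
EOF
}
```

A possible invocation is render_ca_secret custom-self-signed-ca-cert ./custom-root-ca.crt | kubectl apply -f -, which creates or updates the Secret in one step.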

Step 4: Verify the Trust Bundle Update

The reconciliation process is automatic but may take a few moments. Verify that your CA has been added to the platform-trust ConfigMap.

kubectl get configmap platform-trust -n prelude -o jsonpath='{.data.bundle\.pem}' > current-bundle.pem

  1. Search for your CA:
    Search the current-bundle.pem file for your specific CA Subject or Issuer to confirm it is present.
  2. The following commands are handy for this verification:

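# Create a separate file for each certificate in the 'current-bundle.pem' file.
# Each file is closed before the next one is opened, so large bundles do not exhaust open file handles.

awk '/-----BEGIN CERTIFICATE-----/ {if (out) close(out); c++; out = "cert_" c ".pem"}
     out {print > out}
     END {printf "%d certificates extracted\n", c}' current-bundle.pem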

# Print the subject of each extracted certificate. Replace 'N' with the number of certificates reported by the command above.

for i in $(seq 1 N); do SUBJECT=$(openssl x509 -noout -subject -in "cert_$i.pem"); echo "cert_$i.pem: $SUBJECT"; done

  3. Wait until the bundle has been updated with the new CA certificate.
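
As an alternative to inspecting subjects by eye, the check can be scripted by comparing certificate bodies directly. This is a sketch, not a definitive implementation: it assumes both files are PEM encoded and relies on the fact that the same certificate always yields the same base64 body once line breaks are removed. The names pem_body and bundle_contains_ca are hypothetical.

```shell
# Hypothetical helper: print the base64 body of the first certificate in a PEM file as one line.
pem_body() {
  awk '/-----BEGIN CERTIFICATE-----/ {on = 1; next}
       /-----END CERTIFICATE-----/   {exit}
       on {gsub(/[ \t\r]/, ""); printf "%s", $0}' "$1"
}

# Hypothetical helper: succeed (exit 0) if the certificate in $2 appears in the bundle $1.
bundle_contains_ca() {
  bundle=$1
  ca=$2
  want=$(pem_body "$ca")
  [ -n "$want" ] || return 2   # no certificate found in the CA file

  # Flatten each certificate in the bundle to a single base64 line,
  # then look for an exact, whole-line match of the wanted body.
  awk '/-----BEGIN CERTIFICATE-----/ {on = 1; body = ""; next}
       /-----END CERTIFICATE-----/   {on = 0; print body; next}
       on {gsub(/[ \t\r]/, ""); body = body $0}' "$bundle" | grep -qxF -- "$want"
}

# Example use after exporting the ConfigMap to current-bundle.pem:
#   bundle_contains_ca current-bundle.pem custom-root-ca.crt && echo "CA is in the bundle"
```

Comparing the encoded bodies avoids false matches between two CAs that happen to share a subject name.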

Step 5: Confirm that the tmc-bootstrapper Job Succeeds

In the attached VKS cluster, confirm that the tmc-bootstrapper job has run successfully and is deploying the necessary VKSM agents. If the tmc-bootstrapper pod does not recover on its own, delete the tmc-bootstrapper job in the vmware-system-vksm namespace:

kubectl -n vmware-system-vksm delete job tmc-bootstrapper

The job should then be recreated automatically. The pod created for the new job should start successfully and make progress.

Validation

After performing the resolution steps, verify the fix:

  1. Monitor the recreated job or pod logs.
  2. Confirm that the logs no longer show x509 or self-signed certificate errors.
  3. Ensure the tmc-bootstrapper pod shows a status of Running or Completed rather than Error.
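
The log check in this list can be scripted. The sketch below counts log lines that contain the x509 error string shown in the Issue/Introduction section; count_x509_errors is a hypothetical helper name, and it reads the log text from standard input so it can be fed from kubectl logs or from a saved file.

```shell
# Hypothetical helper: count the log lines on stdin that report the x509 trust failure.
count_x509_errors() {
  # grep -c prints the number of matching lines; "|| true" keeps a count of
  # zero (which grep reports with exit status 1) from being treated as a failure.
  grep -c 'x509: certificate signed by unknown authority' || true
}

# Example use against the recreated job's pod:
#   kubectl -n vmware-system-vksm logs job/tmc-bootstrapper | count_x509_errors
```

A result of 0 indicates that the bootstrapper no longer rejects the VCFA serving certificate.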