If the certificate issuer of the serving certificate for VCFA is changed, VKS clusters that are in the process of being brought under VKSM management may become stuck without VKSM agents being deployed on them. The observed symptom is that the pod running the tmc-bootstrapper job in the vmware-system-vksm namespace is in an Error state:
$ kubectl -n vmware-system-vksm get po --kubeconfig ./kubernetes-cluster-qo7v-kubeconfig.yaml
NAME                     READY   STATUS   RESTARTS   AGE
tmc-bootstrapper-mzwh5   0/1     Error    0          8m3s
The logs from the pod contain errors similar to the following:
devops@cli-vm:~$ kubectl -n vmware-system-vksm logs tmc-bootstrapper-mzwh5
{"error":"x509: certificate signed by unknown authority","level":"error","msg":"verify TLS connection","sub-component":"tmc-context","time":""}
{"error":"x509: certificate signed by unknown authority","level":"error","msg":"verify TLS connection","sub-component":"tmc-context","time":""}
failed to build bootstrapper context failed to initialize gRPC connection: could not connect to VKSM endpoint: context deadline exceeded
VMware Cloud Foundation Automation (VCFA)
vSphere Kubernetes Service
This issue occurs when the Certificate Authority (CA) root certificate used to sign the VCFA serving certificate is missing from the platform's trusted certificate bundle. It is more likely with self-signed certificate authorities and less likely with publicly trusted, well-known certificate authorities such as Let's Encrypt or DigiCert.
The VCFA platform constructs its trust bundle (stored in the platform-trust ConfigMap) dynamically using cert-manager. This bundle is composed of:
If the Root CA for your custom serving certificate is not present as a labeled Secret in the vmsp-platform namespace, it will not be included in the platform-trust bundle. Consequently, services like tmc-bootstrapper will reject the serving certificate as untrusted.
To resolve this issue, manually create a Kubernetes secret containing the CA certificate in the vmsp-platform namespace of the VMSP Kubernetes cluster. The secret must also carry the label trust.vmsp.vmware.com/bundle=platform-trust.
Ensure you have the Root CA certificate file available on your jump host or machine where kubectl is installed.
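Before creating the secret, it can help to sanity-check the file. The following is a minimal sketch, not part of the official procedure: check_ca_file is a hypothetical helper name, and it only confirms that the file exists, is non-empty, and contains exactly one PEM certificate block.

```shell
# Hypothetical helper (not a VCFA tool): basic sanity check of a CA file.
# Succeeds only if the file is non-empty and holds exactly one PEM certificate.
check_ca_file() {
  ca_file="$1"
  if [ ! -s "$ca_file" ]; then
    echo "missing or empty: $ca_file" >&2
    return 1
  fi
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  count=$(grep -c -- '-----BEGIN CERTIFICATE-----' "$ca_file" || true)
  if [ "$count" -ne 1 ]; then
    echo "expected 1 certificate, found $count" >&2
    return 1
  fi
  echo "ok: $ca_file"
}

# Example usage (file name taken from the command in this article):
#   check_ca_file ./custom-root-ca.crt &&
#     openssl x509 -noout -subject -issuer -in ./custom-root-ca.crt
```

For a self-signed root CA, the subject and issuer printed by openssl are expected to be identical.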
You must create a generic secret in the vmsp-platform namespace containing your CA certificate.
Run the following command:
kubectl create secret generic custom-self-signed-ca-cert \
--from-file=ca.crt=./custom-root-ca.crt \
--namespace=vmsp-platform
Note: You can name the secret (custom-self-signed-ca-cert in this example) whatever you prefer. The key inside the secret should be ca.crt, the key cert-manager is expected to read. The --from-file=ca.crt=<path> form used above sets the key name explicitly, regardless of the file name.
The cert-manager process watches for secrets with a specific label to include them in the trust bundle. Apply the label trust.vmsp.vmware.com/bundle=platform-trust to your newly created secret.
kubectl label secret custom-self-signed-ca-cert \
trust.vmsp.vmware.com/bundle=platform-trust \
--namespace=vmsp-platform
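To confirm that the label was applied, you can list the secrets in vmsp-platform that match the label selector. The sketch below wraps that query in a hypothetical helper, secret_has_trust_label; the helper name is an assumption, while the namespace and label come from the steps above.

```shell
# Hypothetical helper: succeeds if the named secret in vmsp-platform carries
# the trust.vmsp.vmware.com/bundle=platform-trust label.
secret_has_trust_label() {
  secret_name="$1"
  # "-o name" prints one "secret/<name>" line per matching secret.
  kubectl get secret -n vmsp-platform \
    -l trust.vmsp.vmware.com/bundle=platform-trust -o name |
    grep -qxF "secret/${secret_name}"
}

# Example usage:
#   secret_has_trust_label custom-self-signed-ca-cert && echo "label present"
```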
The reconciliation process is automatic but may take a few moments. Verify that your CA has been added to the platform-trust ConfigMap.
kubectl get configmap platform-trust -n prelude -o jsonpath='{.data.bundle\.pem}' > current-bundle.pem
# create separate files for each certificate in the 'current-bundle.pem' file
awk 'BEGIN {c=0} /-----BEGIN CERTIFICATE-----/ {close("cert_" c ".pem"); c++} {print > ("cert_" c ".pem")} END {print c " certificates extracted"}' current-bundle.pem
# print the subject of each extracted certificate. Replace 'N' with the number of certificates reported by the previous command.
for i in $(seq 1 N); do SUBJECT=$(openssl x509 -noout -subject -in "cert_$i.pem"); echo "cert_$i.pem: $SUBJECT"; done
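Reading through every subject by eye is error-prone when the bundle is large. As an alternative, the sketch below compares SHA-256 fingerprints to check whether your CA file appears in the downloaded bundle. ca_in_bundle is a hypothetical helper name, and the approach assumes openssl, awk, and mktemp are available on the jump host.

```shell
# Hypothetical helper: succeeds if the certificate in $1 appears in the
# PEM bundle $2, matching on SHA-256 fingerprint.
ca_in_bundle() {
  ca_file="$1"
  bundle_file="$2"
  want=$(openssl x509 -noout -fingerprint -sha256 -in "$ca_file") || return 2
  workdir=$(mktemp -d)
  # Split the bundle into one file per certificate inside a scratch directory.
  awk -v dir="$workdir" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = dir "/cert_" n ".pem" }
    out { print > out }
    /-----END CERTIFICATE-----/ { close(out); out = "" }
  ' "$bundle_file"
  found=1
  for f in "$workdir"/cert_*.pem; do
    [ -e "$f" ] || continue
    got=$(openssl x509 -noout -fingerprint -sha256 -in "$f" 2>/dev/null) || continue
    if [ "$got" = "$want" ]; then
      found=0
      break
    fi
  done
  rm -rf "$workdir"
  return "$found"
}

# Example usage (file names taken from the commands in this article):
#   ca_in_bundle ./custom-root-ca.crt current-bundle.pem && echo "CA is trusted"
```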
If the tmc-bootstrapper pod does not recover on its own and reach the Running state, delete the tmc-bootstrapper job in the vmware-system-vksm namespace:
kubectl -n vmware-system-vksm delete job tmc-bootstrapper
This should cause the job to be recreated automatically. The pod created for the new job should start successfully and make progress.
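To watch for the recreated pod, you can filter the pod list down to the bootstrapper pods and their status. This is a minimal sketch: bootstrapper_pod_status is a hypothetical helper name, and it assumes the default kubectl table layout in which STATUS is the third column. Any extra arguments, such as --kubeconfig, are passed through to kubectl.

```shell
# Hypothetical helper: prints "<pod name>: <status>" for each tmc-bootstrapper
# pod in the vmware-system-vksm namespace.
bootstrapper_pod_status() {
  kubectl "$@" -n vmware-system-vksm get pods --no-headers |
    awk '$1 ~ /^tmc-bootstrapper-/ { print $1 ": " $3 }'
}

# Example usage against the workload cluster:
#   bootstrapper_pod_status --kubeconfig ./kubernetes-cluster-qo7v-kubeconfig.yaml
```

A status other than Error, together with the absence of x509 errors in the pod logs, suggests that the new trust bundle has been picked up.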
After performing the resolution steps, verify the fix: