Error: 'status 400 reading CredhubClient#regenerateCertificateById' returned when running command tkgi rotate-certificates <clustername>

Article ID: 409984

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

When trying to rotate certificates for a particular cluster using any variation of the following command:

tkgi rotate-certificates <clustername>

The rotation fails with the following error message:

Error: status 400 reading CredhubClient#regenerateCertificateById(String,CertificateRegenerateRequest,String)

 

This error can be caused by manually running maestro commands to rotate certificate authorities. The TKGI documentation warns against this scenario.

In the document Rotate Kubernetes Cluster Certificates, you will find the following warning:

Never use the CredHub Maestro maestro regenerate ca/leaf --all command to rotate TKGI certificates.

If this is the scenario you are facing, review the article "Wrongly kicked "maestro regenerate ca/leaf --all" in TKGi" and open a case with the support team for troubleshooting.

Cause

This issue can occur when one or more certificate authorities used by the cluster have a preexisting certificate version that is marked as transitional. This has been observed when certificates are manually updated in CredHub but the rotation process is not completed on the TKGI clusters.

 

 

Resolution

Get the control plane deployment logs (pivotal-container-service deployment) by running the following:

bosh -d pivotal-container-service-<ID> logs

 

In the log bundle obtained from the previous step, find the pks-api folder and, inside it, open the file pks-api.log. It should contain entries like the following:

2025-XX-XX XX:XX:XX.XXX  INFO 1661158 --- [https-jsse-nio-9021-exec-2] i.p.pks.bosh.credhub.CredhubService      : Regenerate certificate credential: certificate: 1ef3ca6d-XXXX-XXXX-XXXX-XXXXXXXXad15, transitional: true, allowTransitionalParentToSign: false

202X-XX-XX XX:XX:XX.XXX ERROR XXXXXXX --- [https-jsse-nio-9021-exec-8] i.p.pks.bosh.credhub.CredhubService      : Failed to regenerate certificate 400 {"error":"The maximum number of transitional versions for a given CA is 1."} status 400 reading CredhubClient#regenerateCertificateById(String,CertificateRegenerateRequest,String)
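To locate these entries without reading the whole log, a search along the following lines can help. This is only a minimal sketch: the function name find_regen_errors is hypothetical, and the search patterns are copied from the sample entries above, so they may need adjusting if the wording in your log differs.

```shell
# Hypothetical helper (not part of TKGI or BOSH): print, with line numbers,
# the regenerate attempts and failures recorded in a pks-api.log file.
find_regen_errors() {
  # $1: path to pks-api.log inside the extracted log bundle
  grep -nE 'Regenerate certificate credential|Failed to regenerate certificate' "$1"
}
```

For example, find_regen_errors pks-api/pks-api.log, where the relative path is an assumption about where the bundle was extracted. The certificate ID printed in the INFO entry is the value to search for in the maestro topology.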

 

Double-check this by reviewing the maestro topology from the support bundle. In the support bundle, locate the "certificates" folder and, inside it, look for the file called maestro_topology.yml.

If the maestro topology shows that some, but not all, of the certificate authorities present have a new version that is marked as transitional, you will need to identify the affected clusters.

There are two searches you can do here:

Using the certificate ID from the error found in the pks-api logs, search for the specific certificate causing the error.

If you want to check whether any other certificates show the same behavior, search for the value "transitional: true". Below you can find an example.

  - name: "/p-bosh/service-instance_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/etcd_ca_2018"
    certificate_id: 12348578-0123-1a2b-3c4d-1a2b3c4d5e6f
    signed_by: "/p-bosh/service-instance_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/etcd_ca_2018"
    versions:
    - version_id: z1y2x3w4-1234-9876-2468-a1b2c3d4e5f6
      active: false
      signed_by_version: ''
      deployment_names: []
      signing: false
      transitional: true
      certificate_authority: true
      generated: true
      valid_until: '2028-06-23T20:59:21Z'
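The second search can be sketched as a small awk filter that prints the name and version_id of every version marked transitional. It assumes the file is laid out like the example above (a "- name:" line per certificate, and version_id appearing before transitional within each version); the function name list_transitional is hypothetical.

```shell
# Hypothetical helper: list "<certificate name> <version_id>" for every
# version marked "transitional: true" in a maestro topology file.
list_transitional() {
  # $1: path to maestro_topology.yml
  awk '
    # Remember the most recent certificate name, without its quotes.
    /^[[:space:]]*-[[:space:]]*name:/ { name = $3; gsub(/"/, "", name) }
    # Remember the most recent version_id seen under that certificate.
    /version_id:/ { version = $NF }
    # Report the remembered name and version when the flag is set.
    /transitional:[[:space:]]*true/ { print name, version }
  ' "$1"
}
```

Each output line contains the service instance ID as part of the certificate name, which makes it easier to see which clusters are affected.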

 

Once you find the certificates, record the service instance IDs. Using the version ID (version_id) of the transitional version, confirm that it is NOT signing any existing leaf certificates.
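One way to sketch this check is to count how many entries in the topology reference the transitional version as their signer. This assumes that signed certificates record the signing CA version in their signed_by_version field, which is an inference from the field name in the example above and not something this article confirms; the function name count_signed_by_version is hypothetical.

```shell
# Hypothetical helper: count topology lines whose signed_by_version field
# mentions the given CA version ID. A count of 0 suggests that nothing is
# signed by that version (under the assumption stated above).
count_signed_by_version() {
  # $1: version_id of the transitional CA version
  # $2: path to maestro_topology.yml
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function's exit status clean while still printing the count.
  grep -cE "signed_by_version:.*$1" "$2" || true
}
```

For example, count_signed_by_version z1y2x3w4-1234-9876-2468-a1b2c3d4e5f6 maestro_topology.yml uses the version_id from the example above.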

You can also check if the transitional version has been already pushed to the deployment VMs. To do this:

  • SSH into a master node of any of the service instances identified in the step above
  • Go to /var/vcap/jobs/etcd/config and inspect the certificate in the file etcd_ca.crt
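The inspection in the steps above can be sketched with openssl, printing the expiry date of each certificate found in the file so that it can be compared with the valid_until value of the transitional version in the topology. This is a sketch under assumptions: it presumes openssl and awk are available on the VM, and the function name cert_end_dates is hypothetical.

```shell
# Hypothetical helper: print the expiry date (notAfter) of every PEM
# certificate contained in a file, one line per certificate.
cert_end_dates() {
  # $1: path to a PEM file, for example /var/vcap/jobs/etcd/config/etcd_ca.crt
  awk '
    /-----BEGIN CERTIFICATE-----/ { inside = 1; pem = "" }
    inside { pem = pem $0 "\n" }
    /-----END CERTIFICATE-----/ {
      inside = 0
      cmd = "openssl x509 -noout -enddate"
      printf "%s", pem | cmd   # feed one certificate to openssl
      close(cmd)               # wait for openssl before the next certificate
    }
  ' "$1"
}
```

On the master node, which can typically be reached with bosh -d service-instance_<ID> ssh master/0 (the instance name master/0 is an assumption), run cert_end_dates /var/vcap/jobs/etcd/config/etcd_ca.crt. If one of the printed dates matches the valid_until of the transitional version, that version has likely already been pushed to the VM.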

 

If the above criteria are met, manual CredHub operations will be required to resolve the errors related to the preexisting transitional certificate versions. Please open a case with support.