As a foundation ages, operators perform many certificate rotations, which leave behind inactive or orphaned certificate versions. Because CredHub can store a practically unlimited number of versions per certificate, these versions can eventually fill up the CredHub database. You can see them by running maestro topology with the Maestro CLI. The following output shows what inactive certificate versions look like. Note that the active field is set to false and the deployment_names array is empty, meaning that the version is not used by any deployment:
- version_id: afb1f16d-b84f-4395-8b7d-bfaff20436b6
  active: false
  signed_by_version: ''
  deployment_names: []
  signing: false
  transitional: false
  certificate_authority: false
  generated: true
  valid_until: '2022-07-02T12:20:03Z'
- version_id: b13a1f16-454e-45eb-9ddc-297ad8b1a388
  active: false
  signed_by_version: ''
  deployment_names: []
  signing: false
  transitional: false
  certificate_authority: false
  generated: true
  valid_until: '2021-12-17T05:42:36Z'
- version_id: 867f10fd-5ee0-45f9-acde-8382f33fc837
  active: false
  signed_by_version: ''
  deployment_names: []
  signing: false
  transitional: false
  certificate_authority: false
  generated:
  valid_until: '2021-02-03T04:19:17Z'
- version_id: 0ffac0d9-739d-4af0-afb5-2d18722a9a2c
  active: false
  signed_by_version: ''
  deployment_names: []
  signing: false
  transitional: false
  certificate_authority: false
  generated:
  valid_until: '2020-02-04T15:45:10Z'
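To get a rough idea of how many versions are candidates for cleanup, the topology output can be saved to a file and scanned. The following is a minimal sketch, not part of Maestro: the function name count_inactive_versions is hypothetical, and it assumes the output is laid out as in the sample above, with one field per line and each version starting with "- version_id:".

```shell
#!/bin/sh
# Hypothetical helper (not part of Maestro): count certificate versions that
# are inactive AND not referenced by any deployment.
# Assumption: $1 is a file holding saved `maestro topology` output with one
# YAML field per line, each version starting with "- version_id:".
count_inactive_versions() {
  awk '
    # Count the version collected so far if it is inactive and unused.
    function tally() { if (seen && inactive && unused) n++ }
    /^[[:space:]]*-[[:space:]]*version_id:/ { tally(); seen = 1; inactive = 0; unused = 0 }
    /^[[:space:]]*active:[[:space:]]*false/ { inactive = 1 }
    /^[[:space:]]*deployment_names:[[:space:]]*\[\]/ { unused = 1 }
    END { tally(); print n + 0 }
  ' "$1"
}

# Example usage (the file path is only an illustration):
#   maestro topology > /tmp/topology.yml
#   count_inactive_versions /tmp/topology.yml
```

The count is only an indication. The --dry-run output of the garbage collect command remains the authoritative list of what would be deleted.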
Please note: This KB is an extension of the publicly available documentation on deleting inactive certificate versions.
To check whether a cleanup is needed, either review the BOSH Director VM's vitals or SSH into the VM and run:
df -h /var/vcap/store
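If you want to script this check, for example from a periodic job, the following sketch wraps df. The function names and the 80 percent threshold are illustrative assumptions, not values recommended by the product documentation.

```shell
#!/bin/sh
# Print the usage percentage (without the % sign) of the filesystem that
# holds the path given as $1. Uses the portable `df -P` output format,
# where the fifth column of the second line is the capacity.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage of $1 is at or above threshold $2.
# Returns 2 if the usage could not be determined.
needs_cleanup() {
  usage=$(disk_usage_percent "$1")
  [ -n "$usage" ] || return 2
  [ "$usage" -ge "$2" ]
}

# Example usage (80 is an arbitrary, assumed threshold):
#   if needs_cleanup /var/vcap/store 80; then
#     echo "Persistent disk is filling up; consider garbage collect."
#   fi
```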
To clean up those inactive certificates, you can periodically use CredHub Maestro's garbage collect functionality, which removes old, unused certificate version IDs from storage. By default, garbage collect deletes only unused versions that are older than the currently active version. If you have any concerns, you can back up the credentials stored in CredHub with the CredHub CLI before running garbage collect:
credhub export -f /tmp/credentials-from-credhub
When cleaning up inactive certificate versions, run garbage collect on leaf certificates first. Use the --dry-run flag to see which versions would be deleted:
maestro garbage-collect leaf --all --dry-run
If the --dry-run output lists only obsolete leaf certificate versions, run the command again without --dry-run:
maestro garbage-collect leaf --all
After the inactive versions of leaf certificates have been cleaned up, do the same for the inactive versions of CA certificates. Make sure the leaf certificates are cleaned up first: garbage collect can fail to delete an inactive CA version while unused service instance leaf certificates signed by it still exist. Again, use the --dry-run flag to see which versions would be deleted:
maestro garbage-collect ca --all --dry-run
If the --dry-run output lists only obsolete CA certificate versions, run the command again without --dry-run:
maestro garbage-collect ca --all
Maestro's garbage collect feature never deletes a certificate entirely. It only deletes inactive versions of certificates. If a certificate no longer has any BOSH deployments associated with it, Maestro cannot safely operate on it, because doing so could unintentionally delete certificates that were added to CredHub manually.
This procedure has been tested on Ops Manager v2.10. However, in older versions of Ops Manager the /services/tls_ca certificate tree is excluded from garbage collect by design, so Maestro does not delete its versions even when asked to. This can result in a "could not delete some certificate versions" error, as in the following sample output:
ubuntu@opsmgr:~$ maestro garbage-collect ca --all
not_deleted:
- name: /services/tls_ca
  certificate_id: 13945990-5549-4e84-a063-f1cea1095827
  version_ids:
  - aa7db5c9-1269-4735-a6e5-13e3cbb08796
error: could not delete some certificate versions
Important: The /services/tls_ca certificate tree is very sensitive. Deleting its seemingly inactive versions may delete versions that are in fact still in use. For example, if a newer transitional version exists, running garbage collect may remove the older version that is still in use. This happens because the /services/tls_ca certificate does not list the deployments that use it, and with no BOSH deployments associated with the certificate, the garbage collector cannot tell that it is in use.
To avoid this error, run the command with the --exclude flag, which excludes the Services TLS CA:
maestro garbage-collect ca --all --exclude /services/tls_ca
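The ordering described in this article (leaf certificates first, then CA certificates with /services/tls_ca excluded, each preceded by a dry run) can be captured in a small wrapper. This is only a sketch under assumptions: gc_step and gc_all are hypothetical names, and combining --exclude with --dry-run in one invocation is assumed to work rather than confirmed by this article.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Maestro). For one certificate kind
# ("leaf" or "ca"): show the dry run, then delete only after an explicit "y".
# Any arguments after the kind are passed through to both maestro calls.
gc_step() {
  kind=$1
  shift
  # Assumption: --dry-run can be combined with the other flags shown here.
  maestro garbage-collect "$kind" --all "$@" --dry-run || return 1
  printf 'Delete the %s certificate versions listed above? [y/N] ' "$kind"
  read -r answer
  if [ "$answer" != "y" ]; then
    echo "Skipped $kind."
    return 1
  fi
  maestro garbage-collect "$kind" --all "$@"
}

# Leaf certificates first; CA certificates only if the leaf step completed.
gc_all() {
  gc_step leaf && gc_step ca --exclude /services/tls_ca
}

# Example usage:
#   gc_all
```

Stopping when the leaf step is declined or fails is a deliberate choice here, since CA versions may not be deletable while unused leaf certificates signed by them remain.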
To enhance the safety of the garbage collect command, CredHub Maestro only deletes the versions of a certificate older than the currently active version.
Important: Please review the section "Special case with /services/tls_ca certificate tree" of this article before attempting this procedure.
To delete all inactive versions of CAs or leaf certificates, add the --force flag to the garbage collect command, as in the following examples:
maestro garbage-collect leaf --all --force
OR
maestro garbage-collect ca --all --exclude /services/tls_ca --force
Please note: VMware does not recommend using the --force flag during a rotation.