Sometimes maestro CLI commands (e.g. maestro topology) didn't produce any output for a long time. If the command was run inside an SSH session with an inactivity timer enabled, the session could be terminated by the inactivity timeout (e.g. 10 minutes), which interrupted the maestro CLI command as well.
The maestro topology debug logs showed that it first retrieved all stored certificates from Credhub, then submitted a separate API call for each certificate to get its details. For example:
[DEBUG] GET /api/v1/certificates/ HTTP/1.1
Host: x.x.x.x:8844
User-Agent: Go-http-client/1.1
Content-Type: application/json
Accept-Encoding: gzip

[DEBUG] HTTP/1.1 200
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: application/json;charset=UTF-8
Date: Thu, 03 Feb 2022 11:35:36 GMT
Expires: 0
Pragma: no-cache
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Xss-Protection: 1; mode=block

696cc1
{"certificates":[{"name":"/p-bosh/service-instance_cae62e8c-58cd-436b-885f-a34ccbb6b534/streaming_backup_server_cert","versions":[{"id":"38f72086-07ff-4a42-aeb2-851bc4716b86","expiry_date":"2026-12-16T09:00:25Z","transitional":false,"certificate_authority":false,"self_signed":false,"generated":true},{"id":"8c9a7a99-9261-4ad3-b959-755db0743df3","expiry_date":"2026-10-07T06:57:31Z","transitional":false,"certificate_authority":false,"self_signed":false,"generated":true}, .......

[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc%2Fgemfire-locator-certificate HTTP/1.1
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc%2Fgemfire-server-certificate HTTP/1.1
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_85b08411-253c-41c6-a902-d59e6c0e6d01%2Fgemfire-locator-certificate HTTP/1.1
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_85b08411-253c-41c6-a902-d59e6c0e6d01%2Fgemfire-server-certificate HTTP/1.1
......
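To gauge how many per-certificate lookups a single run issued, the requests can be counted in a saved copy of the debug output. This is a minimal sketch: it assumes the debug output was saved to a file, and the file name maestro-debug.log and the function name count_cert_lookups are illustrative only, not part of maestro.

```shell
# Count "GET /api/v1/data?name=..." requests in a saved debug log.
# $1: path to the saved log (hypothetical name: maestro-debug.log)
count_cert_lookups() {
  # grep -o prints one line per match, so the count stays correct even if
  # several requests ended up on the same line of the log.
  grep -o 'GET /api/v1/data?name=[^ ]*' "$1" | wc -l | tr -d ' '
}

# Usage (assumed file name):
#   count_cert_lookups maestro-debug.log
```

A count that is far larger than the number of service instances currently in use would be consistent with leftover credentials.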
Further analysis showed a large number of stale certificates associated with "VMware Tanzu GemFire for VMs" (formerly called Pivotal Cloud Cache) service instances that no longer existed.
When a service instance is created, the service broker may also create credentials (including certificates) in Credhub for that service instance. When the service instance is deleted, the service broker should remove the associated credentials from Credhub. In this case, however, the "VMware Tanzu GemFire for VMs" service broker didn't clean up the related credentials in Credhub. As service instances were created and deleted, more and more stale credentials accumulated in Credhub. Because maestro topology issues one API call per certificate, the performance of the maestro CLI command was eventually affected.
The issue has been reported to the product team. Until a fix is available, users can resolve the issue by manually deleting the stale credentials (those related to non-existent "VMware Tanzu GemFire for VMs" service instances) from Credhub with the following steps:
1. Log in to the Credhub instance on the BOSH director
2. Export all credentials from Credhub as a backup first
$ credhub export -f <file name to store output>
3. Delete one stale credential first to make sure the deletion works
$ credhub delete -n /p-bosh/service-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc/gemfire-locator-certificate
4. If the above step succeeds, delete all other stale credentials in a batch
$ for cred in $(cat gf-stale-certs.out); do echo "Deleting $cred ..." >> delete-stale-creds.out; credhub delete -n "$cred"; done
The file gf-stale-certs.out contains all stale credentials, one per line, as shown in the example below. All credentials (certificates) can be found in the maestro topology --debug output. Consider increasing the SSH session timeout if needed so that the command is not interrupted partway through.
$ head -n 5 gf-stale-certs.out
/p-bosh/service-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc/gemfire-locator-certificate
/p-bosh/service-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc/gemfire-server-certificate
/p-bosh/service-instance_a57203c1-50e9-4c26-a2f8-074b7fa7e83f/gemfire-locator-certificate
/p-bosh/service-instance_a57203c1-50e9-4c26-a2f8-074b7fa7e83f/gemfire-server-certificate
/p-bosh/service-instance_8867e525-61e0-4bf4-a261-8b1a8b84f44a/gemfire-locator-certificate
......
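One possible way to build such a list from the debug output is sketched below. It is not an official procedure and rests on several assumptions: the debug output was saved to a file (maestro-debug.log is a hypothetical name), the GUIDs of service instances that still exist have been collected separately into a file with one GUID per line (live-guids.txt is a hypothetical name, and how it is produced is left to the operator), and the function names are illustrative. Review the resulting file by hand before deleting anything.

```shell
# List the GemFire credential names requested in a saved debug log.
# $1: path to the saved log (hypothetical name: maestro-debug.log)
# Only %2F is decoded back to "/"; other percent-escapes are left as-is.
list_gemfire_creds() {
  grep -o 'GET /api/v1/data?name=[^ ]*' "$1" |
    sed -e 's|^GET /api/v1/data?name=||' -e 's|%2F|/|g' |
    grep '/gemfire-' |
    sort -u
}

# Drop credentials that belong to service instances that still exist.
# $1: file of live service instance GUIDs, one per line (assumed input)
# $2: file of credential names, one per line
drop_live_instances() {
  awk '
    # First file: remember every non-empty GUID.
    FILENAME == ARGV[1] { if ($0 != "") live[$0] = 1; next }
    # Second file: skip names containing a live GUID, print the rest.
    { for (g in live) if (index($0, g) > 0) next; print }
  ' "$1" "$2"
}

# Usage (all file names are assumptions):
#   list_gemfire_creds maestro-debug.log > gf-all-certs.out
#   drop_live_instances live-guids.txt gf-all-certs.out > gf-stale-certs.out
```

Keeping the extraction and the filtering as separate steps leaves an intermediate file (gf-all-certs.out in the usage above) that can be compared against the final list.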