Maestro CLI commands take a long time to complete


Article ID: 293668


Updated On:

Products

Operations Manager

Issue/Introduction

Sometimes maestro CLI commands (e.g. maestro topology) produce no output for a long time. If the command is run inside an SSH session with an inactivity timer enabled, the session can be terminated by the inactivity timeout (e.g. 10 minutes), which also interrupts the maestro CLI command.
The maestro topology debug logs show that the command first retrieves all stored certificates from CredHub, then submits one API call per certificate to get its details. For example:

[DEBUG] GET /api/v1/certificates/ HTTP/1.1
Host: x.x.x.x:8844
User-Agent: Go-http-client/1.1
Content-Type: application/json
Accept-Encoding: gzip

[DEBUG] HTTP/1.1 200
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: application/json;charset=UTF-8
Date: Thu, 03 Feb 2022 11:35:36 GMT
Expires: 0
Pragma: no-cache
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Xss-Protection: 1; mode=block

696cc1
{"certificates":[{"name":"/p-bosh/service-instance_cae62e8c-58cd-436b-885f-a34ccbb6b534/streaming_backup_server_cert","versions":[{"id":"38f72086-07ff-4a42-aeb2-851bc4716b86","expiry_date":"2026-12-16T09:00:25Z","transitional":false,"certificate_authority":false,"self_signed":false,"generated":true},{"id":"8c9a7a99-9261-4ad3-b959-755db0743df3","expiry_date":"2026-10-07T06:57:31Z","transitional":false,"certificate_authority":false,"self_signed":false,"generated":true},
.......
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc%2Fgemfire-locator-certificate HTTP/1.1
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc%2Fgemfire-server-certificate HTTP/1.1
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_85b08411-253c-41c6-a902-d59e6c0e6d01%2Fgemfire-locator-certificate HTTP/1.1
[DEBUG] GET /api/v1/data?name=%2Fp-bosh%2Fservice-instance_85b08411-253c-41c6-a902-d59e6c0e6d01%2Fgemfire-server-certificate HTTP/1.1
......
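
To gauge how many per-credential lookups a run issues, a saved debug log can be summarized per deployment. The following is a minimal sketch, not part of maestro: the function name and the log file name in the usage line are hypothetical, and it assumes that the only percent-encoding in the credential names is the uppercase %2F seen in the excerpt above.

```shell
# Hypothetical helper: summarize per-credential lookups in a saved debug log.
# Reads the log on stdin and prints "<count> <deployment>" lines, busiest first.
# Assumption: credential names only contain the %2F escape (an encoded "/").
count_lookups_by_deployment() {
  grep -o 'GET /api/v1/data?name=[^ ]*' \
    | sed -e 's/.*name=//' -e 's/%2F/\//g' \
    | awk -F/ 'NF >= 4 { print $3 }' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}
```

For example, `count_lookups_by_deployment < maestro-debug.log | head` lists the deployments with the most lookups, assuming the debug output was saved to maestro-debug.log.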

 

Further analysis showed a large number of stale certificates associated with "VMware Tanzu GemFire for VMs" (formerly called Pivotal Cloud Cache) service instances that no longer exist.

When a service instance is created, the service broker may also create credentials (including certificates) in CredHub for it. When the service instance is deleted, the service broker should remove the associated credentials from CredHub. In this case, however, the "VMware Tanzu GemFire for VMs" service broker did not clean up the related credentials. As service instances were created and deleted over time, more and more stale credentials accumulated in CredHub. Because maestro issues one API call per certificate, the performance of maestro CLI commands eventually degraded.


Environment

Product Version: 2.7

Resolution

The issue has been reported to the product team. Until a fix is available, users can resolve the issue by manually deleting the stale credentials (those related to "VMware Tanzu GemFire for VMs" service instances that no longer exist) from CredHub with the following steps:
 

1. Log in to the CredHub instance on the BOSH Director
 

2. Export all credentials from CredHub first as a backup

$ credhub export -f <file name to store output> 


3. Try to delete one stale credential to make sure it works

$ credhub delete -n /p-bosh/service-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc/gemfire-locator-certificate 


4. If the above step succeeds, delete all other stale credentials in a batch

$ for cred in $(cat gf-stale-certs.out); do echo "Deleting $cred ..." >> delete-stale-creds.out; credhub delete -n "$cred"; done
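
The batch loop above logs each name before deleting it but does not record whether the deletion succeeded. A variant that logs the outcome per credential could look like the sketch below. The function name and the CREDHUB_BIN override are hypothetical and only there so the loop can be rehearsed with a stub before running it for real; the only credhub call used is the `credhub delete -n` command from step 3.

```shell
# Hypothetical wrapper around the batch delete: logs the outcome of every
# deletion and returns non-zero if any deletion failed.
# $1 = file with one credential name per line, $2 = log file to append to.
# CREDHUB_BIN is an assumed override (defaults to the credhub CLI).
delete_creds() {
  list_file=$1
  log_file=$2
  failed=0
  while IFS= read -r cred || [ -n "$cred" ]; do
    [ -n "$cred" ] || continue            # skip blank lines
    # stdin is redirected so the command cannot consume the list being read
    if "${CREDHUB_BIN:-credhub}" delete -n "$cred" < /dev/null; then
      echo "Deleted $cred" >> "$log_file"
    else
      echo "FAILED $cred" >> "$log_file"
      failed=1
    fi
  done < "$list_file"
  return "$failed"
}
```

A typical call would be `delete_creds gf-stale-certs.out delete-stale-creds.out`, after which `grep FAILED delete-stale-creds.out` shows any credentials that still need attention.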


The file gf-stale-certs.out contains all stale credentials, one per line, as shown in the example below. All credentials (certificates) can be found in the maestro topology --debug output. Consider increasing the SSH session timeout if needed so that the command is not interrupted midway.

$ head -n 5 gf-stale-certs.out
/p-bosh/service-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc/gemfire-locator-certificate
/p-bosh/service-instance_a00a2a5f-0778-45e6-a3e6-04bc8a77d4cc/gemfire-server-certificate
/p-bosh/service-instance_a57203c1-50e9-4c26-a2f8-074b7fa7e83f/gemfire-locator-certificate
/p-bosh/service-instance_a57203c1-50e9-4c26-a2f8-074b7fa7e83f/gemfire-server-certificate
/p-bosh/service-instance_8867e525-61e0-4bf4-a261-8b1a8b84f44a/gemfire-locator-certificate
......
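
One possible way to build such a list from a saved debug log is sketched below. Both function names are hypothetical, as are the file names maestro-debug.log and live-deployments.txt in the usage line. The sketch assumes that the only percent-encoding in the names is the uppercase %2F seen in the debug output, and that a file with the names of deployments that still exist (one per line, for example taken from the output of bosh deployments) has been prepared separately. Review the resulting file by hand before deleting anything.

```shell
# Hypothetical helper: print the decoded, de-duplicated credential names
# requested via /api/v1/data in a saved debug log (read from stdin).
extract_cred_names() {
  grep -o 'GET /api/v1/data?name=[^ ]*' \
    | sed -e 's/.*name=//' -e 's/%2F/\//g' \
    | sort -u
}

# Hypothetical helper: keep only names whose deployment (third path
# component) is NOT listed in the file given as $1.
# If that file is empty, awk treats stdin as the first input and prints
# nothing, which errs on the side of deleting nothing.
filter_stale() {
  awk -F/ 'NR == FNR { live[$0] = 1; next } !($3 in live)' "$1" -
}
```

For example: `extract_cred_names < maestro-debug.log | grep '/gemfire-' | filter_stale live-deployments.txt > gf-stale-certs.out`.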