This issue presents as users being unable to connect to the platform with the cf CLI, which returns the following message:
API down for maintenance
This particular message shows when the Cloud Controller API (CAPI) has been locked due to a BBR backup being performed. Please see the following documentation for more details:
This login message also persists after a failed backup, because the backup unlock scripts outlined in the above article are not run when a backup fails. To confirm this state, use the bosh CLI to SSH into one of the impacted Cloud Controllers and execute the following command:
sudo /var/vcap/jobs/bpm/bin/bpm list | grep nginx_maintenance | awk '{ print "nginx_maintenance is:", $3;}'
If the above command returns nginx_maintenance is: running, then the Cloud Controller VM is likely still locked by the pre-backup lock scripts.
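The check above can also be wrapped in a small helper that turns the bpm output into a single lock state, which is convenient when checking several Cloud Controller VMs in turn. This is only a sketch: the function name maintenance_state is hypothetical, and it assumes, as the awk command above does, that the process status is the third column of the bpm list output.

```shell
# Hypothetical helper (not part of bpm or bbr): reads `bpm list` output on stdin
# and prints "locked", "unlocked", or "unknown".
# Assumption: columns are Name, Pid, Status, so the status is field 3.
maintenance_state() {
  awk '
    $1 == "nginx_maintenance" { found = 1; state = $3 }
    END {
      if (!found)                  print "unknown"    # process not listed at all
      else if (state == "running") print "locked"     # pre-backup lock still in place
      else                         print "unlocked"   # maintenance page is not being served
    }
  '
}
```

On a Cloud Controller VM it could be used as `sudo /var/vcap/jobs/bpm/bin/bpm list | maintenance_state`. A result of "locked" corresponds to the nginx_maintenance is: running case described above.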
In the event of a failed backup, the backup cleanup scripts must be executed manually using the bbr CLI. This can be done from the Opsman VM using the workflow outlined in step 4 here and summarized below:
# From the Opsman VM
bbr deployment \
--target BOSH-DIRECTOR-IP \
--username BOSH-CLIENT \
--password BOSH-PASSWORD \
--deployment CF-DEPLOYMENT-NAME \
--ca-cert /var/tempest/workspaces/default/root_ca_certificate \
backup-cleanup