When this issue occurs, users cannot connect to the platform using the cf CLI and see the following message:
API down for maintenance
This particular message appears when the Cloud Controller API (CAPI) is locked because a BBR (BOSH Backup and Restore) backup is being performed. Please see the following documentation for more details:
This login message will also persist following a failed backup. This is because the backup unlock scripts outlined in the above article are not run when a backup fails. To confirm this state, use the bosh CLI to SSH into one of the impacted Cloud Controllers and run the following command:
sudo /var/vcap/jobs/bpm/bin/bpm list | grep nginx_maintenance | awk '{ print "nginx_maintenance is:", $3;}'
If the above command returns nginx_maintenance is: running, then the Cloud Controller VM is likely still locked by the pre-backup lock scripts.
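When several Cloud Controller VMs need checking, the same test can be wrapped in a small helper that returns an exit code instead of printing text. This is a minimal sketch, not part of the product: the function name cc_is_locked is hypothetical, and it assumes, as the awk command above does, that the process status is the third whitespace-separated column of the bpm list output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `bpm list` output on stdin and exits 0 if an
# nginx_maintenance process is reported as running, 1 otherwise.
# Assumption: the status is the third column, as in the awk command above.
cc_is_locked() {
  awk '/nginx_maintenance/ && $3 == "running" { found = 1 }
       END { exit (found ? 0 : 1) }'
}
```

On a Cloud Controller VM this could be used as: sudo /var/vcap/jobs/bpm/bin/bpm list | cc_is_locked && echo "Cloud Controller appears locked"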
In the event of a failed backup, the backup cleanup scripts must be run manually using the bbr CLI. This can be done from the Opsman VM using the workflow outlined in step 11 here and summarised below:
# From the Opsman VM
bbr deployment \
--target BOSH-DIRECTOR-IP \
--username BOSH-CLIENT \
--password BOSH-PASSWORD \
--deployment CF-DEPLOYMENT-NAME \
--ca-cert /var/tempest/workspaces/default/root_ca_certificate \
backup-cleanup
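To avoid running the cleanup with a placeholder left unfilled, the command above can be wrapped in a function that first checks its inputs. This is only a sketch: the function name and the variable names (BOSH_DIRECTOR_IP, BOSH_CLIENT_NAME, BOSH_PASSWORD, CF_DEPLOYMENT_NAME) are hypothetical stand-ins for the placeholders in the original command, and the bbr flags are exactly those shown above.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the backup-cleanup command shown above.
# The variable names are hypothetical stand-ins for the placeholders
# BOSH-DIRECTOR-IP, BOSH-CLIENT, BOSH-PASSWORD and CF-DEPLOYMENT-NAME.
run_backup_cleanup() {
  # Refuse to run if any required value is missing or empty.
  if [ -z "${BOSH_DIRECTOR_IP:-}" ] || [ -z "${BOSH_CLIENT_NAME:-}" ] ||
     [ -z "${BOSH_PASSWORD:-}" ] || [ -z "${CF_DEPLOYMENT_NAME:-}" ]; then
    echo "set BOSH_DIRECTOR_IP, BOSH_CLIENT_NAME, BOSH_PASSWORD and CF_DEPLOYMENT_NAME first" >&2
    return 1
  fi

  # Same flags as the original command, with every value quoted.
  bbr deployment \
    --target "${BOSH_DIRECTOR_IP}" \
    --username "${BOSH_CLIENT_NAME}" \
    --password "${BOSH_PASSWORD}" \
    --deployment "${CF_DEPLOYMENT_NAME}" \
    --ca-cert /var/tempest/workspaces/default/root_ca_certificate \
    backup-cleanup
}
```

Once the cleanup completes, re-running the nginx_maintenance check on a Cloud Controller VM is a reasonable way to confirm the lock has been released.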