VCF Operations (VCF 9) fails to retain SFTP Backup Settings due to SDDC Manager Resource Locks

Products

VCF Operations
VMware SDDC Manager / VCF Installer
VMware Cloud Foundation

Issue/Introduction

When attempting to configure File-Based Backups in VCF Operations > Administration > SDDC Manager > Backup Settings, users may experience the following:

Backup site settings are accepted by the UI without error.
Upon refreshing the page, the settings are discarded and not retained.
Manual or scheduled backups cannot be executed.
Manual SFTP connectivity from the SDDC Manager console is successful, but the application fails to save the configuration.
On SDDC Manager, /var/log/vmware/vcf/operationsmanager/operationsmanager.log contains entries similar to the following:
YYYY-MM-DDTHH:MM:SS DEBUG [vcf_om,69d694a322c7278684b7e1ab3e015e36,052b] [c.v.e.s.c.s.a.l.LockingServiceAdapterImpl,pool-2-thread-20] Execution name BACKUP_CONFIGURATION_WORKFLOW, execution ID ########-####-####-####-########, resource type DEPLOYMENT, resource ID null.
YYYY-MM-DDTHH:MM:SS DEBUG [vcf_om,69d694a322c7278684b7e1ab3e015e36,052b] [c.v.e.s.c.util.ContinueOnFailureUtil,pool-2-thread-20] Tracking ContinueOnFailure null flag, Max Retry Count 3 and pre Validate Retry Count 3
YYYY-MM-DDTHH:MM:SS ERROR [vcf_om,69d694a322c7278684b7e1ab3e015e36,052b] [c.v.e.s.o.model.error.ErrorFactory,pool-2-thread-20] [7MK7P8] SDDC_MANAGER_LOCK_NOT_AVAILABLE Lock is unavailable : Deployment level lock cannot be acquired. There are existing resource locks..
com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Lock is unavailable : Deployment level lock cannot be acquired. There are existing resource locks..
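
To check quickly whether a given appliance shows this signature, the log can be searched for the error code from the excerpt above. The following is a minimal sketch: the helper name find_lock_errors is hypothetical, and only the log path and the error string are taken from this article.

```shell
# Hypothetical helper: print the log lines (prefixed with their line numbers)
# that contain the lock error code shown in the excerpt above.
find_lock_errors() {
    # $1: path to the log file; defaults to the path named in this article
    log_file="${1:-/var/log/vmware/vcf/operationsmanager/operationsmanager.log}"
    grep -n "SDDC_MANAGER_LOCK_NOT_AVAILABLE" "$log_file"
}

# Usage on the SDDC Manager console:
#   find_lock_errors
```

If the command prints nothing, the log does not contain this error code and a different cause is likely.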

Environment

VCF 9.x

Cause

This issue is caused by stale or orphaned deployment-level locks within the SDDC Manager orchestrator. When a BACKUP_CONFIGURATION_WORKFLOW attempt is made, it fails to acquire the necessary locks because they are already held by a previous or stalled operation.

Resolution

To resolve this issue, identify the stale locks and clear them manually: first release them through the SDDC Manager local API, then, if any entries remain, delete them from the underlying PostgreSQL database.

Prerequisites

Take a snapshot of the SDDC Manager VM before performing database modifications.
Ensure you have root access to the SDDC Manager console.

Step 1: Identify Stale Locks

Check for active locks via the local API:

curl http://localhost/locks | json_pp

Query the SDDC Manager database to find the lock IDs:

psql -h localhost -U postgres -d platform -c "select * from lock;"

Note the id of each lock returned (e.g., 702b1d7a-2e74-4451-8279-2ec4ecfc8257).

Step 2: Clear Locks from API

Export the locks to a JSON file:

cd /tmp
curl http://localhost/locks | json_pp > releaseLock.json

Edit releaseLock.json, delete the leading and trailing square brackets [ ], and save the file (:wq):

vi releaseLock.json

Push the modified file back to the API to release the locks:

curl -X PUT -H "Content-Type:application/json" http://localhost/locks -d @releaseLock.json
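
As an alternative to editing the file in vi, the bracket removal in this step could be scripted. The following is a minimal sketch, not part of the documented procedure: the helper name strip_outer_brackets is hypothetical, and it assumes json_pp wrote the opening [ and closing ] on their own first and last lines. It mirrors the manual edit, so it also assumes the export contains a single lock object.

```shell
# Hypothetical helper: print a json_pp-formatted file without its outer
# "[" and "]" lines. The first and last lines are removed only if they
# contain nothing but the bracket and optional whitespace.
strip_outer_brackets() {
    sed -e '1{/^[[:space:]]*\[[[:space:]]*$/d;}' \
        -e '${/^[[:space:]]*\][[:space:]]*$/d;}' "$1"
}

# Usage, assuming releaseLock.json was exported as shown in this step:
#   strip_outer_brackets /tmp/releaseLock.json > /tmp/releaseLock.stripped.json
#   mv /tmp/releaseLock.stripped.json /tmp/releaseLock.json
```

Review the resulting file before sending it with the PUT request.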

Step 3: Clear Stale Database Entries

If the lock still appears in the database after the API call (re-run the query from Step 1 to check), delete it by its ID:

psql -h localhost -U postgres -d platform -c "delete from lock where id='<LOCK_ID_FROM_STEP_1>';"
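
Because this step modifies the database directly, it may be worth checking the ID for typos before running the delete. The following is a minimal sketch: the helper name is_uuid is hypothetical, and it assumes lock IDs always use the UUID format of the example ID in Step 1.

```shell
# Hypothetical helper: succeed only if the argument looks like a UUID
# (8-4-4-4-12 hexadecimal groups), the format of the example ID in Step 1.
is_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Usage: run the delete from this step only when the ID passes the check.
#   LOCK_ID='702b1d7a-2e74-4451-8279-2ec4ecfc8257'
#   is_uuid "$LOCK_ID" && psql -h localhost -U postgres -d platform -c "delete from lock where id='$LOCK_ID';"
```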

Step 4: Restart Services

Restart the SDDC Manager services to ensure a clean state:

/opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh

Step 5: Re-configure Backup Settings

Log back into VCF Operations.
Navigate to Backup Settings and re-enter the SFTP Site Settings.
Refresh the UI to verify the settings are now retained.
Configure the Backup Schedule and click Backup Now to verify the fix.

Additional Information

File-Based Backups for SDDC Manager, NSX Manager, and vCenter