Backup or restore fails because it's already running

Article ID: 430312

Updated On:

Products

VCF Automation

Issue/Introduction

A backup or restore request fails with the message "Backup/restore is already running, waiting to retry". Subsequent backup/restore requests cannot run for up to 1 hour.

Environment

VCF 9.0

Cause

Backup and restore workflows acquire a lock so that only one backup/restore process can run at a time. The lock is released when the workflow succeeds or fails. In rare cases, however, the process is terminated before it can release the lock. The lock expires after 1 hour, so when this happens, backup/restore workflows cannot run until the lock expires.

Resolution

If a 1-hour delay before the next backup/restore execution is not acceptable, the lock can be removed manually. Before performing the following steps, make sure that no backup/restore workflows are running.

1. Identify one of the VMs that belongs to VCF Automation or Identity Broker and locate its IP address.
2. SSH into that VM and delete the lock:

ssh vmware-system-user@<node ip>
sudo -i
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl delete cm vmsp-backup-state -n vmsp-platform

3. Run backup/restore.
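
As an optional variation on the lock deletion step, the following is a minimal sketch that checks whether the lock ConfigMap exists and prints it before deleting it. It assumes you are in the root shell on the node with KUBECONFIG already exported as shown above. The function name clear_backup_lock and the variables LOCK_CM and LOCK_NS are hypothetical helpers introduced only for illustration and are not part of the product. The contents of the ConfigMap are not described in this article, so the printed output is for your records only.

```shell
# Sketch only: inspect the backup/restore lock before removing it.
# Assumes KUBECONFIG=/etc/kubernetes/admin.conf is already exported.
LOCK_CM="vmsp-backup-state"   # ConfigMap name from this article
LOCK_NS="vmsp-platform"       # namespace from this article

# Hypothetical helper, not shipped with the product.
clear_backup_lock() {
    # If the ConfigMap cannot be read, treat the lock as absent
    # (for example, it already expired or was released).
    if ! kubectl get cm "$LOCK_CM" -n "$LOCK_NS" >/dev/null 2>&1; then
        echo "No lock found; nothing to delete."
        return 0
    fi
    # Print the lock object for the record, then delete it.
    kubectl get cm "$LOCK_CM" -n "$LOCK_NS" -o yaml
    kubectl delete cm "$LOCK_CM" -n "$LOCK_NS"
}
```

After pasting the function into the shell, run clear_backup_lock and then start the backup or restore. The check is only a convenience: it does not confirm that no workflow is running, so that still has to be verified beforehand.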
