vCenter 6.7 backup fails with ERROR: Timeout! Failed to complete in 72000 seconds after stuck at 95%

Article ID: 318486

Updated On: 03-10-2025

Products

VMware vCenter Server

Issue/Introduction

  • vCenter database is greater than 300 GB. Verify this from the VCSA command line or from the VAMI:

    • Run the following command on the vCenter Server Appliance command line:

df -h

    • Log in to the vSphere Appliance Management Interface (VAMI) and select Monitor > Disks

  • Backup progress is stuck at 95% in VAMI
  • The BackupManager.py process is in a sleeping state. To confirm, follow these steps:
    1. Collect the process IDs (PIDs) for BackupManager.py
root@vcsa [~]# ps -eaf | grep "backup"

root     13443  1844  2 Jul23 ?        00:22:45 /usr/bin/python /usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/BackupManager.py
root     13473 13443  0 Jul23 ?        00:00:00 /usr/bin/python /usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/BackupManager.py
    2. Confirm that each PID found in the first step is in the sleeping state. For example, for PID 13473:
root@vcsa [~]# cat /proc/13473/status

Name: python
State: S (sleeping)
Tgid: 13473
Ngid: 0
Pid: 13473
PPid: 13443

<snip>
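
The manual check above can also be scripted. The following is a minimal sketch, assuming Python 3 is available and that the BackupManager.py processes are readable under /proc. The helper names (parse_proc_status, is_sleeping, backup_manager_pids) are hypothetical and are not part of the appliance tooling.

```python
import os


def parse_proc_status(text):
    """Parse the 'Key: value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_sleeping(status_text):
    """Return True if the State field starts with 'S' (sleeping)."""
    return parse_proc_status(status_text).get("State", "").startswith("S")


def backup_manager_pids(proc_root="/proc"):
    """Yield PIDs whose command line mentions BackupManager.py."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # the process exited or is not readable
        if "BackupManager.py" in cmdline:
            yield int(entry)


if __name__ == "__main__":
    for pid in backup_manager_pids():
        try:
            with open("/proc/%d/status" % pid) as f:
                state = parse_proc_status(f.read()).get("State")
        except OSError:
            continue  # the process exited between the two reads
        print(pid, state)
```

A sleeping state on its own is not conclusive, because idle processes also sleep. Treat it as one symptom alongside the stalled 95% progress.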



Environment

VMware vCenter Server Appliance 6.7.x

Cause

When backing up a large vCenter database (>300 GB), different parts of the backup process must coordinate the handling of large amounts of data. These parts can get stuck waiting for each other (a "deadlock"). The backup then stalls at 95% completion and eventually times out after 72,000 seconds (20 hours).
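
The article does not describe the exact mechanism. However, the workaround (reading the status queue before joining the process) is consistent with a documented Python multiprocessing pitfall: a child process that has put data on a queue does not exit until that data is flushed, so a parent that joins first can wait indefinitely. The following is a minimal sketch of that pattern, not the appliance code. It assumes Linux with the "fork" start method, and the names worker, run_backup_step and PAYLOAD_SIZE are hypothetical.

```python
import multiprocessing as mp

PAYLOAD_SIZE = 1000000  # large enough to exceed a typical OS pipe buffer


def worker(status_q, payload_size):
    # The child reports a large status payload and then returns.
    # It cannot fully exit until the payload has been flushed to the pipe.
    status_q.put("x" * payload_size)


def run_backup_step(join_first, timeout=1.0, payload_size=PAYLOAD_SIZE):
    """Return (stalled, status_length) for one child run."""
    ctx = mp.get_context("fork")  # assumption: Linux, as on the appliance
    status_q = ctx.Queue()
    proc = ctx.Process(target=worker, args=(status_q, payload_size))
    proc.start()

    stalled = False
    if join_first:
        # Problem order: wait for the child before draining its queue.
        proc.join(timeout=timeout)
        stalled = proc.is_alive()  # True means the join timed out

    # Safe order: drain the queue first, then join.
    status = status_q.get(timeout=30)
    proc.join()
    return stalled, len(status)


if __name__ == "__main__":
    print("join first :", run_backup_step(join_first=True))
    print("drain first:", run_backup_step(join_first=False))
```

In this sketch the join uses a short timeout so that the demonstration recovers. With a small payload the child can usually exit before the queue is read, which would explain why only large databases are affected.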

Resolution

This issue is resolved in vCenter Server 6.7 U3j and in vCenter Server 7.0.

If you are using vCenter Server 6.7, it is strongly advised to update to the latest vCenter Server 6.7 release. If you are using vCenter Server 7.0, it is advised to update to the latest vCenter Server 7.0 release. To find the vCenter Server download packages:

  1. Log in to the Broadcom Support Portal
  2. Select "My Downloads"
  3. Select "VMware vCenter Server"
  4. Select the major version, for example "VMware vCenter Server 7.x"
  5. Select the most recent release, such as "7.0U3t"

Workaround:

  1. Take a backup of the file /usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/util/Proc.py
  2. Modify the file by moving lines 140 and 141:
140 procRecord.process.join(timeout=timeout)
141 procRecord.joined = True
     so that they follow line 144:
144 procRecord.status = procRecord.process.statusQ.get(False)
  3. Restart the applmgmt service (service-control --restart applmgmt)
  4. Start the backup process.
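
If the edit in step 2 is scripted rather than made by hand, it is safer to confirm first that the line numbers match this article, because they may differ between builds. The following is a minimal sketch of such a check and move, operating on a list of lines. The function names (check_expected, move_block_after, patched) are hypothetical, and reading and writing Proc.py is left to the operator after the backup in step 1.

```python
# Text the article shows on each 1-based line of Proc.py.
EXPECTED = {
    140: "procRecord.process.join(timeout=timeout)",
    141: "procRecord.joined = True",
    144: "procRecord.status = procRecord.process.statusQ.get(False)",
}


def check_expected(lines, expected=EXPECTED):
    """Raise ValueError unless each 1-based line contains the expected text."""
    for number, text in sorted(expected.items()):
        if number > len(lines) or text not in lines[number - 1]:
            raise ValueError("line %d does not contain %r" % (number, text))


def move_block_after(lines, first, last, after):
    """Return a copy with 1-based lines first..last moved to follow line after."""
    if not (1 <= first <= last < after <= len(lines)):
        raise ValueError("expected first <= last < after, all within the file")
    block = lines[first - 1:last]
    between = lines[last:after]
    return lines[:first - 1] + between + block + lines[after:]


def patched(lines):
    """Verify the article's line numbers, then move lines 140-141 after 144."""
    check_expected(lines)
    return move_block_after(lines, 140, 141, 144)
```

The sketch moves the lines unchanged, so review the indentation of the result by hand before restarting the service.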