vCenter Server VAMI backup fails intermittently with Error: "BackupManager encountered an exception"

Article ID: 316517

Products

VMware vCenter Server

Issue/Introduction

  • VAMI-based vCenter Server backup fails intermittently.
  • Error on the VAMI console: "BackupManager encountered an exception"
  • Subsequent backup jobs complete successfully with no errors.
  • Manual backup works as expected.
Log file location on the vCenter Server: /var/log/vmware/applmgmt/backup.log
 
Log snippets from failed backups:

[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [StatsMonitorDBBackup:PID-23629] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is -13, but expected exit codes are [0].
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [StatsMonitorDBBackup:PID-23629] [Proc::GetProcsStatus:Proc.py:327] ERROR: rc: 1, stderr: Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/plugins/../util/Calculate.py", line 59, in <module>
    main(sys.argv[1], sys.argv[2], sys.argv[3])
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/plugins/../util/Calculate.py", line 46, in main
    stdout_obj.write(data)
BrokenPipeError: [Errno 32] Broken pipe  
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [StatsMonitorDBBackup:PID-23629] [Proc::GetProcsStatus:Proc.py:332] INFO: Skip to report the error. 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [StatsMonitorDBBackup:PID-23629] [Proc::GetProcsStatus:Proc.py:345] ERROR: Process returncode is 1, but expected exit codes are [0]. 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [StatsMonitorDBBackup:PID-23629] [Proc::UpdateExceptionStatus:Proc.py:383] ERROR: Checksum not generated at /dev/shm/backupRestoreSumFile-XXXXXXXXXXXXXXXX-m56phujs 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [StatsMonitorDBBackup:PID-23629] [StatsMonitorDB::BackupStatsMonitorDB:StatsMonitorDB.py:125] ERROR: Failed to dispatch dump image of Appliance Stats Monitor database.
Underlying process status. rc: -13 stdout: 
stderr: 
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/StatsMonitorDB.py", line 111, in BackupStatsMonitorDB
    db_path, dump_file)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/StatsMonitorDB.py", line 55, in _dump_sqlite_db
    stdout=PIPE, stdout_fn=dispatch_data)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/util/Proc.py", line 433, in RunCmd
    result = stdout_fn(process.stdout)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/StatsMonitorDB.py", line 50, in dispatch_data
    status)
util.Common.BackupRestoreError: Failed to dispatch dump image of Appliance Stats Monitor database.
Underlying process status. rc: -13
stdout: 
stderr:  
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [VCDB-WAL-Backup:PID-23655] [VCDB::_backup_wal_files:VCDB.py:798] INFO: VCDB backup WAL start not received yet. 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [MainProcess:PID-23351] [Proc::VerifyProcStatusAndGetArchive:Proc.py:158] ERROR: Error at process StatsMonitorDBBackup; rc:-13. 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [MainProcess:PID-23351] [Proc::VerifyProcStatusAndGetArchive:Proc.py:162] ERROR: stderr:Failed to dispatch dump image of Appliance Stats Monitor database.  
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [MainProcess:PID-23351] [Proc::VerifyProcStatusAndGetArchive:Proc.py:172] INFO: Following error message isn't localized:
  stderr:Failed to dispatch dump image of Appliance Stats Monitor database.  
 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [MainProcess:PID-23351] [BackupManager::main:BackupManager.py:592] ERROR: BackupManager encountered an exception: Hit exception inside process StatsMonitorDBBackup: Checksum not generated at /dev/shm/backupRestoreSumFile-XXXXXXXXXXXXXXXX-m56phujs 
 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [MainProcess:PID-23351] [BackupManager::Cleanup:BackupManager.py:406] ERROR: Failed to clean up backup child processes.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/BackupManager.py", line 583, in main
    backupObj.DoBackup()
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/BackupManager.py", line 335, in DoBackup
    self.LaunchBackupProcesses()
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/BackupManager.py", line 302, in LaunchBackupProcesses
    self.ExecBackupsInParallel()
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/BackupManager.py", line 272, in ExecBackupsInParallel
    taskId=self.args.id, operation='BACKUP')
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/util/Proc.py", line 202, in LaunchMultipleProcesses
    timeout, logger)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/util/Proc.py", line 178, in VerifyProcStatusAndGetArchive
    (procRecord.process.name, procRecord.status.excMsg))
Exception: Hit exception inside process StatsMonitorDBBackup: Checksum not generated at /dev/shm/backupRestoreSumFile-XXXXXXXXXXXXXXXX-m56phujs 
 
During handling of the above exception, another exception occurred:
 
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/util/Proc.py", line 251, in CleanupChildProcesses
    proc.wait(timeout=30)
  File "/usr/lib/python3.7/site-packages/psutil/__init__.py", line 1262, in wait
    return self._proc.wait(timeout)
  File "/usr/lib/python3.7/site-packages/psutil/_pslinux.py", line 1459, in wrapper
    return fun(self, *args, **kwargs)
  File "/usr/lib/python3.7/site-packages/psutil/_pslinux.py", line 1637, in wait
    return _psposix.wait_pid(self.pid, timeout, self._name)
  File "/usr/lib/python3.7/site-packages/psutil/_psposix.py", line 104, in wait_pid
    delay = check_timeout(delay)
  File "/usr/lib/python3.7/site-packages/psutil/_psposix.py", line 66, in check_timeout
    raise TimeoutExpired(timeout, pid=pid, name=proc_name)
psutil._exceptions.TimeoutExpired: psutil.TimeoutExpired timeout after 30 seconds (pid=23627)
 
During handling of the above exception, another exception occurred:
 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [VCDBBackup:PID-23627] [VCDB::BackupVCDB:VCDB.py:2057] ERROR: Encounter error during backup VCDB.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 1993, in BackupVCDB
    br_state.isFastBackupRequired())
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 571, in _start_pg_backup
    "backupfast" : 'true' if backup_fast else 'false'})
psycopg2.OperationalError: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request. 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [VCDBBackup:PID-23627] [Proc::UpdateExceptionStatus:Proc.py:383] ERROR: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.  
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXXXX] [VCDBBackup:PID-23627] [VCDB::BackupVCDB:VCDB.py:2070] INFO: Terminate sub process 23655
 
During handling of the above exception, another exception occurred:
 
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXX] [VCDB-WAL-Backup:PID-XXXXXXX] [VCDB::run:VCDB.py:1111] ERROR: Failed to backup WAL files.
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXX] [VCDB-WAL-Backup:PID-XXXXXXX] [VCDB::run:VCDB.py:1112] ERROR: Failed to dispatch WAL meta.
Underlying process status. rc: 9
stdout:
stderr: b'curl: (9) Upload failed: Permission denied (3/-31)\n'
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXXXXXXX] [VCDBBackup:PID-1564169] [VCDB::BackupVCDB:VCDB.py:2057] ERROR: Encounter error during backup VCDB.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 2030, in BackupVCDB
    wal_backup_status = status_queue.get()
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
TypeError: BackupRestoreError.__init__() missing 1 required positional argument: 'status'
[YYYY-MM-DDTHH:MM:SS] [XXXXXXXXX] [VCDBBackup:PID-XXXX] [Proc::UpdateExceptionStatus:Proc.py:384] ERROR: BackupRestoreError.__init__() missing 1 required positional argument: 'status'
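
To check whether a given failure matches the snippets above, the log can be searched for the same error strings. The following is a minimal sketch, not a supported tool: the signature strings and the default path are copied from this article, while the function names and the `SIGNATURES` table are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Error strings copied from the log snippets in this article.
SIGNATURES = {
    "broken_pipe": "BrokenPipeError: [Errno 32] Broken pipe",
    "checksum_missing": "Checksum not generated at",
    "backup_manager": "BackupManager encountered an exception",
    "wal_dispatch": "Failed to dispatch WAL meta",
}


def count_signatures(lines):
    """Count how many lines contain each known failure signature."""
    counts = Counter()
    for line in lines:
        for name, text in SIGNATURES.items():
            if text in line:
                counts[name] += 1
    return counts


def scan_log(path="/var/log/vmware/applmgmt/backup.log"):
    """Read the backup log (path as given in this article) and count signatures."""
    with open(path, errors="replace") as handle:
        return count_signatures(handle)
```

Running `print(dict(scan_log()))` on the appliance would show which of the signatures appear in the current log. A line can match more than one signature, as the BackupManager line in the snippets does.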


Environment

VMware vCenter Server 7.0.3
VMware vCenter Server 7.0.0
VMware vCenter Server 8.x

Cause

The failure reported here is "BrokenPipeError: [Errno 32] Broken pipe".
This usually indicates an intermittent network issue between the vCenter Server and the backup target. Because a manual backup succeeds and later scheduled backups complete, such errors are hard to trace; they can occur whenever there is a brief interruption while backup data is being transferred.
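
The "rc: -13" in the log snippets is consistent with the broken pipe. Assuming the backup framework reports return codes using the convention of Python's `subprocess` module, a negative value -N means the child process was terminated by signal N, and signal 13 on Linux is SIGPIPE. The sketch below decodes such values; the helper name `describe_returncode` is hypothetical.

```python
import signal


def describe_returncode(rc):
    """Describe a return code following the Python subprocess convention."""
    if rc < 0:
        # A negative value -N means the child was terminated by signal N (POSIX).
        try:
            name = signal.Signals(-rc).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-rc} ({name})"
    if rc == 0:
        return "exited normally"
    return f"exited with error code {rc}"


print(describe_returncode(-13))  # on Linux: terminated by signal 13 (SIGPIPE)
```

A process receives SIGPIPE when it writes to a pipe whose reading end has already closed, which is the same condition that raises BrokenPipeError in the traceback above.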


Resolution

To avoid conflicts with other scheduled operations, reschedule the backup to a different time slot (for example, move the existing backup from 12:00 PM to 2:00 PM).
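
When choosing a new time slot, it can help to compare the planned backup window against other jobs known to run on the same day. The following is a rough sketch under stated assumptions: the job names and times in `OTHER_JOBS` are invented for illustration, the backup duration is an estimate supplied by the administrator, and windows that cross midnight are not handled.

```python
from datetime import datetime, timedelta


def conflicting_jobs(backup_start, backup_minutes, other_jobs):
    """Return names of jobs whose window overlaps the backup window.

    Times are "HH:MM" strings on the same day; windows are half-open [start, end).
    """
    fmt = "%H:%M"
    start = datetime.strptime(backup_start, fmt)
    end = start + timedelta(minutes=backup_minutes)
    hits = []
    for name, job_start, job_end in other_jobs:
        js = datetime.strptime(job_start, fmt)
        je = datetime.strptime(job_end, fmt)
        # Two half-open windows overlap when each starts before the other ends.
        if start < je and js < end:
            hits.append(name)
    return hits


# Hypothetical schedule, for illustration only.
OTHER_JOBS = [
    ("vm-snapshot-job", "11:30", "13:00"),
    ("storage-replication", "16:00", "17:00"),
]

print(conflicting_jobs("12:00", 60, OTHER_JOBS))  # ['vm-snapshot-job']
print(conflicting_jobs("14:00", 60, OTHER_JOBS))  # []
```

In this made-up schedule, a one-hour backup at 12:00 overlaps another job, while the same backup at 14:00 does not.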