Symptoms:
- Examining the backup.log file at /var/log/vmware/applmgmt
We can see the following highlighted entries
2019-09-02T01:09:49.553 [VCDBBackup:PID-36378] INFO: Executing command: /opt/vmware/vpostgres/current/bin/psql -U postgres -p 5432 -t -c "select pg_is_in_backup();".
2019-09-02T01:09:49.579 [VCDBBackup:PID-36378] INFO: Postgres on-line backup already finished. Skip stopping backup operation
2019-09-02T01:09:49.579 [VCDBBackup:PID-36378] INFO: Successfully cleaned up for VCDB backup.
2019-09-02T21:01:08.368 [MainProcess:PID-36035] ERROR: Timeout! Failed to complete in 72000 seconds
2019-09-02T21:01:08.369 [MainProcess:PID-36035] ERROR: BackupManager encountered an exception: Timeout! Failed to complete in 72000 seconds.
2019-09-02T21:01:08.371 [MainProcess:PID-36035] INFO: Cleaning up the backup job.
2019-09-02T21:01:08.371 [MainProcess:PID-36035] INFO: Cleaning up all running backup child processes.
2019-09-02T21:01:08.432 [MainProcess:PID-36035] INFO: Cleaning up all sizeFiles.
2019-09-02T21:01:08.433 [MainProcess:PID-36035] INFO: Cleaning up Lotus.
2019-09-02T21:01:08.433 [MainProcess:PID-36035] INFO: Successfully completed Lotus cleanup.
2019-09-02T21:01:08.433 [MainProcess:PID-36035] INFO: Cleaning up ComponentScripts.
2019-09-02T21:01:08.433 [MainProcess:PID-36035] INFO: Cleaning up ConfigFiles.
2019-09-02T21:01:08.433 [MainProcess:PID-36035] INFO: Cleaning up StatsMonitorDB.
2019-09-02T21:01:08.433 [MainProcess:PID-36035] INFO: Cleaning up VCDB.
2019-09-02T21:01:08.434 [MainProcess:PID-36035] INFO: Finishing full database backup
2019-09-02T21:01:08.434 [MainProcess:PID-36035] INFO: Retrieving postgres server listening port
2019-09-02T21:01:08.434 [MainProcess:PID-36035] INFO: Executing command: netstat -plnt.
2019-09-02T21:01:08.508 [MainProcess:PID-36035] INFO: Canceling running pg_start_backup() process.
2019-09-02T21:01:08.509 [MainProcess:PID-36035] INFO: Executing command: ['/opt/vmware/vpostgres/current/bin/psql', '-U', 'postgres', '-p', '5432', '-At', '-c', "SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE pid <> pg_backend_pid() and query ~ '^select\\ pg_xlogfile_name\\(pg_start_backup\\(.*\\)\\)\\;' "].
2019-09-02T21:01:08.540 [MainProcess:PID-36035] INFO: No pg_start_backup() process is running.
2019-09-02T21:01:08.541 [MainProcess:PID-36035] INFO: Checking whether Postgres online backup still in progress
2019-09-02T21:01:08.541 [MainProcess:PID-36035] INFO: Executing command: /opt/vmware/vpostgres/current/bin/psql -U postgres -p 5432 -t -c "select pg_is_in_backup();".
2019-09-02T21:01:08.570 [MainProcess:PID-36035] INFO: Postgres on-line backup already finished. Skip stopping backup operation
2019-09-02T21:01:08.571 [MainProcess:PID-36035] INFO: Successfully cleaned up for VCDB backup.
2019-09-02T21:01:08.571 [MainProcess:PID-36035] INFO: Cleaning up failed backup files.
2019-09-02T21:01:08.728 [MainProcess:PID-36035] INFO: Cleaned up the backup job.
2019-09-02T21:01:09.537 [MainProcess:PID-36035] INFO: Event com.vmware.applmgmt.backup.job.failed.event successfully posted to
http://localhost:8085/sdk
2019-09-02T21:01:09.548 [MainProcess:PID-36035] INFO: Backup job failed.
- Examining carefully the log, isolating the processes related to ComponentScriptsBackup
# grep -i "ComponentScriptsBackup" backup.log
a) vum component backup starts and finishes successfully
2019-09-02T01:01:08.334 [ComponentScriptsBackup:PID-36375] INFO: Starting backup component: vum
2019-09-02T01:01:08.343 [ComponentScriptsBackup:PID-36375] INFO: Execute vum script: /etc/vmware/backup/component-scripts/vum/backup_restore.py --startBackup
2019-09-02T01:01:08.370 [ComponentScriptsBackup:PID-36375] INFO: Dispatching stream.
2019-09-02T01:01:09.316 [ComponentScriptsBackup:PID-36375] INFO: Component vum backup successful.
b) imagebuilder component starts and finishes successfully
2019-09-02T01:01:09.317 [ComponentScriptsBackup:PID-36375] INFO: Starting backup component: imagebuilder
2019-09-02T01:01:09.317 [ComponentScriptsBackup:PID-36375] INFO: Execute imagebuilder script: /etc/vmware/backup/component-scripts/imagebuilder/backup_restore.py --startBackup
2019-09-02T01:01:09.334 [ComponentScriptsBackup:PID-36375] INFO: Dispatching stream.
2019-09-02T01:01:10.633 [ComponentScriptsBackup:PID-36375] INFO: Component imagebuilder backup successful.
c) imagebuilder component starts but there's not a line with INFO: Component rdb backup successful
2019-09-02T01:01:10.637 [ComponentScriptsBackup:PID-36375] INFO: Starting backup component: rbd
2019-09-02T01:01:10.637 [ComponentScriptsBackup:PID-36375] INFO: Execute rbd script: /etc/vmware/backup/component-scripts/rbd/rbd-backup-restore --startBackup
2019-09-02T01:01:10.643 [ComponentScriptsBackup:PID-36375] INFO: Dispatching stream.
--> Missing a line with INFO: Component rdb backup successful