vCenter backup fails when including Supervisor cluster

Article ID: 388368

Products

vSphere with Tanzu

Issue/Introduction

  • vCenter backup fails for both manual and scheduled backups.

  • When Supervisor is not included in the vCenter backup, the backup completes.

  • In /var/log/vmware/wcp/wcpsvc.log:

YYYY-MM-DD HH:MM:SS.964Z error wcp [backup/taker.go:166] [opID=backup-########-####-####-####-############] Backup failed for CPVM VirtualMachine:vm-<ID>. Error: failed to run cmd /usr/lib/vmware-wcp/backup-restore/backup.py on CPVM VirtualMachine:vm-<ID>. rc: 1, err: <nil>
YYYY-MM-DD HH:MM:SS.964Z error wcp [backup/jobs.go:166] [opID=backup-########-####-####-####-############] Failed to backup Supervisor ########-####-####-####-############. Err failed to run cmd /usr/lib/vmware-wcp/backup-restore/backup.py on CPVM VirtualMachine:vm-<ID>. rc: 1, err: <nil>

  • In /var/log/vmware/applmgmt/backup.log:

YYYY-MM-DD HH:MM:SS.493 [20250131-091423-24322831] [ComponentScriptsBackup:PID-3398488] [ComponentScripts::ComponentScriptsBackup:ComponentScripts.py:106] ERROR: Component backup command "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py --backup" failed 1.
YYYY-MM-DD HH:MM:SS.493 [20250131-091423-24322831] [ComponentScriptsBackup:PID-3398488] [Log::run:Log.py:64] ERROR: }Failed to take Supervisor ########-####-####-####-############ backup: Supervisor backup task 'vim.Task:task-#####' failedFailed to write backup content: Supervisor backup task 'vim.Task:task-#####' failed
YYYY-MM-DD HH:MM:SS.493 [20250131-091423-24322831] [ComponentScriptsBackup:PID-3398488] [ComponentScripts::ComponentScriptsBackup:ComponentScripts.py:135] ERROR: Error during component supervisors backup
Underlying process status. rc: 1
stdout:
stderr:
Traceback (most recent call last)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ComponentScripts.py", line 110, in ComponentScriptsBackup
    raise BackupRestoreError(('Error during component %s backup' %
util.Common.BackupRestoreError: Error during component supervisors backup
Underlying process status. rc: 1
stdout:
stderr:
YYYY-MM-DD HH:MM:SS.499 [20250131-091423-24322831] [MainProcess:PID-3395528] [Proc::VerifyProcStatusAndGetArchive:Proc.py:159] ERROR: Error at process ComponentScriptsBackup; rc:1.

  • In /var/log/vmware/wcp/supervisors_backup_restore.log:

YYYY-MM-DD HH:MM:SS,362 __main__ ERROR - Failed to take Supervisor ########-####-####-####-############ backup: Supervisor backup task 'vim.Task:task-#####' failed
Traceback (most recent call last):
  File "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py", line 276, in _take_supervisor_backup
    archive_id = self._wait_backup_job(task_ref)
  File "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py", line 339, in _wait_backup_job
    raise Exception("Supervisor backup task %s failed" % task_ref)
Exception: Supervisor backup task 'vim.Task:task-#####' failed
YYYY-MM-DD HH:MM:SS,364 __main__ ERROR - Failed to write backup content: Supervisor backup task 'vim.Task:task-#####' failed
Traceback (most recent call last):
  File "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py", line 383, in _backup_stream_writer
    raise Exception(str(archive_location_info))  # There was an error taking backup.
Exception: Supervisor backup task 'vim.Task:task-#####' failed
YYYY-MM-DD HH:MM:SS,370 __main__ ERROR - The operation failed with error Supervisor backup task 'vim.Task:task-#####' failed
Traceback (most recent call last):
  File "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py", line 595, in main
    bt.take()
  File "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py", line 258, in take
    self._backup_stream_writer()
  File "/etc/vmware/backup/component-scripts/wcp/supervisors_backup_restore.py", line 383, in _backup_stream_writer
    raise Exception(str(archive_location_info))  # There was an error taking backup.
Exception: Supervisor backup task 'vim.Task:task-#####' failed

  • After cleanup is performed on the Supervisor control plane (CP) nodes per KB 381590, the backup still fails.
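To check quickly whether all three logs show the signatures quoted above, a small helper along these lines may be useful. This is a minimal sketch, not a supported tool: the function name `check_backup_logs` and its optional root-prefix argument are hypothetical, and the search strings are fixed fragments copied from the excerpts in this article.

```shell
# Sketch: report which of the three logs contain the failure signatures quoted above.
# check_backup_logs and its optional ROOT argument are hypothetical names.
check_backup_logs() {
    root="${1:-}"   # empty on the appliance; a directory prefix for an extracted log bundle
    hits=0
    # Each entry is "<log path>|<fixed string taken from the excerpts above>"
    for entry in \
        "/var/log/vmware/wcp/wcpsvc.log|Failed to backup Supervisor" \
        "/var/log/vmware/applmgmt/backup.log|Error during component supervisors backup" \
        "/var/log/vmware/wcp/supervisors_backup_restore.log|Failed to take Supervisor"
    do
        file="${root}${entry%%|*}"     # text before the first "|"
        pattern="${entry#*|}"          # text after the first "|"
        if grep -qF -- "$pattern" "$file" 2>/dev/null; then
            echo "MATCH   $file"
            hits=$((hits + 1))
        else
            echo "NOMATCH $file"
        fi
    done
    echo "$hits of 3 logs show the failure signature"
}
```

Calling `check_backup_logs` with no argument reads the logs in place; passing a directory (for example, the root of an extracted support bundle) prefixes each path with it. A missing or unreadable file is reported as NOMATCH.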



Environment

vSphere with Tanzu

Cause

This issue occurs when an object still references an older registry agent image.
If the service associated with that object is no longer in use, the registry agent is not updated to later images.

Resolution

  1. List the registry agent images:

    crictl images | grep -i registry-agent

    Example:
    crictl images | grep -i registry-agent
    localhost:5000/vmware/registry-agent                                                    0.1.10.24211112
      
  2. List all objects in the vmware-system-registry namespace, then describe each object to check whether it references an older registry agent image.

    k get all -n vmware-system-registry

  3. In the describe output for each object, check Pod Template > Containers > Environment:

    Example:
        Environment:
          POD_NAMESPACE:              (v1:metadata.namespace)
          REGISTRY_AGENT_IMAGE:      localhost:5000/vmware/registry-agent:0.1.10.24211112
          KUBERNETES_SERVICE_HOST:   127.0.0.1

  4. If any object shows a REGISTRY_AGENT_IMAGE tag that differs from the tag listed in step 1, contact Broadcom Support.
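The comparison in steps 1 through 4 can be sketched as a few text filters. This is only an illustration under stated assumptions: the function names `node_tags`, `referenced_tags`, and `stale_tags` are hypothetical, the filters assume the output layouts shown in the examples above, and the commented usage assumes `k describe all` prints the Environment block for every object in the namespace.

```shell
# Sketch: compare the registry-agent tag on the node with the tags objects reference.
# All three function names are hypothetical helpers, not part of any VMware tooling.

# Read `crictl images` output on stdin; print the TAG column of registry-agent rows.
node_tags() {
    awk '$1 ~ /registry-agent$/ { print $2 }' | sort -u
}

# Read describe output on stdin; print the tag of every REGISTRY_AGENT_IMAGE value.
# The tag is taken as the text after the last ":" in the image reference.
referenced_tags() {
    awk '$1 == "REGISTRY_AGENT_IMAGE:" { n = split($2, parts, ":"); print parts[n] }' | sort -u
}

# stale_tags NODE_TAGS_FILE REFERENCED_TAGS_FILE
# Print referenced tags that do not exactly match any tag present on the node.
stale_tags() {
    grep -vxFf "$1" "$2"
}

# Possible usage (file names are placeholders):
#   crictl images | node_tags > /tmp/node_tags.txt
#   k describe all -n vmware-system-registry | referenced_tags > /tmp/ref_tags.txt
#   stale_tags /tmp/node_tags.txt /tmp/ref_tags.txt
```

Any tag printed by `stale_tags` corresponds to the mismatch described in step 4. No output means every referenced tag matches an image present on the node.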