vCenter backups with the Supervisor backup option selected fail with:
A general system error occurred: failed to run cmd /usr/lib/vmware-wcp/backup-restore/backup.py on CPVM VirtualMachine:vm-#####. Err: /usr/lib/vmware-wcp/backup-restore/backup.py: exit 1
This error is found in the vCenter backup log at /var/log/vmware/applmgmt/backup.log:
[ComponentScriptsBackup:PID-######] [Log::run:Log.py:64] ERROR: reason = 'failed to run cmd /usr/lib/vmware-wcp/backup-restore/backup.py on CPVM VirtualMachine:vm-#####. Err: /usr/lib/vmware-wcp/backup-restore/backup.py: exit 1'
The VM-ID of the affected Supervisor control plane VM appears in the error message as vm-#####.
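If helpful, the MoRef can be pulled out of the backup log with a quick grep. The snippet below runs against a sample error line (the vm-12345 ID is illustrative); on the vCenter appliance you would grep /var/log/vmware/applmgmt/backup.log itself.

```shell
# Sample error line as found in /var/log/vmware/applmgmt/backup.log (ID is illustrative).
line="ERROR: reason = 'failed to run cmd /usr/lib/vmware-wcp/backup-restore/backup.py on CPVM VirtualMachine:vm-12345. Err: /usr/lib/vmware-wcp/backup-restore/backup.py: exit 1'"
# Extract the vm-##### MoRef of the Supervisor control plane VM.
printf '%s\n' "$line" | grep -Eo 'vm-[0-9]+' | head -n 1
```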
Review the logs for that Supervisor (SV) VM, either directly via SSH* or from a log bundle under wcp-support-bundle-domain-c####-#######-##-#-##.tar_extracted/master-vm-#####.tgz.
*To find the VM-ID from the UI, click each SV VM in the vSphere UI and check the URL, which contains the VM-ID.
The log /var/log/vmware/wcp/sv_backup_script.log on the SV VM shows the following error:
ERROR backup: Cmd ['/usr/local/bin/skopeo', '--insecure-policy', 'sync', '--src', 'docker', '--src-tls-verify=false', '--dest', 'dir', '--scoped', 'localhost:5000/vmware/registry-agent:0.0.10.17963681', '/var/lib/vmware/wcp/backup/tmpoxc9gwr7'] failed. ret=2, stdout=, stderr=time="YYYY-MM-DDTHH:MM:SSZ" level=info msg="Tag presence check" imagename="localhost:5000/vmware/registry-agent:0.0.10.17963681" tagged=true
time="2026-02-11T19:18:26Z" level=info msg="Copying image ref 1/1" from="docker://localhost:5000/vmware/registry-agent:0.0.10.17963681" to="dir:/var/lib/vmware/wcp/backup/tmpoxc9gwr7/localhost:5000/vmware/registry-agent:0.0.10.17963681"
time="YYYY-MM-DDTHH:MM:SSZ" level=fatal msg="Error copying ref \"docker://localhost:5000/vmware/registry-agent:0.0.10.17963681\": initializing source docker://localhost:5000/vmware/registry-agent:0.0.10.17963681: reading manifest 0.0.10.17963681 in localhost:5000/vmware/registry-agent: manifest unknown"
Traceback (most recent call last):
  File "/usr/lib/vmware-wcp/backup-restore/backup.py", line 64, in run
    result = subprocess.run(cmd, capture_output=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/local/bin/skopeo', '--insecure-policy', 'sync', '--src', 'docker', '--src-tls-verify=false', '--dest', 'dir', '--scoped', 'localhost:5000/vmware/registry-agent:0.0.10.17963681', '/var/lib/vmware/wcp/backup/tmpoxc9gwr7']' returned non-zero exit status 2.
Note down the manifest name for the resolution section. It varies from case to case; in this example it is registry-agent:0.0.10.17963681.
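The manifest name can also be extracted from the log mechanically. The snippet below demonstrates the pattern against a sample skopeo failure line (the tag is illustrative); on the SV VM you would grep /var/log/vmware/wcp/sv_backup_script.log instead.

```shell
# Sample fatal line as found in /var/log/vmware/wcp/sv_backup_script.log (tag is illustrative).
line='time="YYYY-MM-DDTHH:MM:SSZ" level=fatal msg="Error copying ref \"docker://localhost:5000/vmware/registry-agent:0.0.10.17963681\": manifest unknown"'
# Extract the image:tag manifest name referenced by the failing sync.
printf '%s\n' "$line" | grep -Eo 'vmware/[a-z-]+:[0-9.]+' | head -n 1 | cut -d/ -f2
```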
vSphere Supervisor 7.x, 8.x, 9.x
The issue is caused by a stale job object on the Supervisor cluster. The job was created when the embedded Harbor registry was enabled (deprecated in 2023 in favor of Harbor as a Supervisor Service) and failed to be deleted when the embedded Harbor was deactivated.
Find the job object that references the image from the error found in /var/log/vmware/wcp/sv_backup_script.log and remove the stale job object.
1. SSH into the SV VMs by following https://knowledge.broadcom.com/external/article?legacyId=90194
2. Check all job objects under the vmware-system-registry namespace with:
kubectl get jobs -n vmware-system-registry -o yaml | less
Then, within less, do a forward search (/) for a job that references the manifest:
/registry-agent:0.0.10.17963681
This should return one or two jobs that look similar to the following:
---
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "YYYY-MM-DDT00:00:00Z"
  labels:
    controller-uid: ########-####-####-####-############
    job-name: harbor-##########-controller-registry-##########
  name: harbor-##########-controller-registry-##########
  namespace: vmware-system-registry
  resourceVersion: "389741805"
  uid: ########-####-####-####-############
spec:
  backoffLimit: 6
  completionMode: NonIndexed
  completions: 1
  manualSelector: false
  parallelism: 1
  podReplacementPolicy: TerminatingOrFailed
  selector:
    matchLabels:
      controller-uid: ########-####-####-####-############
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        controller-uid: ########-####-####-####-############
        job-name: harbor-##########-controller-registry-##########
    spec:
      containers:
      - args:
        - -test.coverprofile
        - /tmp/cover.out
        command:
        - /registry-agent
        env:
        - name: CRON_JOB
          value: ROTATE_SYSTEM_ADMIN_CREDENTIAL
        - name: NAMESPACE
          value: vmware-system-registry-##########
        - name: REGISTRY_DOMAIN
          value: ###.###.###.###
        - name: REGISTRY_NAME
          value: harbor-########
        image: localhost:5000/vmware/registry-agent:0.0.10.17963681
        imagePullPolicy: IfNotPresent
        name: harbor-1459597518-controller-registry-credential-rotate-job
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/master: ""
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoSchedule
        key: kubeadmNode
        operator: Equal
        value: master
status:
  completionTime: "YYYY-MM-DDT00:00:00Z"
  conditions:
  - lastProbeTime: "YYYY-MM-DDT00:00:00Z"
    lastTransitionTime: "YYYY-MM-DDT00:00:00Z"
    status: "True"
    type: Complete
  startTime: "YYYY-MM-DDT00:00:00Z"
  succeeded: 1
---
Notice that the image line matches the error message:
image: localhost:5000/vmware/registry-agent:0.0.10.17963681
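As an alternative to paging through less, the jobs can be filtered in one pipeline: kubectl's jsonpath output can print one name/image pair per job, which awk then narrows to the offending manifest. The snippet below runs the filter against an inline sample of that output (job names and tags are illustrative) so the logic is clear; on the SV VM, replace the sample with the real kubectl command shown in the comment.

```shell
# On the SV VM, this list would come from kubectl (one name<TAB>image pair per job):
#   kubectl get jobs -n vmware-system-registry -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.containers[0].image}{"\n"}{end}'
# Inline sample of that output; job names and tags are illustrative.
sample=$(printf 'harbor-1459597518-controller-registry-1459597518\tlocalhost:5000/vmware/registry-agent:0.0.10.17963681\nsome-unrelated-job\tlocalhost:5000/vmware/other-image:1.0.0')
# Keep only the job names whose image matches the manifest from the backup error.
printf '%s\n' "$sample" | awk -F'\t' '$2 ~ /registry-agent:0\.0\.10\.17963681/ {print $1}'
```

Each printed name is a candidate for deletion with kubectl delete job -n vmware-system-registry <name>.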
Using the name and namespace fields, delete the job or jobs found:
name: harbor-##########-controller-registry-##########
namespace: vmware-system-registry
The delete command is:
kubectl delete job -n vmware-system-registry harbor-##########-controller-registry-##########
After this, re-run the backup. It may fail on a different image, in which case repeat the process of finding and removing stale jobs.
If stale jobs outside the vmware-system-registry namespace are found to be blocking the backup, do not delete them; open a case with Broadcom Technical Support to investigate further.