Supervisor upgrade stuck with error "System error occurred on Master node with identifier ###################"

Article ID: 431424

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • The Supervisor upgrade halts and the UI reports the following error:

    Configuration error
    System error occurred on Master node with identifier "###################". Details: Base configuration of node "###################" failed as a Kubernetes node. See /var/log/vmware-imc/configure-wcp.stderr on control plane node "###################" for more information.

  • configure-wcp.stdout on the Supervisor Control Plane nodes shows the following:

    {"error": "Exception", "message": ". Failed to sync images: vmware/kubectl-plugin-vsphere:<version>", "backtrace": ["  File \"/usr/lib/vmware-wcp/upgrade/upgrade-ctl.py\", line 243, in main\n    syncer.sync(dry_run=args.dry_run)\n", "  File \"/usr/lib/vmware-wcp/upgrade/imagesync.py\", line 414, in sync\n    raise Exception('. Failed to sync images: %s' %','.join(failedImages))\n"]}

  • /var/log/vmware/upgrade-ctl-cli.log contains errors such as the following:

    <DATE:Time> ERROR imagesync: Error while syncing vmware/kubectl-plugin-vsphere:<version>
    Traceback (most recent call last):
      File "/usr/lib/vmware-wcp/upgrade/imagesync.py", line 394, in sync
        self.syncSingleImageSkopeo(image)
      File "/usr/lib/vmware-wcp/upgrade/imagesync.py", line 271, in syncSingleImageSkopeo
        self.retry(func=sync, logger=logger)
      File "/usr/lib/vmware-wcp/upgrade/imagesync.py", line 68, in retry
        return func()
      File "/usr/lib/vmware-wcp/upgrade/imagesync.py", line 270, in sync
        run(cmd)
      File "/usr/lib/vmware-wcp/upgrade/comphelper.py", line 82, in run
        raise Exception(exMsg)
    Exception: Failed to run command: ['/usr/local/bin/skopeo', '--insecure-policy', 'copy', '--dest-tls-verify=false', 'docker://<IP>:5000/vmware/kubectl-plugin-vsphere:<version>', 'docker://127.0.0.1:5002/vmware/kubectl-plugin-vsphere:<version>'] ret=2 out= err=time="<DATE:Time>" level=fatal msg="initializing source docker://<IP>:5000/vmware/kubectl-plugin-vsphere:<version>: reading manifest <version> in <IP>:5000/vmware/kubectl-plugin-vsphere: manifest unknown"
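
When the log is long, the skopeo failures shown above can be picked out programmatically. The following is a minimal sketch, not a VMware tool: it assumes a copy of upgrade-ctl-cli.log has been read into a string, and the function name find_missing_images and the regular expression are illustrative assumptions based only on the message format in the excerpt above.

```python
import re

# Pattern modeled on the skopeo message in the excerpt above (an assumption,
# not an official log format): the source image reference, followed later on
# the same line by the "manifest unknown" reason.
MANIFEST_UNKNOWN = re.compile(
    r"initializing source docker://(?P<registry>[^/\s]+)/"
    r"(?P<image>[^\s:]+):(?P<tag>[^\s:]+): .*manifest unknown"
)


def find_missing_images(log_text):
    """Return sorted, unique (registry, image, tag) tuples for 'manifest unknown' failures."""
    found = set()
    for line in log_text.splitlines():
        match = MANIFEST_UNKNOWN.search(line)
        if match:
            found.add(
                (match.group("registry"), match.group("image"), match.group("tag"))
            )
    return sorted(found)
```

Calling find_missing_images() on the log contents returns one tuple per distinct image that the source registry could not serve, which is useful information to include when opening a support request.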

Environment

vSphere Supervisor

Cause

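The upgrade is blocked because the container image vmware/kubectl-plugin-vsphere:<version>, which is required for the upgrade to complete, is missing on one of the cluster Control Plane nodes. As a result, the image sync fails when skopeo tries to copy the image and the source registry responds with "manifest unknown".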
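
To confirm which images the sync reported as failed, the JSON record from configure-wcp.stdout can be parsed. This is a rough sketch under stated assumptions: it expects a single JSON line in the shape shown in the Issue/Introduction section, and the helper name failed_images is hypothetical rather than part of any VMware tooling.

```python
import json

# Marker text taken from the "message" field in the configure-wcp.stdout excerpt.
PREFIX = "Failed to sync images:"


def failed_images(stdout_line):
    """Return the image references listed in a 'Failed to sync images' JSON record."""
    record = json.loads(stdout_line)
    message = record.get("message", "")
    if PREFIX not in message:
        return []
    # The message lists the failed images as a comma-separated string.
    listed = message.split(PREFIX, 1)[1]
    return [name.strip() for name in listed.split(",") if name.strip()]
```

An empty list means the record does not describe an image sync failure, so a different cause should be investigated.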

Resolution

If you encounter the above issue, please contact Broadcom Support.