vCenter Manual Remediation In Case Harbor Upgrade Failed
search cancel

vCenter Manual Remediation In Case Harbor Upgrade Failed

book

Article ID: 311959

calendar_today

Updated On:

Products

VMware vCenter Server

Issue/Introduction

To document a critical bug that may arise during version upgrade of WCP cluster wherein, harbor component fails to get upgraded leading to WCP Upgrade Namespaces Cluster task getting stuck at 50%. 

Symptoms:
In WCP supervisor master Virtual Machine, path /var/log/vmware/upgrade-ctl-compupgrade.log the following logs can be seen:

2022-02-13T07:18:16.341Z INFO harborup: Upgrade pending for 1 Harbor registry
2022-02-13T07:19:15.527Z ERROR harborup: Failed to check harbor upgrade status
Traceback (most recent call last):
  File "/usr/lib/vmware-wcp/objects/PodVM/45-harbor/harborup.py", line 84, in doUpgrade time.sleep(SLEEP_TIME_SECONDS)
  File "/usr/lib/vmware-wcp/upgrade/comphelper.py", line 394, in handler
    (compName, timeoutSec))

comphelper.TimeoutException: Component HarborUpgrade upgrade did not finish within 360 seconds timeout deadline
2022-02-13T07:19:15.529Z INFO comphelper: Resetting timeout handler
2022-02-13T07:19:15.529Z ERROR compupgrade: {"error": "Exception", "message": "Failed to check harbor upgrade status", "backtrace": ["  File \"/usr/lib/vmware-wcp/upgrade/compupgrade.py\", line 252, in do\n    comp.doUpgrade(upCtx)\n", "  File \"/usr/lib/vmware-wcp/objects/PodVM/45-harbor/harborup.py\", line 88, in doUpgrade\n    raise Exception(msg) from e\n"]}
2022-02-13T07:19:15.529Z INFO comphelper: Running ['/usr/local/bin/etcdctl', 'get', '/vmware/wcp/upgrade/components/status', '--print-value-only', '--command-timeout=30s', '--dial-timeout=5s']
2022-02-13T07:19:15.569Z DEBUG comphelper: ret=0 out={"controller_id": "4235837bdb94801cd2c655d1d91e09e5", "state": "in-progress", "messages": [], "started_at": "2022-02-13T07:12:28.604551Z", "last_modified": "2022-02-13T07:13:15.477197Z", "progress": {"NSXNCPUpgrade": {"status": "upgraded", "messages": []}, "VMwareSystemLoggingUpgrade": {"status": "upgraded", "messages": []}, "ImageControllerUpgrade": {"status": "upgraded", "messages": []}, "TelegrafUpgrade": {"status": "skipped", "messages": []}, "SphereletComponentUpgrade": {"status": "upgraded", "messages": []}, "SchedextComponentUpgrade": {"status": "upgraded", "messages": []}, "CsiControllerUpgrade": {"status": "upgraded", "messages": []}, "LoadBalancerApiUpgrade": {"status": "skipped", "messages": []}, "NetOperatorUpgrade": {"status": "upgraded", "messages": []}, "RegistryAgentUpgrade": {"status": "upgraded", "messages": []}, "HarborUpgrade": {"status": "processing", "messages": []}, "AppPlatformOperatorUpgrade": {"status": "pending", "messages": []}, "UCSUpgrade": {"status": "pending", "messages": []}, "CapwUpgrade": {"status": "pending", "messages": []}, "TkgUpgrade": {"status": "pending", "messages": []}, "VmOperatorUpgrade": {"status": "pending", "messages": []}, "TMCUpgrade": {"status": "pending", "messages": []}, "LicenseOperatorControllerUpgrade": {"status": "pending", "messages": []}, "PspOperatorUpgrade": {"status": "pending", "messages": []}, "NamespaceOperatorControllerUpgrade": {"status": "pending", "messages": []}}}
 err=


Environment

VMware vCenter Server 7.0.3

Cause

Failure is occurring due to the following error:
  • "Failed to install Harbor helm chart".
  • Helm uninstall old version Harbor failed or timed out. Harbor helm uninstall takes more than 30 minutes.

Resolution

There is no resolution for this issue, While the fix is being worked on please follow the workaround.

Workaround:
Please follow Harbor upgrade documentation for manual remediation: Upgrade Harbor and Migrate Data
In case of Helm uninstall failed, you need to check the environment to figure out why the old Harbor pods were not  deleted through helm uninstall, and safely uninstall old Harbor pods.
The Steps are: 
a. Take a backup of harbor PVC
b. Start WCP upgrade.
b. Incase harbor upgrade fails, install harbor with existing PVC.
For installing harbor with existing PVC refer: https://opengrok.eng.vmware.com/source/xref/main.perforce.1666/bora/vpx/wcp/registry-agent/src/vmware.com/registry-agent/pkg/controller/registry/harbor_helm_chart.go#180
This way the existing harbor projects will still be available.