This article documents a critical bug that can occur during a version upgrade of a WCP cluster: the Harbor component fails to upgrade, which leaves the WCP Upgrade Namespaces Cluster task stuck at 50%.
Symptoms:
On the WCP supervisor master virtual machine, the log file /var/log/vmware/upgrade-ctl-compupgrade.log contains entries similar to the following:
2022-02-13T07:18:16.341Z INFO harborup: Upgrade pending for 1 Harbor registry
2022-02-13T07:19:15.527Z ERROR harborup: Failed to check harbor upgrade status
Traceback (most recent call last):
  File "/usr/lib/vmware-wcp/objects/PodVM/45-harbor/harborup.py", line 84, in doUpgrade
    time.sleep(SLEEP_TIME_SECONDS)
  File "/usr/lib/vmware-wcp/upgrade/comphelper.py", line 394, in handler
    (compName, timeoutSec))
comphelper.TimeoutException: Component HarborUpgrade upgrade did not finish within 360 seconds timeout deadline
2022-02-13T07:19:15.529Z INFO comphelper: Resetting timeout handler
2022-02-13T07:19:15.529Z ERROR compupgrade: {"error": "Exception", "message": "Failed to check harbor upgrade status", "backtrace": [" File \"/usr/lib/vmware-wcp/upgrade/compupgrade.py\", line 252, in do\n comp.doUpgrade(upCtx)\n", " File \"/usr/lib/vmware-wcp/objects/PodVM/45-harbor/harborup.py\", line 88, in doUpgrade\n raise Exception(msg) from e\n"]}
2022-02-13T07:19:15.529Z INFO comphelper: Running ['/usr/local/bin/etcdctl', 'get', '/vmware/wcp/upgrade/components/status', '--print-value-only', '--command-timeout=30s', '--dial-timeout=5s']
2022-02-13T07:19:15.569Z DEBUG comphelper: ret=0 out={"controller_id": "4235837bdb94801cd2c655d1d91e09e5", "state": "in-progress", "messages": [], "started_at": "2022-02-13T07:12:28.604551Z", "last_modified": "2022-02-13T07:13:15.477197Z", "progress": {"NSXNCPUpgrade": {"status": "upgraded", "messages": []}, "VMwareSystemLoggingUpgrade": {"status": "upgraded", "messages": []}, "ImageControllerUpgrade": {"status": "upgraded", "messages": []}, "TelegrafUpgrade": {"status": "skipped", "messages": []}, "SphereletComponentUpgrade": {"status": "upgraded", "messages": []}, "SchedextComponentUpgrade": {"status": "upgraded", "messages": []}, "CsiControllerUpgrade": {"status": "upgraded", "messages": []}, "LoadBalancerApiUpgrade": {"status": "skipped", "messages": []}, "NetOperatorUpgrade": {"status": "upgraded", "messages": []}, "RegistryAgentUpgrade": {"status": "upgraded", "messages": []}, "HarborUpgrade": {"status": "processing", "messages": []}, "AppPlatformOperatorUpgrade": {"status": "pending", "messages": []}, "UCSUpgrade": {"status": "pending", "messages": []}, "CapwUpgrade": {"status": "pending", "messages": []}, "TkgUpgrade": {"status": "pending", "messages": []}, "VmOperatorUpgrade": {"status": "pending", "messages": []}, "TMCUpgrade": {"status": "pending", "messages": []}, "LicenseOperatorControllerUpgrade": {"status": "pending", "messages": []}, "PspOperatorUpgrade": {"status": "pending", "messages": []}, "NamespaceOperatorControllerUpgrade": {"status": "pending", "messages": []}}}
err=
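
The component status JSON printed after out= in the DEBUG line above is hard to read in its raw form. The following is a minimal Python sketch, not part of the WCP tooling, that groups the components by their reported status so the stuck one stands out. The function names group_components_by_status and find_stuck_components are hypothetical, and the sketch assumes the JSON has already been copied out of the log (or from the etcdctl output) into a string.

```python
import json
from collections import defaultdict


def group_components_by_status(status_json):
    """Group component names by the status found under the "progress" key.

    Assumes the layout seen in the log excerpt:
    {"progress": {"<ComponentName>": {"status": "...", "messages": []}, ...}}
    """
    status = json.loads(status_json)
    groups = defaultdict(list)
    for name, info in status.get("progress", {}).items():
        groups[info.get("status", "unknown")].append(name)
    return dict(groups)


def find_stuck_components(status_json):
    """Return the components still reported as "processing"."""
    return group_components_by_status(status_json).get("processing", [])


if __name__ == "__main__":
    # Shortened sample based on the status document in the log excerpt.
    sample = (
        '{"state": "in-progress", "progress": {'
        '"RegistryAgentUpgrade": {"status": "upgraded", "messages": []}, '
        '"HarborUpgrade": {"status": "processing", "messages": []}, '
        '"TkgUpgrade": {"status": "pending", "messages": []}}}'
    )
    print(find_stuck_components(sample))  # prints ['HarborUpgrade']
```

Applied to the full status document shown in the log, the only component in the processing state is HarborUpgrade; every component listed after it remains pending.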