Server Pools (specifically port 6443) associated with any Tanzu clusters consistently remain in an "In Progress" state and do not transition to a completed or stable state.
The update-controller sync loop on the Supervisor is stuck in an infinite loop.
var/log/update-controller/sync.log
YYYY-MM-DDTHH:MM:SS INFO network_setting: Network setting changed for if eth0
YYYY-MM-DDTHH:MM:SS INFO network_setting: Network setting changed for if eth0
YYYY-MM-DDTHH:MM:SS INFO network_setting: Network setting changed for if eth0
YYYY-MM-DDTHH:MM:SS INFO network_setting: Network setting changed for if eth0
The journalctl logs confirm that the rule is added and then removed shortly afterward by systemd-networkd:
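One way to confirm that the same message keeps repeating is to strip the leading timestamp from each log line and count what remains. The following is a minimal sketch, not a VMware tool: the function name count_repeated_messages and the sample timestamps are hypothetical, and it assumes each line starts with a single whitespace-free timestamp token as in the excerpt above.

```python
import re
from collections import Counter

# Assumption: every log line begins with one timestamp token followed by whitespace.
LEADING_TIMESTAMP = re.compile(r"^\S+\s+")


def count_repeated_messages(lines):
    """Count log messages after removing the leading timestamp (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = LEADING_TIMESTAMP.sub("", line, count=1)
        counts[message] += 1
    return counts


# Illustrative input only; real timestamps will differ.
sample = [
    "2024-01-01T00:00:01 INFO network_setting: Network setting changed for if eth0",
    "2024-01-01T00:00:02 INFO network_setting: Network setting changed for if eth0",
    "2024-01-01T00:00:03 INFO network_setting: Network setting changed for if eth0",
]
counts = count_repeated_messages(sample)
for message, n in counts.most_common(3):
    print(n, message)
```

The function accepts any iterable of lines, so an open file handle for the sync log can be passed directly. A single message with a steadily growing count suggests the loop described here.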
# journalctl --since today
YYYY-MM-DDTHH:MM:SS <Node-id> systemd-networkd[XXXX]: eth1: Removing route: dst: XX.XX.XX.XX/XX, src: n/a, gw: n/a, prefsrc: n/a, scope: link, table: 200, proto: static, type: unicast
YYYY-MM-DDTHH:MM:SS <Node-id> systemd-networkd[XXXX]: eth1: Removing route: dst: n/a, src: n/a, gw: XX.XX.XX.XX, prefsrc: n/a, scope: global, table: 200, proto: static, type: unicast
YYYY-MM-DDTHH:MM:SS <Node-id> systemd-networkd[XXXX]: Removing routing policy rule: priority: 0, 0.0.0.0/0 -> XX.XX.XX.XX/XX, iif: n/a, oif: n/a, table: 200
YYYY-MM-DDTHH:MM:SS <Node-id> systemd-networkd[XXXX]: Removing routing policy rule: priority: 0, 0.0.0.0/0 -> XX.XX.XX.XX/XX, iif: n/a, oif: n/a, table: 200
VMware vSphere Kubernetes Service - Earlier than vCenter 8u3e (8.0.3.00500)
If no SSO domain change needs to be applied in /etc/vmware/wcp/wcp-schedext-admission-controller-user-whitelist, the function returns updated=False, which translates to sync retry = True and causes the update-controller sync loop to retry forever.
To fix this issue, update vCenter to version 8u3e (8.0.3.00500).
Follow the steps below to work around this issue:
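The retry behavior described above can be illustrated with a small stand-alone sketch. This is not the actual sync.py code: apart from the name sync_sso_domain_change, every function, parameter, and the pass cap are assumptions made for illustration.

```python
def sync_sso_domain_change(domain_changed):
    """Stand-in for the real method (assumed behavior, not the shipped code).

    The whitelist file is only rewritten when the SSO domain changed,
    and "not updated" is reported to the caller as "retry needed".
    """
    updated = domain_changed
    return not updated


def run_sync_loop(apply_sso_check, domain_changed, max_passes=5):
    """Run sync passes until no retry is requested; return the number of passes.

    max_passes is a safety cap for this sketch only; the loop described
    in this article has no such limit and retries forever.
    """
    passes = 0
    while passes < max_passes:
        passes += 1
        retry = False
        if apply_sso_check:
            # Mirrors the pattern: retry = self.sync_sso_domain_change(messages) or retry
            retry = sync_sso_domain_change(domain_changed) or retry
        if not retry:
            break
    return passes


# No domain change and the check enabled: the loop only stops at the cap.
stuck = run_sync_loop(apply_sso_check=True, domain_changed=False)
# Check disabled (the equivalent of commenting the line out): one pass.
worked_around = run_sync_loop(apply_sso_check=False, domain_changed=False)
print(stuck, worked_around)
```

Under these assumptions, disabling the check lets the loop finish after a single pass, which is what the workaround achieves by commenting out the call.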
2. Backup the target file
cp /usr/lib/vmware-wcp/update-controller/sync.py ~
3. Comment out the following line (line 521 in this example)
vi /usr/lib/vmware-wcp/update-controller/sync.py
520 # Sync if domain changes
521 #retry = self.sync_sso_domain_change(messages) or retry # <----- comment out this code
4. Restart the wcp-sync service and check its status
systemctl restart wcp-sync
systemctl status wcp-sync
5. Repeat this operation for the remaining 2 Supervisors.