Symptoms:
This issue only happens when all of the following conditions are met:
1. One or more supervisor namespaces were created with the "Override Supervisor Network" checkbox selected.
You can tell if a namespace is overridden or not from the configure tab on the namespace in the GUI.
Without override:
With override:
2. On vCenter the /var/log/vmware/wcp/wcpsvc.log shows the cluster state flapping from configuring to running and back to configuring. You can verify this with the following command:
# cat /var/log/vmware/wcp/wcpsvc.log | grep "is changed from ConfigStatus"
wcpsvc.log:[timestamp] info wcp [kubelifecycle/kube_instance_conditions.go:90] Config status for WCP cluster <CLUSTER_UUID> is changed from ConfigStatus CONFIGURING to ConfigStatus RUNNING
wcpsvc.log:[timestamp] info wcp [kubelifecycle/kube_instance_conditions.go:90] Config status for WCP cluster <CLUSTER_UUID> is changed from ConfigStatus RUNNING to ConfigStatus CONFIGURING
wcpsvc.log:[timestamp] info wcp [kubelifecycle/kube_instance_conditions.go:90] Config status for WCP cluster <CLUSTER_UUID> is changed from ConfigStatus CONFIGURING to ConfigStatus RUNNING
wcpsvc.log:[timestamp] info wcp [kubelifecycle/kube_instance_conditions.go:90] Config status for WCP cluster <CLUSTER_UUID> is changed from ConfigStatus RUNNING to ConfigStatus CONFIGURING
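To gauge how often the cluster is flapping, the transitions in each direction can be counted. The following is a minimal sketch, not a supported tool: the function name count_config_flaps is hypothetical, and it assumes the log messages use exactly the wording shown in the sample output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count ConfigStatus transitions in a wcpsvc log file.
# Assumes the message wording matches the sample log lines in this article.
count_config_flaps() {
    local log_file=$1
    local to_configuring to_running
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    to_configuring=$(grep -c "is changed from ConfigStatus RUNNING to ConfigStatus CONFIGURING" "$log_file" || true)
    to_running=$(grep -c "is changed from ConfigStatus CONFIGURING to ConfigStatus RUNNING" "$log_file" || true)
    echo "RUNNING->CONFIGURING: $to_configuring"
    echo "CONFIGURING->RUNNING: $to_running"
}
```

After defining the function in a shell on the vCenter Server, it could be called as count_config_flaps /var/log/vmware/wcp/wcpsvc.log. Repeated transitions in both directions would be consistent with the flapping described in this condition.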
3. From inside one of the supervisor control plane VMs, the script below reports that the "EXTERNAL_IP_POOLS_LB" value is changing intermittently. To run the script, SSH into a supervisor control plane VM as root and paste the following script into a file named check-diff.sh:
$ vi check-diff.sh
#!/usr/bin/env bash
# Save a copy of a file each time its modification time changes.
if [ $# -ne 1 ]; then
    echo "Usage: check-diff.sh <file>"
    exit 1
fi
file=$1
if [ ! -f "$file" ]; then
    echo "File not found: $file"
    exit 1
fi
# Modification time in seconds since the epoch (GNU stat, Linux).
mtime() { stat -c %Y "$file"; }
currTs=$(mtime)
# Total snapshots to keep, including the initial copy (update0).
updatesToCollect=5
i=1
cat "$file" > "$file.update0"
while [ "$i" -lt "$updatesToCollect" ]; do
    newTs=$(mtime)
    if [ "$newTs" != "$currTs" ]; then
        currTs=$newTs
        echo "Detected change in \"$file\" at $(date)"
        cat "$file" > "$file.update$i"
        i=$((i+1))
    fi
    sleep 1
done
Then make the file executable:
$ chmod +x check-diff.sh
Then run the script against the node-config file:
$ ./check-diff.sh /dev/shm/wcp_decrypted_data/node-config
Wait a few minutes to see whether the script reports a change. If no messages appear, the file is not changing, and you can press Ctrl+C to exit the script after 4-5 minutes. Otherwise, the script saves the initial copy plus 4 changed copies (5 snapshots in total) and then stops:
root@<SUPERVISOR_HOSTNAME> [ ~ ]# bash ./check-diff.sh /dev/shm/wcp_decrypted_data/node-config
Detected change in "/dev/shm/wcp_decrypted_data/node-config" at Thu Aug 24 21:39:28 UTC 2023
Detected change in "/dev/shm/wcp_decrypted_data/node-config" at Thu Aug 24 21:39:42 UTC 2023
Detected change in "/dev/shm/wcp_decrypted_data/node-config" at Thu Aug 24 21:39:44 UTC 2023
Detected change in "/dev/shm/wcp_decrypted_data/node-config" at Thu Aug 24 21:39:46 UTC 2023
If the script reports changes, run grep to confirm that the change is in the EXTERNAL_IP_POOLS_LB value:
root@<SUPERVISOR_HOSTNAME> [ ~ ]# grep -e '^EXTERNAL_IP_POOLS_LB' /dev/shm/wcp_decrypted_data/*
/dev/shm/wcp_decrypted_data/node-config.update0:EXTERNAL_IP_POOLS_LB = <IP_1>/<NETMASK>,
<IP_2>/<NETMASK>,
<IP_3>/<NETMASK>
/dev/shm/wcp_decrypted_data/node-config.update1:EXTERNAL_IP_POOLS_LB = <IP_2>/<NETMASK>,
<IP_1>/<NETMASK>,
<IP_3>/<NETMASK>
/dev/shm/wcp_decrypted_data/node-config.update2:EXTERNAL_IP_POOLS_LB = <IP_1>/<NETMASK>,
<IP_3>/<NETMASK>,
<IP_2>/<NETMASK>
* Notice how the order of the IP_# entries changes from one snapshot to the next.
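To check whether consecutive snapshots differ only in ordering, the snapshot files written by check-diff.sh can be compared after sorting their contents. The following is a rough sketch rather than a supported tool: the function names are hypothetical, and it assumes that treating commas and equals signs as separators is a reasonable way to split the node-config file into comparable tokens.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the snapshots saved by check-diff.sh.

# Print every token of a file on its own line, sorted.
# Commas and equals signs are treated as separators (an assumption about
# the file layout), so a reordered list yields the same sorted output.
normalize_snapshot() {
    tr ',=' '\n\n' < "$1" \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
        | sort
}

# Classify the difference between two snapshot files.
compare_snapshots() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    elif [ "$(normalize_snapshot "$1")" = "$(normalize_snapshot "$2")" ]; then
        echo "reordered only"
    else
        echo "content changed"
    fi
}

# Compare consecutive snapshots: <base>.update0, <base>.update1, ...
report_snapshots() {
    local base=$1
    local i=0
    while [ -f "$base.update$((i + 1))" ]; do
        echo "update$i -> update$((i + 1)): $(compare_snapshots "$base.update$i" "$base.update$((i + 1))")"
        i=$((i + 1))
    done
}
```

After defining the functions in the same shell, running report_snapshots /dev/shm/wcp_decrypted_data/node-config prints one line per pair of consecutive snapshots. A result of "reordered only" for each pair would match the symptom described here; "content changed" suggests something other than ordering differs and should be inspected manually.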
Affected versions: VMware vSphere prior to 8.0 U2b
This issue is fixed in vCenter Server 8.0 U2b.
Please contact VMware by Broadcom support for assistance in resolving this issue.