New deployment of vSphere with Tanzu fails.
The Kubernetes status for Supervisor shows messages like:
A general system error occurred. Error message: context deadline exceeded.
No node on Supervisor 'supervisorXX' is accepting vSphere Pods. See Node specific messages for more details
See screenshot for reference:
vCenter 8.0 U3
On the ESXi hosts in the cluster, the following logging is observed in /var/run/log/spherelet.log:
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: E0115 07:00:14.977112 3272467 reflector.go:147] k8s.io/client-go/informers/factory.go:154: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.x.x.x:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dosi6227.de-prod.dk&limit=500&resourceVersion=0": dial tcp 10.x.x.xxx:6443: i/o timeout
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: W0115 07:00:14.976144 3272467 reflector.go:539] k8s.io/client-go/informers/factory.go:154: failed to list *v1.Service: Get "https://10.x.x.x:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.x.x.x:6443: i/o timeout
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: I0115 07:00:14.977162 3272467 trace.go:236] Trace[203089729]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:154 (15-Jan-2025 06:59:44.970) (total time: 30007ms):
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: Trace[203089729]: ---"Objects listed" error:Get "https://10.x.x.x:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.x.x.x:6443: i/o timeout 30006ms (07:00:14.976)
In the WCP log bundle, the following logging is observed in the Supervisor Control Plane VM log at /var/log/pods/kube-system_kube-controller-manager-xxxx/kube-controller-manager/0.log:
2025-01-09T07:54:27.349729821Z stderr F E0109 07:54:27.349582 1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.x.x.x:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.x.x.x:6443: connect: connection refused
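Both excerpts above fail at the same step, a TCP dial to port 6443, but with different reasons. As a rough aid for triaging a larger log file, the following Python sketch counts dial failures per endpoint and reason. The function name and the regular expression are illustrative assumptions based only on the message shapes shown above (`dial tcp <address>:<port>: i/o timeout` and `dial tcp <address>:<port>: connect: connection refused`); it is not part of any VMware tooling and does not handle bracketed IPv6 addresses.

```python
import re
from collections import Counter

# Pattern assumed from the log excerpts in this article:
#   dial tcp <addr>:<port>: i/o timeout
#   dial tcp <addr>:<port>: connect: connection refused
DIAL_FAILURE = re.compile(
    r"dial tcp (?P<addr>[^\s:]+):(?P<port>\d+): "
    r"(?:connect: )?(?P<reason>i/o timeout|connection refused)"
)


def summarize_dial_failures(lines):
    """Count dial failures, keyed by (address, port, reason)."""
    counts = Counter()
    for line in lines:
        for match in DIAL_FAILURE.finditer(line):
            key = (match["addr"], int(match["port"]), match["reason"])
            counts[key] += 1
    return counts


def summarize_file(path):
    """Convenience wrapper: summarize a log file on disk (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_dial_failures(handle)
```

For example, `summarize_file("spherelet.log")` could be run against a copy of the host log. As a general rule of thumb, an `i/o timeout` tends to mean the packets are being dropped on the path (firewall or routing), while `connection refused` tends to mean the address was reached but nothing accepted the connection on that port.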
The Supervisor and the ESXi hosts cannot communicate with each other over TCP port 6443.
Ensure that TCP port 6443 is open between the Supervisor and the ESXi hosts so that they can communicate and the deployment can complete.
Refer to ports.broadcom.com for the full list of required ports.
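To confirm whether port 6443 is reachable before retrying the deployment, a basic TCP connection test can help. The following is a minimal Python sketch, not an official procedure: the function name `check_tcp_port` is hypothetical, and it assumes the script is run from a machine with Python 3 on the same network path as the ESXi hosts, with the Supervisor control plane address supplied by you.

```python
import socket


def check_tcp_port(host, port=6443, timeout=5.0):
    """Try a TCP connection and return (ok, detail).

    detail is one of "connected", "timeout", "refused",
    or "error: <message>" for any other socket error.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No reply within the timeout: traffic is likely dropped on the path.
        return False, "timeout"
    except ConnectionRefusedError:
        # The address answered, but nothing accepted the connection.
        return False, "refused"
    except OSError as exc:
        # Any other failure, for example no route to host.
        return False, f"error: {exc}"


# Usage (replace the placeholder with the Supervisor control plane address):
#   ok, detail = check_tcp_port("<supervisor-address>", 6443)
#   print(ok, detail)
```

A `timeout` result corresponds to the `i/o timeout` messages in spherelet.log, and a `refused` result corresponds to the `connection refused` message in the kube-controller-manager log. A successful connection only shows that the port is open from where the script runs; it does not validate the Kubernetes API itself.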