A new deployment of vSphere with Tanzu fails. The Kubernetes status for the Supervisor shows messages similar to:
A general system error occurred. Error message: context deadline exceeded.
No node on Supervisor 'supervisorXX' is accepting vSphere Pods. See Node specific messages for more details
Affected version: vCenter 8.0 U3
Checking /var/run/log/spherelet.log on the ESXi hosts in the cluster, the following logging is observed:
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: E0115 07:00:14.977112 3272467 reflector.go:147] k8s.io/client-go/informers/factory.go:154: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.x.x.x:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dosi6227.de-prod.dk&limit=500&resourceVersion=0": dial tcp 10.x.x.xxx:6443: i/o timeout
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: W0115 07:00:14.976144 3272467 reflector.go:539] k8s.io/client-go/informers/factory.go:154: failed to list *v1.Service: Get "https://10.x.x.x:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.x.x.x:6443: i/o timeout
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: I0115 07:00:14.977162 3272467 trace.go:236] Trace[203089729]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:154 (15-Jan-2025 06:59:44.970) (total time: 30007ms):
2025-01-15T07:00:14.977Z No(13) spherelet[3272489]: Trace[203089729]: ---"Objects listed" error:Get "https://10.x.x.x:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.x.x.x:6443: i/o timeout 30006ms (07:00:14.976)
In the Supervisor Control Plane VM logs in the WCP log bundle, at /var/log/pods/kube-system_kube-controller-manager-xxxx/kube-controller-manager/0.log, the following logging is observed:
2025-01-09T07:54:27.349729821Z stderr F E0109 07:54:27.349582 1 leaderelection.go:332] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.x.x.x:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.x.x.x:6443: connect: connection refused
Communication between the Supervisor and the ESXi hosts is not possible over TCP port 6443.
To complete Supervisor deployment, bidirectional network connectivity is required between the ESXi hosts and the Supervisor control plane.
Check port connectivity between ESXi and the Supervisor with the following steps:
1. Verify that ESXi can connect to port 6443 of the Supervisor FIP by running this command on the ESXi host:
openssl s_client -connect <Supervisor-FIP>:6443
If the connection is successful, the first line of the output shows CONNECTED.
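The same TCP reachability check can also be scripted. This is a minimal sketch using only the Python standard library; the `check_tcp_port` helper name is my own, and the Supervisor FIP placeholder must be replaced with the real address obtained in the NOTE below.

```python
import socket

def check_tcp_port(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Replace the placeholder with the actual Supervisor FIP from decryptK8Pwd.py:
# check_tcp_port("<Supervisor-FIP>", 6443)
```

Unlike `openssl s_client`, this only proves the TCP handshake succeeds; it does not validate the TLS endpoint.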
NOTE: The Supervisor FIP can be retrieved by running the following command on the vCenter Server:
/usr/lib/vmware-wcp/decryptK8Pwd.py
In the output:
root@vcenter [ ~ ]# /usr/lib/vmware-wcp/decryptK8Pwd.py
Read key from file
Connected to PSQL
Cluster: domain-c#: <supervisor cluster domain id>
IP: <Supervisor FIP>
PWD: <password>
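If the FIP needs to be consumed by a script, it can be pulled out of the command output with a simple regular expression. The sketch below assumes output shaped like the sample above; `SAMPLE_OUTPUT` uses illustrative placeholder values (192.0.2.10 is a documentation-reserved address), not real data.

```python
import re

# Placeholder output in the shape printed by decryptK8Pwd.py; the cluster ID,
# IP, and password below are illustrative values only.
SAMPLE_OUTPUT = """Read key from file
Connected to PSQL
Cluster: domain-c8:xxxx
IP: 192.0.2.10
PWD: ********
"""

def extract_fip(output: str):
    """Return the value of the 'IP:' line (the Supervisor FIP), or None."""
    match = re.search(r"^IP:\s*(\S+)", output, re.MULTILINE)
    return match.group(1) if match else None

print(extract_fip(SAMPLE_OUTPUT))  # prints 192.0.2.10 for the sample above
```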
2. Verify that both the management and workload networks of the Supervisor nodes can connect to port 10250 on the ESXi hosts by following the article "vSphere Kubernetes Supervisor ESXi Host with Kubernetes status showing Node is not healthy and is not accepting pods. Details Kubelet stopped posting node status".
If port connectivity between ESXi and the Supervisor fails, check that the underlying networks are routable and that no firewall is blocking the traffic. For the port requirements of other components, refer to the Additional Information section.
Refer to the information below for more detail:
The following table lists the common ports and protocols required for the various components of the Supervisor stack to work correctly.
For the infrastructure group, it details the traffic flows originating from the platform's infrastructure components (such as ESXi and vCenter). These components must be fully installed and configured before enabling the Supervisor.
| Port | Protocol | Source | Destination | Mandatory/Optional | Notes |
|---|---|---|---|---|---|
| 53 | UDP, TCP | ESXi Server(s) Mgmt IP | DNS | Mandatory | Must be enabled during initial infrastructure setup. |
| 123 | UDP | ESXi Server(s) Mgmt IP | NTP | Mandatory | Must be enabled during initial infrastructure setup. |
| 6443 | TCP | ESXi Server(s) Mgmt IP | Supervisor Mgmt IP Pool (VIP)* | Mandatory | Supervisor Mgmt IP Pool (VIP) is the floating IP in the Supervisor Mgmt IP Pool. This is for document purposes only. |
| 10250 | TCP | ESXi Server(s) Mgmt IP | Primary Workload Network IP Pool (Supervisor Service) | Mandatory | |
| 443 | TCP | vCenter | Internet | Optional | Egress Internet traffic. It may not be required for an airgapped setup. If so, would need private endpoints for corresponding services |
| 443, 902, 9080 | TCP | vCenter | ESXi Server(s) Mgmt IP | Mandatory | Must be enabled during initial infrastructure setup. |
| 443, 6443 | TCP | vCenter | Supervisor Mgmt IP Pool | | |
| 22, 443, 5000, 6443 | TCP | vCenter | Supervisor Mgmt IP Pool (VIP)* | Mandatory | Supervisor Mgmt IP Pool (VIP) is the floating IP in the Supervisor Mgmt IP Pool. This is for document purposes only. |
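The TCP rows of the table above can be encoded as data and checked in one pass. This is a sketch, not an official tool: the host strings are placeholders to be replaced with the real ESXi management IPs, Supervisor FIP, and vCenter address, each check must be run from the listed source host, and the UDP rows (53, 123) are omitted because UDP reachability cannot be confirmed by a connection attempt.

```python
import socket

# Hypothetical placeholders for the hosts in the table above; substitute the
# real addresses for your environment. Format: (source, destination, port).
REQUIRED_TCP_FLOWS = [
    ("ESXi mgmt",  "<Supervisor-FIP>", 6443),   # ESXi -> Supervisor API server
    ("Supervisor", "<ESXi-mgmt-IP>",   10250),  # Supervisor -> ESXi Spherelet
    ("vCenter",    "<ESXi-mgmt-IP>",   443),
    ("vCenter",    "<ESXi-mgmt-IP>",   902),
    ("vCenter",    "<ESXi-mgmt-IP>",   9080),
]

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def report(flows):
    """Check each (source, destination, port) flow; run from the source host."""
    return [(src, dst, port, tcp_reachable(dst, port)) for src, dst, port in flows]
```

A flow that reports False points at the routing or firewall problem described in the resolution steps above.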