ESXi host reports NotReady status in Supervisor Cluster due to TCP 10250 port blockage

Article ID: 436277

Updated On:

Products

VMware vSphere Kubernetes Service VMware Cloud Foundation

Issue/Introduction

In a VMware Cloud Foundation (VCF) or vSphere with Tanzu environment, one or more ESXi hosts transition to a NotReady state. This state prevents the Supervisor Cluster from scheduling workloads, including vSphere Pods and Tanzu Kubernetes Grid (TKG) clusters, on the affected transport nodes.

  • Running kubectl get nodes from the Supervisor Control Plane shows the affected hosts with a NotReady status.
  • The spherelet service on the affected ESXi host is running, as confirmed by /etc/init.d/spherelet status.
  • Connection attempts to the Spherelet API on port 10250 fail or time out.

Validation: Identify the failure via the Supervisor's kubectl command:

kubectl get node
NAME                 STATUS     ROLES   AGE    VERSION
:
:
esxi01.mylab.local   NotReady   agent   162d   v1.30.5-sph-806add6   ###<<<=== 'NotReady'
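
When the cluster has many hosts, the affected ones can be picked out of this listing automatically. The following is a minimal sketch, assuming awk is available on the Supervisor Control Plane VM; the function name not_ready_nodes is hypothetical and not part of any VMware tooling.

```shell
# Print the NAME column of every node whose STATUS column contains "NotReady".
# Reads "kubectl get nodes" output on stdin; works with or without the header row.
not_ready_nodes() {
  awk '$2 ~ /NotReady/ { print $1 }'
}

# Usage (run on a Supervisor Control Plane VM):
#   kubectl get nodes | not_ready_nodes
```

Matching on the STATUS column rather than on the whole line avoids false hits when a host name itself happens to contain the string.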

 

Environment

  • VMware Cloud Foundation (VCF) 4.x, 5.x
  • VMware vSphere Kubernetes Service

Cause

The Supervisor Control Plane VMs cannot communicate with the Spherelet (the node agent) on the ESXi host over TCP port 10250.
This port must be open bidirectionally across both the Management (eth0) and Workload (eth1) interfaces of the Supervisor Control Plane VMs.
When a physical firewall or an NSX Distributed Firewall (DFW) rule drops traffic on this port, heartbeat synchronization and node health updates fail, and the host is marked NotReady.

Resolution

To resolve this issue, ensure TCP port 10250 is permitted throughout the network path between the Supervisor Control Plane VMs and the ESXi hosts.

1. Firewall Configuration: Verify and update physical firewalls and the NSX Distributed Firewall (DFW) to allow bidirectional traffic on TCP port 10250 between the Supervisor Control Plane VM network and the ESXi Management/Workload networks.

2. Verify Connectivity from Supervisor Control Plane: SSH into a Supervisor Control Plane VM and test connectivity to the affected ESXi host on both interfaces:

   curl -v telnet://<REDACTED_ESXI_IP>:10250 --interface eth0
   curl -v telnet://<REDACTED_ESXI_IP>:10250 --interface eth1

3. Verify Connectivity from ESXi Host: SSH into the affected ESXi host and verify that it can reach the Supervisor API Server (port 6443) and Kubelet (port 10250):

   # Check access to Supervisor Cluster Floating IP (FIP)
   nc -z <REDACTED_FIP> 6443
   nc -z <REDACTED_FIP> 10250

   # Check access to individual Supervisor Control Plane VMs
   nc -z <REDACTED_SUP_VM_IP> 6443
   nc -z <REDACTED_SUP_VM_IP> 10250

4. Restart Management Agents (Optional): If connectivity is confirmed but the node status remains NotReady, restart the spherelet service on the host:

   /etc/init.d/spherelet restart
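
The nc checks in step 3 can be repeated across the Floating IP and every Supervisor Control Plane VM with a small loop. The following is a minimal sketch, assuming the nc build on the ESXi host returns a zero exit status when the connection succeeds; the function names probe and check_targets are hypothetical, and the addresses passed in are placeholders to be replaced with the real FIP and VM IPs.

```shell
# probe HOST PORT: succeeds (exit 0) when a TCP connection can be opened.
# Uses the same "nc -z" check as step 3.
probe() {
  nc -z "$1" "$2" >/dev/null 2>&1
}

# check_targets HOST...: test ports 6443 (API server) and 10250 (kubelet)
# on each given address and print one result line per host/port pair.
check_targets() {
  for target in "$@"; do
    for port in 6443 10250; do
      if probe "$target" "$port"; then
        echo "$target:$port reachable"
      else
        echo "$target:$port NOT reachable"
      fi
    done
  done
}

# Usage (run on the affected ESXi host; addresses are placeholders):
#   check_targets <REDACTED_FIP> <REDACTED_SUP_VM_IP>
```

Any line reporting NOT reachable points at the host and port pair whose path should be reviewed in the firewall configuration from step 1.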

 

Additional Information