VKS Guest Cluster Deployment Fails During TLS Handshake on VPC Networks

Article ID: 434469

Products

VMware NSX

Issue/Introduction

VKS guest cluster creation fails on a VPC network. While the Control Plane (CP) node is successfully provisioned and assigned an IP address on the private VPC subnet, no containers run. Diagnostic logs and manual tests reveal the following:

  • cloud-init-output.log on the CP node displays an SSL connection timeout.
  • nc (netcat) confirms TCP port 6443 is open and reachable on the Supervisor VIP.
  • openssl s_client -showcerts connects to the VIP but fails to retrieve the certificate chain.
  • curl requests to the VIP hang during the TLS handshake.
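
The manual checks above can be scripted roughly as follows. This is a minimal sketch, assuming a Linux shell with nc, openssl, curl, and the coreutils timeout command available. The function names (has_cert_chain, probe_vip) and the timeout values are illustrative choices, not part of any VKS or NSX tooling.

```shell
#!/bin/sh
# Hypothetical helpers; names and timeout values are assumptions.

# Succeeds if stdin contains at least one PEM certificate, meaning the
# server's certificate data actually arrived.
has_cert_chain() {
    grep -q -- '-----BEGIN CERTIFICATE-----'
}

# Runs the three manual checks against a Supervisor VIP.
# $1 = VIP address, $2 = port (defaults to 6443).
probe_vip() {
    vip=$1
    port=${2:-6443}

    # 1. TCP reachability. The TCP handshake uses small packets, so this
    #    can pass even when larger packets are being dropped.
    nc -z -w 5 "$vip" "$port" && echo "tcp: open" || echo "tcp: unreachable"

    # 2. TLS handshake. The certificate chain typically needs full-size
    #    packets, which is where an MTU problem shows up.
    if timeout 10 openssl s_client -showcerts -connect "$vip:$port" \
            </dev/null 2>/dev/null | has_cert_chain; then
        echo "tls: certificate chain received"
    else
        echo "tls: no certificate chain (handshake stalled or failed)"
    fi

    # 3. HTTPS request. --max-time keeps a stalled handshake from hanging.
    curl -sk -o /dev/null --max-time 10 "https://$vip:$port/" \
        && echo "https: response received" || echo "https: failed or timed out"
}
```

Calling probe_vip <Supervisor-VIP> from the CP node would be expected to report an open TCP port but no certificate chain when the symptoms described here are present.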

Environment

VMware NSX

VMware vSphere Kubernetes

Cause

The failure is caused by an MTU mismatch on the physical network infrastructure, which silently drops oversized packets (MTU blackholing). Kubernetes overlay networks and NSX Geneve encapsulation add headers to every packet, so the physical network must support an MTU of at least 1600 (typically 9000, that is, Jumbo Frames) to transmit large payloads, such as TLS certificate chains, without fragmentation. If the physical switches (e.g., Aruba) are not configured for the larger MTU, these larger packets are dropped while small packets, such as those of the TCP handshake, still pass. This prevents the VKS CP node from completing the SSL handshake with the Supervisor cluster.
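
The size arithmetic behind this can be sketched as below. The function names are hypothetical, and the header sizes are the IPv4 minimums; Geneve options vary in length, so the real overhead is usually higher than this lower bound.

```shell
#!/bin/sh
# Illustrative arithmetic only; function names are assumptions.

# Size of the outer IPv4 packet that carries an inner IP packet.
# $1 = inner IP packet size in bytes
# $2 = Geneve option length in bytes (variable; 0 is the minimum)
geneve_outer_size() {
    inner_ip=$1
    geneve_opts=${2:-0}
    inner_eth=14   # inner Ethernet header
    geneve=8       # Geneve base header
    udp=8          # outer UDP header
    outer_ip=20    # outer IPv4 header without options
    echo $(( inner_ip + inner_eth + geneve + geneve_opts + udp + outer_ip ))
}

# Succeeds if the encapsulated packet fits the underlay MTU.
# $1 = inner IP packet size, $2 = underlay MTU, $3 = Geneve option length
fits_underlay() {
    [ "$(geneve_outer_size "$1" "${3:-0}")" -le "$2" ]
}
```

Under these assumptions, geneve_outer_size 1500 prints 1550, so a full-size inner packet already exceeds a 1500-byte underlay MTU, while fits_underlay 1500 1600 succeeds.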

Resolution

To resolve this issue, ensure Jumbo Frames are configured end-to-end:

  1. Configure Physical Switches: Apply a minimum MTU of 1600 (typically 9000) on all physical switch ports and ESXi host uplinks carrying overlay and management traffic.
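  2. Verify MTU Path: Use vmkping with the "do not fragment" flag (-d) to confirm that payloads larger than the 1472 bytes a standard 1500-byte MTU allows (for example, 1572 bytes for an MTU of 1600, or 8972 bytes for an MTU of 9000) can pass between ESXi hosts.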
  3. Validate TLS Connectivity: From the workload network, run openssl s_client -showcerts -connect <Supervisor-VIP>:6443 and confirm that the certificate chain is successfully retrieved.
  4. Redeploy: Retry the VKS Guest Cluster deployment.
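
For the MTU path check in step 2, the payload size passed to vmkping is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). The sketch below only prints the command to run in the ESXi shell. The helper names are hypothetical, the destination address is a placeholder, and the use of ++netstack=vxlan assumes the TEP VMkernel interface sits on that netstack, as is usual on NSX-prepared hosts.

```shell
#!/bin/sh
# Hypothetical helpers; they build the command but do not run it.

# Largest ICMP echo payload that fits the given MTU without fragmentation:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Prints the vmkping command for a given MTU and destination TEP address.
# -d sets "do not fragment"; -s sets the ICMP payload size.
# $1 = MTU to verify, $2 = destination TEP IP (placeholder in examples)
vmkping_cmd() {
    mtu=$1
    dest_tep=$2
    echo "vmkping ++netstack=vxlan -d -s $(icmp_payload_for_mtu "$mtu") $dest_tep"
}
```

For example, vmkping_cmd 1600 192.0.2.10 prints vmkping ++netstack=vxlan -d -s 1572 192.0.2.10, and vmkping_cmd 9000 192.0.2.10 uses a payload of 8972 bytes. If the printed command fails on the host while a smaller payload succeeds, a device on the path is likely still using a lower MTU.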

Additional Information

Troubleshooting MTU issues and fragmentation in NSX fabric