Error "Unable to Connect to Bootstrap API Server". Supervisor HA Enablement Fails in VCF 9.0.x with NSX-VPC
search cancel

Error "Unable to Connect to Bootstrap API Server". Supervisor HA Enablement Fails in VCF 9.0.x with NSX-VPC

book

Article ID: 436220

calendar_today

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • When enabling High Availability for the Supervisor control plane nodes in a VCF 9.0.1 environment utilizing NSX with VPC setup, the two newly deployed nodes may remain in an error state.

  • The Supervisor cluster status will show a "Configuration error" for the new nodes, indicating they are unable to connect to the Bootstrap API Server on the primary control plane node.

    Customized guest of Supervisor Control plane VM
    Configured Supervisor Control plane VM's Management Network
    Configured Supervisor Control plane VM as Kubernetes Control Plane Node
     • Configuration error (since 3/18/2026, 9:04:09 PM)
     • Unable to connect to Bootstrap API Server (https://<IP of the first CP node>:443/wcp/loginbanner) from control plane VM <DNS-ID>. Error: Get "https://<IP of the first CP node>:443/wcp/loginbanner": dial tcp <IP of the first CP node>:443: connect: connection refused.
    Configured Supervisor Control plane VM's Workload Network

  • The primary node (eth0) cannot receive ARP responses or pings from the secondary nodes without specifying the interface.

  • Secondary nodes are missing the workload network interface (eth1).

  • Per the /var/log/update-controller/sync.log, the update-controller on the primary node remains in a loop because the VIP certificate is empty or not generated.

Environment

VMware Cloud Foundation 9.0.x

VMware NSX with VPC

Cause

The issue is caused by a network configuration conflict where the VPC External IP Block CIDR overlaps with the Supervisor Management Subnet. Because of this overlap:

  1. The primary CPVM creates routing rules (ip rule) that attempt to route traffic for the overlapping slice through the VPC workload network (eth1) instead of the management network (eth0).
  2. When secondary nodes attempt to communicate via eth0, the primary node identifies these as "martian" packets and drops them due to strict Reverse Path Filtering (rp_filter = 1).
  3. Because initial bootstrap communication fails, the Supervisor service (wcpsvc) never proceeds to add the eth1 interface to the new nodes.

Resolution

To resolve this issue, ensure there is no overlap between the Supervisor Management Subnet and the VPC External IP CIDRs.

  1. Identify the Overlap: Check the IPBlocksInfo on the cluster to confirm the External IP CIDR range. Use the command below to confirm the same.

    kubectl get ipblocksinfo.crd.nsx.vmware.com -n [namespace] ip-blocks-info -o yaml

  2. Verify Routing Rules: On the primary Supervisor VM, check for conflicting IP rules. Use the command below to confirm the same.

    ip rule show and ip route show table 200

  3. Correct the configuration: Reconfigure the VPC External IP Block to a CIDR range that does not overlap with the Supervisor Management network or any other management infrastructure.

  4. Re-enable HA: After resolving the IP overlap, you may need to restart the HA enablement process or reboot the primary Supervisor VM to clear the ARP cache and martian packet drops.