NCP pod going into CrashLoopBackOff state

Article ID: 432998

Updated On:

Products

VMware NSX

Issue/Introduction

  • The NCP pod frequently enters the CrashLoopBackOff state.
  • NCP logs show connection issues to NSX Manager and vCenter:

    kubectl logs <ncp-pod-name> -n nsx-system
    [Timestamp] stderr F [ncp GreenThread-59 I] vmware_nsxlib.v3.cluster Endpoint 'https://<NSX_Manager>:443' changing from state 'UP' to 'DOWN'
    [Timestamp] stdout F 2026-02-26 10:28:15.682 ERROR      jwt/vcclient.go:219        Failed to create VIM client     {"vimSdkURL": "https://<vCenter>:443/sdk", "error": "Post \"https://<vCenter>:443/sdk\": dial tcp <vCenter>:443: i/o timeout"}

  • A ping from the Supervisor VM where the NCP pod is running to NSX Manager or vCenter fails intermittently.
  • Packet captures show that traffic reaches NSX Manager and vCenter and that the reply is sent back through the T0.
  • The T0 shows multiple routes to the Supervisor subnet:

    edge(tier0_sr[X])> get route

    Total number of routes: X
    > * 172.16.0.0/24 [3/0] via 100.64.0.1, linked-x
    > * 172.16.0.0/24 [3/0] via 100.64.0.2, linked-x

  • Multiple VPCs exist in the environment with subnets that overlap the Supervisor VM subnet.
  • These VPCs are connected to the same T0.
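
The duplicate-route symptom above can be checked in a script once the `get route` output has been saved as text. The following is a minimal sketch, not an NSX tool: the function name `find_multipath_prefixes` and the regular expression are illustrative assumptions based only on the line format shown in this article, and may need adjusting for other output formats.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed line format, taken from the sample in this article:
#   "> * 172.16.0.0/24 [3/0] via 100.64.0.1, linked-x"
ROUTE_RE = re.compile(
    r"(\d+\.\d+\.\d+\.\d+/\d+)\s+\[\d+/\d+\]\s+via\s+(\d+\.\d+\.\d+\.\d+)"
)

def find_multipath_prefixes(route_output: str) -> Dict[str, List[str]]:
    """Return prefixes that appear with more than one distinct next hop."""
    next_hops = defaultdict(list)
    for line in route_output.splitlines():
        match = ROUTE_RE.search(line)
        if not match:
            continue  # skip headers, blank lines, and unrecognized formats
        prefix, hop = match.groups()
        if hop not in next_hops[prefix]:
            next_hops[prefix].append(hop)
    return {p: hops for p, hops in next_hops.items() if len(hops) > 1}

# Sample copied from the route table excerpt in this article.
sample = """
Total number of routes: X
> * 172.16.0.0/24 [3/0] via 100.64.0.1, linked-x
> * 172.16.0.0/24 [3/0] via 100.64.0.2, linked-x
"""

print(find_multipath_prefixes(sample))
```

A prefix with several next hops is not a fault in itself, since ECMP is often intentional. It matters here only when the prefix is the Supervisor subnet and the next hops lead to different VPCs.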

Environment

VMware NSX 9

Cause

The IPAM workflow allocates non-overlapping subnets within each block, but it treats each block independently. If two blocks cover the same address range, the allocations from those blocks can produce overlapping subnets across VPCs. When public subnets are created, the connected route is advertised from each TGW's VRF to the T0 via inter-VRF static routes. The T0 then sees two equal-cost paths for the same prefix and applies ECMP. Because the Supervisor is connected to only one VPC, traffic sent over the route to the other VPC is dropped.
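
The effect of two blocks covering the same range can be illustrated with Python's standard `ipaddress` module. This is a toy model and not the NSX IPAM implementation: the `allocate` helper, the `172.16.0.0/16` block range, and the /24 subnet size are assumptions chosen to match the route example in this article.

```python
import ipaddress
from typing import List

def allocate(block: str, new_prefix: int, count: int) -> List[ipaddress.IPv4Network]:
    """Toy allocator: hand out the first `count` subnets of a block in order."""
    subnets = ipaddress.ip_network(block).subnets(new_prefix=new_prefix)
    return [next(subnets) for _ in range(count)]

# Hypothetical setup: each VPC draws from its own block,
# but both blocks were defined with the same address range.
vpc_a = allocate("172.16.0.0/16", 24, 2)
vpc_b = allocate("172.16.0.0/16", 24, 2)

# Within a single block the allocations never overlap.
assert not vpc_a[0].overlaps(vpc_a[1])

# Across the two blocks every allocation collides with its counterpart.
overlaps = [(a, b) for a in vpc_a for b in vpc_b if a.overlaps(b)]
for a, b in overlaps:
    print(f"VPC A {a} overlaps VPC B {b}")
```

Each allocator is correct in isolation, which is why the conflict appears only at the T0, where both prefixes are advertised.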

Resolution

Remove the overlapping segment/subnet from the VPC to which the Supervisor is not connected.