NSX NCP Pod Fails with Certificate Expired Error After vCenter Certificate Replacement
search cancel

NSX NCP Pod Fails with Certificate Expired Error After vCenter Certificate Replacement

book

Article ID: 441710

calendar_today

Updated On:

Products

VMware vCenter Server

Issue/Introduction

In a VMware Cloud Foundation (VCF) or vSphere Foundation environment with vSphere Kubernetes Service (VKS) / Supervisor Cluster enabled, the NSX Container Plug-in (NCP) pod encounters continuous validation failures or crash loops after a vCenter Server Appliance intermediate CA certificate replacement.

nsx-ncp pod logs:
[ncp MainThread I] nsx_ujo.ncp.vc.session Refreshing token and re-instantiating TESSession
[ncp MainThread I] nsx_ujo.ncp.vc.session Retrieving VC Credentials for the first time
[ncp MainThread W] nsx_ujo.ncp.vc.session Failed to get JWT token: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007),

Environment

VMware Cloud Foundation
vSphere Kubernetes Service (VKS) 
NSX Container Plug-in (NCP)

Cause

The Workload Control Plane daemon (wcpsvc) does not automatically push updated root or intermediate certificate trust anchors to the Supervisor Control Plane VMs after an administrative certificate rotation, leaving active system containers executing with a stale in-memory Python SSL trust cache.

Resolution

To resolve the certificate mismatch and push the updated intermediate CA down to the Supervisor cluster components, perform the following steps:

1. restart the wcp service:
- ssh to vCenter
- vmon-cli --restart wcp

2. Recycle the Cluster Cert-Manager Infrastructure
- ssh to the supervisor
- kubectl rollout restart deployment cert-manager-cainjector -n vmware-system-cert-manager
- kubectl rollout restart deployment cert-manager -n vmware-system-cert-manager
validate that these start
kubectl get pods -n vmware-system-cert-manager

3. recycle the ncp deployment:
- ssh to the supervisor
- kubectl scale deployment nsx-ncp --replicas=0 -n vmware-system-nsx
- kubectl scale deployment nsx-ncp --replicas=1 -n vmware-system-nsx
-Check the initialization log trace to confirm successful JWT token extraction
kubectl logs -l app=nsx-ncp -n vmware-system-nsx -c nsx-ncp --tail=50