1. The "NCP down" alarm in the NSX alarm dashboard indicates that the NSX Manager is unable to communicate with the Network Container Plugin.
"summary": "Manager Node has detected the NCP is down or unhealthy.",
"description": "Manager Node has detected the NCP is down or unhealthy.",
"recommended_action": "To find the clusters which are having issues, please use the NSX UI and navigate to the Alarms page. The Entity name value for this alarm instance identifies the cluster name. Or invoke the NSX API GET /api/v1/systemhealth/container-cluster/ncp/status to fetch all cluster statuses and determine the name of any clusters that report DOWN or UNKNOWN. Then on the NSX UI Inventory | Container | Clusters page find the cluster by name and click the Nodes tab which lists all Kubernetes and PAS cluster members. For Kubernetes cluster: 1. Check NCP Pod liveness by finding the K8s master node from all the cluster members and log onto the master node. Then invoke the kubectl command `kubectl get pods --all-namespaces`. If there is an issue with the NCP Pod, please use kubectl logs command to check the issue and fix the error. 2. Check the connection between NCP and Kubernetes API server. The NSX CLI can be used inside the NCP Pod to check this connection status by invoking the following commands from the master VM. `kubectl exec -it <NCP-Pod-Name> -n nsx-system bash` `nsxcli` `get ncp-k8s-api-server status` If there is an issue with the connection, please check both the network and NCP configurations. 3. Check the connection between NCP and NSX Manager. The NSX CLI can be used inside the NCP Pod to check this connection status by invoking the following command from the master VM. `kubectl exec -it <NCP-Pod-Name> -n nsx-system bash` `nsxcli` `get ncp-nsx status` If there is an issue with the connection, please check both the network and NCP configurations. For PAS cluster: 1. Check the network connections between virtual machines and fix any network issues. 2. Check the status of both nodes and services and fix crashed nodes or services. Invoke the command `bosh vms` and `bosh instances -p` to check the status of nodes and services.",
2. NCP Pods are found to be in a CrashLoopBackOff state:
kubectl get pods --all-namespaces
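To narrow the listing to the NCP pod, the output can be piped through a filter, for example (grep pattern assumed):
kubectl get pods --all-namespaces | grep -i ncp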
3. Container logs report the error message below, which points to a clock skew problem:
kubectl logs <ncp-pod-name> -n nsx-system
[ncp GreenThread-1 I] nsx_ujo.ncp.election Initialized election profile election-lock-domain-c31362-########-4f6e-4b36-####-############
[ncp GreenThread-1 I] nsx_ujo.ncp.k8s.kubernetes HTTP session did not have a 'Content-type' header
[ncp GreenThread-1 I] nsx_ujo.ncp.k8s.kubernetes HTTP session did not have a 'Content-type' header
[ncp MainThread W] nsx_ujo.ncp.vc.session Failed to get JWT token: Failed SAML HoK request: Failed to get or renew SAML HoK from STS: SoapException:
faultcode: ns0:InvalidTimeRange
faultstring: The token authority rejected an issue request for TimePeriod [startTime=Sat May 31 05:38:09 GMT 2025, endTime=Sat May 31 05:48:09 GMT 2025] :: The requested token start time differs from the issue instant more than the acceptable deviation (clock tolerance) of 600000 ms. Requested token start time=Sat May 31 05:38:09 GMT 2025, issue instant time=Sat May 31 06:27:35 GMT 2025. This might be due to a clock skew problem.
faultxml: <?xml version='1.0' encoding='UTF-8'?><S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"><S:Body><S:Fault xmlns:ns4="http://www.w3.org/2003/05/soap-envelope"><faultcode xmlns:ns0="http://docs.oasis-open.org/ws-sx/ws-trust/200512">ns0:InvalidTimeRange</faultcode><faultstring>The token authority rejected an issue request for TimePeriod [startTime=Sat May 31 05:38:09 GMT 2025, endTime=Sat May 31 05:48:09 GMT 2025] :: The requested token start time differs from the issue instant more than the acceptable deviation (clock tolerance) of 600000 ms. Requested token start time=Sat May 31 05:38:09 GMT 2025, issue instant time=Sat May 31 06:27:35 GMT 2025. This might be due to a clock skew problem.</faultstring></S:Fault></S:Body></S:Envelope>., will retry after 120 seconds
[ncp GreenThread-1 I] nsx_ujo.ncp.k8s.kubernetes HTTP session did not have a 'Content-type' header
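In the excerpt above, the requested token start time (05:38:09) lags the STS issue instant (06:27:35) by roughly 49 minutes, far beyond the 600000 ms (10 minute) clock tolerance, which confirms the skew. Because the pod is crash-looping, it can also help to pull the logs of the previous container instance:
kubectl logs <ncp-pod-name> -n nsx-system --previous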
VMware vSphere with Tanzu
VMware NSX
The Supervisor cluster clock was running behind the actual time.
An NTP server issue created the clock skew.
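A quick way to confirm the skew before the formal validation below is to compare the current UTC time on each component against a known-good reference; the hostnames here are placeholders:
ssh root@<nsx-manager> date -u
ssh root@<vcenter-server> date -u
ssh root@<supervisor-node> date -u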
Validate NTP synchronization across the NSX Manager, vCenter Server, and Supervisor cluster.
The following commands can be used to validate it:
timedatectl show
timedatectl status
timedatectl timesync-status
timedatectl show-timesync
kubectl exec -it <ncp-pod-name> -n nsx-system -c nsx-ncp -- nsxcli -c get ncp-nsx status
kubectl exec -it <ncp-pod-name> -n nsx-system -c nsx-ncp -- nsxcli -c get ncp-k8s-api-server status
Reference output:
root@4####6e2##################dc75 [ ~ ]# k exec -it nsx-ncp-pod -n vmware-system-nsx -c nsx-ncp -- nsxcli -c get ncp-nsx status
Mon Jun 02 2025 UTC 08:17:11.661
NSX Manager status:
10.##.##.##:443: Healthy
10.##.##.##:443: Healthy
10.##.##.##:443: Healthy
10.##.##.##:443: Healthy
root@4####6e2##################dc75 [ ~ ]# k exec -it nsx-ncp-pod -n vmware-system-nsx -c nsx-ncp -- nsxcli -c get ncp-k8s-api-server status
Mon Jun 02 2025 UTC 08:17:42.030
Kubernetes ApiServer status: Healthy
root@4####6e2##################dc75 [ ~ ]#
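If these checks show that a node's clock has drifted, re-enabling NTP and restarting the time synchronization service on that node should correct it. A minimal sketch, assuming systemd-timesyncd is the time service in use (as on Photon OS based nodes):
timedatectl set-ntp true
systemctl restart systemd-timesyncd
timedatectl timesync-status
Once the clocks converge, the NCP pod should stop crash-looping; rerun the `get ncp-nsx status` and `get ncp-k8s-api-server status` checks above to confirm both report Healthy, as in the reference output.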