vSphere Pods fail to get created showing 'FailedRealizeNSXResource'

Article ID: 318702

Products

VMware vSphere ESXi
VMware vSphere Kubernetes Service

Issue/Introduction

 

  • vSphere Pods fail to be created, and the pod events show 'FailedRealizeNSXResource'.
  • The corresponding objects (for example, Tier-1 (T1) gateways and segments) appear to be created correctly on the NSX side.
  • Embedded Harbor may fail to complete enablement, with its pods stuck in Pending and a message similar to the following in /var/log/vmware/wcp/wcp-audit.log on the vCenter Server:
2021-06-10T08:40:48.741Z debug wcp [log/logger.go:70] [opID=vapi] Sending response with output {"output":[{"STRUCTURE":{"com.vmware.vcenter.namespaces.events.events.event":{"component":"kubelet","count":17,"kind":"Pod","last_time_stamp":1###314207,"message":"cfgAgent returned CONFIG_INEXISTENCE","name":"harbor-1###5678-harbor-redis-0","reason":"NetworkNotReady","type":"Warning"}}}
  • The vmware-system-nsx_nsx-ncp pod logs show the following error for one or more of the ESXi hosts in the Supervisor vSphere cluster (in the example, domain-c## is the managed object ID of the vSphere cluster):
2021-06-10T15:14:29.107200827Z stderr F [ncp GreenThread-35 E errorCode="NCP00010"] nsx_ujo.ncp.nsx.policy.node_service Failed to get segment port id or tn id for node <esxi-fqdn> in cluster domain-c30:00###456-1##4-5##8-a##d-1###be4###67


NOTE: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
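
To see which ESXi hosts NCP is complaining about, the NCP00010 lines can be filtered out of a saved copy of the NCP pod log. The following is a minimal sketch, not an official tool: the function name list_ncp_failed_nodes is hypothetical, and the pattern assumes the error text is worded exactly as in the excerpt above.

```shell
# Hypothetical helper: reads NCP log text on stdin and prints, once each,
# every node name found in a "Failed to get segment port id or tn id" error.
list_ncp_failed_nodes() {
    sed -n 's/.*Failed to get segment port id or tn id for node \([^ ]*\) in cluster .*/\1/p' | sort -u
}
```

For example, list_ncp_failed_nodes < ncp.log (where ncp.log is an assumed file name for the saved log) prints one hostname per line, spelled exactly as NCP records it, which makes a case or FQDN difference against vCenter easy to spot.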

Environment

VMware vSphere 8.0 with Tanzu
VMware vSphere 7.0.x
VMware vSphere 7.0 with Tanzu

Cause

  • This issue can occur if the ESXi hostname is specified in uppercase while NCP uses lowercase.
  • This issue can also occur if the ESXi hostname differs between NCP and vCenter because one uses the short name and the other uses the FQDN.
  • To check the hostname configured on the host, log in to the ESXi host DCUI.
  • This issue can also occur if the ESXi hostname changes during the "add host" operation.
  • If no case or FQDN mismatch is identified, the failures may have other causes that need to be investigated further.

Resolution

Modify the ESXi hostname to use lowercase instead of uppercase characters so that it matches the entry in the NCP logs, or modify the ESXi hostname so that it matches the FQDN.

This issue is resolved in vSphere ESXi 7.0 U2c (build number 18426014).

Workaround:

1. To view ESXi hostname in vCenter, navigate to the Hosts & Clusters view in Inventory -> Select the ESXi host -> select the Configure tab -> under Networking, select TCP/IP Configuration -> expand the Default TCP/IP Stack -> verify the Hostname and Domain fields.

2. To view the node name in NSX Manager (NCP), from the NSX Manager GUI navigate to Inventory -> Containers -> Clusters -> under the Nodes column, select the hyperlinked number of nodes to view the node names.
 
  • Alternatively, view the NCP node name by logging into the Supervisor cluster context and running kubectl get nodes

3. Compare the ESXi hostname in vCenter with the node names in NSX for case sensitivity as well as FQDN vs. shortname mismatch.
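
The comparison in step 3 can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name classify_hostname_mismatch and its output labels are hypothetical, and both names must be copied in by hand from vCenter and from the NSX or kubectl node list.

```shell
# Hypothetical helper: classify_hostname_mismatch VCENTER_NAME NODE_NAME
# Prints one of: match, case, shortname-vs-fqdn, domain, different.
classify_hostname_mismatch() {
    vc_name=$1
    node_name=$2
    # Lowercase copies, used for the case-insensitive comparisons below.
    vc_lower=$(printf '%s' "$vc_name" | tr '[:upper:]' '[:lower:]')
    node_lower=$(printf '%s' "$node_name" | tr '[:upper:]' '[:lower:]')

    if [ "$vc_name" = "$node_name" ]; then
        echo "match"                 # identical, nothing to fix
    elif [ "$vc_lower" = "$node_lower" ]; then
        echo "case"                  # same name, different letter case
    elif [ "${vc_lower%%.*}" != "${node_lower%%.*}" ]; then
        echo "different"             # first labels differ: not the same host
    elif [ "${vc_lower%%.*}" = "$vc_lower" ] || [ "${node_lower%%.*}" = "$node_lower" ]; then
        echo "shortname-vs-fqdn"     # one side has no domain part
    else
        echo "domain"                # same short name, different domains
    fi
}
```

For instance, classify_hostname_mismatch ESX01.example.com esx01.example.com prints case. The short name and domain checks ignore letter case, so a host that differs in both case and domain form is reported under the domain-related label only.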


If the above steps identify a mismatch, use the following procedure to correct it:

1. Open port 10250 bidirectionally between the Supervisor nodes and the ESXi hosts.

2. Confirm that the port is open by running curl from an SSH session on a Supervisor control plane node:
 
  • # curl -v https://<ESXI_HOSTNAME>:10250

3. Place one host in Maintenance Mode.

4. Once the host has entered Maintenance Mode, delete the node from the Supervisor Cluster context using kubectl delete node <ESXI_NODE_NAME>

5. In the vCenter web client, select the ESXi host -> Configure -> Networking -> TCP/IP Configuration -> Default stack -> Edit, then change the hostname to lowercase or change the domain so that the name matches the FQDN.

6. Exit ESXi Maintenance Mode. When the host exits Maintenance Mode, it will automatically be added back to the Supervisor Cluster as a worker node.


Repeat this procedure on all ESXi hosts, one host at a time.
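
Because the procedure is repeated for each host, the port test from step 2 can be run against every host before starting. The following is a rough sketch to run from the Supervisor control plane node SSH session. The function name check_kubelet_port is hypothetical, and the -k option (skip certificate verification) and the 5-second timeout are assumptions added here, not part of the original command.

```shell
# Hypothetical helper: check_kubelet_port ESXI_HOSTNAME
# Reports whether an HTTPS connection to port 10250 on the host succeeds.
check_kubelet_port() {
    host=$1
    # -k: do not verify the certificate (assumption; only connectivity is tested)
    # -s -o /dev/null: suppress progress output and discard the response body
    if curl -k -s -o /dev/null --connect-timeout 5 "https://${host}:10250"; then
        echo "${host}: reachable"
    else
        echo "${host}: NOT reachable"
        return 1
    fi
}

# Example loop (host names are placeholders):
# for h in <ESXI_HOSTNAME_1> <ESXI_HOSTNAME_2>; do check_kubelet_port "$h"; done
```

curl exits successfully for any HTTP response, including an authorization error, so "reachable" here means only that the port accepted the connection.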

Additional Information

Impact/Risks:
Modifying the ESXi hostname may result in a brief network outage while the management network is being restarted.