Running the command 'net-vdl2 -l -s <vDS_NSX_Name>' on an ESXi host displays segment_ID as 0.0.0.0 and no VMKnic count (example in the screenshot below):
In /var/run/log/vmkernel.log on an impacted ESXi host, you may see log messages similar to the line below. This applies when only the ESXi host is impacted and the vCenter database is healthy, as explained in the Workaround section.
2021-08-18T16:44:41.926Z cpu2:2098628 opID=756ad23e)WARNING: vxlan: VDL2PortPropSet:820: Failed to create VXLAN IP vmknic on port[0x3000004] of VDS[DvsPortset-0] : Operation already in progress
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
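As a minimal sketch, the check for this warning can be scripted rather than read by eye. The function name count_vxlan_vmknic_failures is hypothetical, and the search string is taken verbatim from the sample log line above. The log path is passed in as an argument rather than assumed.

```shell
# Hypothetical helper: count VXLAN vmknic creation failures in a log file.
# $1: path to the vmkernel log to inspect (supplied by the caller).
count_vxlan_vmknic_failures() {
    log_file="$1"
    # -F matches the text as a fixed string; -c prints the number of matching lines.
    grep -F -c 'Failed to create VXLAN IP vmknic' "$log_file"
}
```

A result greater than zero only indicates that the warning was logged. It does not by itself distinguish between the two scenarios described in the Workaround section.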
Environment
VMware NSX for Data Center for vSphere 6.4.8 or 6.4.10. VMware vCenter Server 7.0.x. VMware vSphere ESXi 7.0.x.
Cause
The issue is introduced when resetting the VXLAN data fails. This is typically induced when an EAM VXLAN host scan is triggered, which can happen when a host is rebooted or is placed into and taken out of maintenance mode. In NSX-v 6.4.8 and 6.4.10, NSX Manager then attempts to publish the VXLAN information, but the publish event contains no DVS VXLAN information associated with the host, and the issue is introduced.
Resolution
This issue is resolved in VMware NSX-v 6.4.11 and later, available at Broadcom downloads.
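The version guidance in this article can be sketched as a small classifier. The function name nsxv_status and its output labels are hypothetical. It assumes a three-part version string such as 6.4.10 with no build number appended, and it reports any version this article does not mention as "not-listed" rather than guessing.

```shell
# Hypothetical helper: classify an NSX-v version string against this article.
# Prints "affected" (6.4.8, 6.4.10), "fixed" (6.4.11 and later 6.4.x),
# or "not-listed" for anything the article does not cover.
nsxv_status() {
    case "$1" in
        6.4.8|6.4.10)
            echo "affected" ;;
        6.4.*)
            patch="${1#6.4.}"
            case "$patch" in
                # Empty or non-numeric patch level: not a plain x.y.z version.
                ''|*[!0-9]*) echo "not-listed"; return ;;
            esac
            if [ "$patch" -ge 11 ]; then
                echo "fixed"
            else
                echo "not-listed"
            fi ;;
        *)
            echo "not-listed" ;;
    esac
}
```

For example, nsxv_status 6.4.10 prints affected and nsxv_status 6.4.11 prints fixed.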
Workaround
This problem can manifest in two ways:
Scenario 1: Only the ESXi host contains 'FF' values for VXLAN VMK adapters and the vCenter database is healthy.
Scenario 2: Both the ESXi host and vCenter database contain 'FF' values for VXLAN VMK adapters.
To validate which scenario you are impacted by, run both of the following checks:
Via SSH access to an ESXi host experiencing VXLAN connectivity issues, run the command:
net-dvs -l | grep 'vxlan.vmknic'
Against the vCenter Server database, run the query:
select opaque_data from vpx_dvs_opaque_data where opaque_data=decode('ff', 'hex');
Scenario 1 is present if no database rows are returned BUT the ESXi host shows 'FF' values.
To resolve Scenario 1, place the ESXi host showing 'FF' values in Maintenance Mode and then reboot it.
Once rebooted, the ESXi host is expected to show a healthy return from:
net-dvs -l | grep 'vxlan.vmknic'
Scenario 2 is present if database rows are returned AND the ESXi host shows 'FF' values.
To resolve Scenario 2, please contact VMware Support and reference this article (319148) in the problem description.
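The scenario decision can be summarized as a short sketch. The function name classify_scenario and its inputs are hypothetical: the caller is assumed to have already determined whether the host shows 'FF' values and how many rows the database query returned. Hosts that do not show 'FF' values fall outside both scenarios, so they are reported as "neither".

```shell
# Hypothetical helper: map the two validation results to a scenario.
# $1: "yes" if the ESXi host shows 'FF' values, otherwise "no"
# $2: number of rows returned by the vCenter database query
classify_scenario() {
    host_shows_ff="$1"
    db_row_count="$2"
    if [ "$host_shows_ff" != "yes" ]; then
        # The article only defines scenarios for hosts showing 'FF' values.
        echo "neither"
    elif [ "$db_row_count" -eq 0 ]; then
        echo "scenario-1"   # host only: Maintenance Mode, then reboot
    else
        echo "scenario-2"   # host and database: contact support
    fi
}
```

For example, classify_scenario yes 0 prints scenario-1, and classify_scenario yes 3 prints scenario-2.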
Additional Information
Three specific situations are relevant to this issue. Each is explained below:
An environment is currently running NSX-v 6.4.8 or 6.4.10 with ESXi 6.x, and an upgrade to 7.x is planned.
VMware recommends upgrading NSX-v to 6.4.14 before upgrading either vCenter or ESXi to 7.x, to avoid encountering this issue.
An environment is running ESXi/vCenter 7.x and NSX-v 6.4.8 or 6.4.10, and a plan is in place to upgrade ESXi to a newer minor release of 7.x.
VMware recommends checking the vCenter database for incorrect vmknic values, using the validation steps outlined above, before moving forward with the ESXi upgrade.
If Scenario 2 is encountered, please file a support request. If not, continue with the upgrade.
An environment is running ESXi/vCenter 7.x and NSX-v 6.4.8 or 6.4.10, and a plan is in place to upgrade NSX-v.
VMware recommends checking the vCenter database for incorrect vmknic values, using the validation steps outlined above, before moving forward with the NSX-v upgrade.
If Scenario 2 is encountered, please file a support request. If not, continue with the upgrade.
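The three situations above reduce to a small decision, sketched here for an environment assumed to be running NSX-v 6.4.8 or 6.4.10. The function name pre_upgrade_action, its arguments, and its output labels are hypothetical.

```shell
# Hypothetical helper: print the recommended next step before an upgrade,
# assuming the environment runs NSX-v 6.4.8 or 6.4.10.
# $1: current ESXi major version ("6" or "7")
# $2: "yes" if the validation steps identified Scenario 2, otherwise "no"
pre_upgrade_action() {
    esxi_major="$1"
    scenario_2="$2"
    if [ "$esxi_major" = "6" ]; then
        # Still on ESXi 6.x: upgrade NSX-v before moving vCenter or ESXi to 7.x.
        echo "upgrade-nsxv-first"
    elif [ "$scenario_2" = "yes" ]; then
        echo "file-support-request"
    else
        echo "continue-upgrade"
    fi
}
```

The same check covers both the ESXi minor-release upgrade and the NSX-v upgrade on 7.x, since the article gives identical guidance for the two.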