In an NSX-T environment cloned VMs are unable to connect to the network
Article ID: 317802
Products
VMware NSX
Issue/Introduction
Symptoms: In an NSX-T environment, the following are observed:
A newly deployed VM cannot connect to the network.
The VM was deployed through a clone operation, either manually or by an automated provisioning platform or tool such as vRealize Automation.
The following errors are seen in the ESXi logs:
hostd.log:
2019-01-21T17:41:21.912Z info hostd[11D81B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 118 : Error message on vm on esxi in ha-datacenter: Failed to connect virtual device 'Ethernet0'.
vmware.log:
2019-01-21T17:41:21.769Z| vmx| I125: VMXNET3 user: failed to connect Ethernet0 to network 0d95a802-1dc8-4434-####-###########.
2019-01-21T17:41:21.769Z| vmx| I125: [msg.device.badconnect] Failed to connect virtual device 'Ethernet0'.
vmkernel.log:
2019-01-21T17:41:21:52.209Z cpu20:683363 opID=55b9b200)NetPort: 3203: blocking traffic on DV port ad9ecc75-87a8-4997-####-###########
2019-01-21T17:41:21:41.326Z cpu218:696440)NetPort: 3203: blocking traffic on DV port ad9ecc75-87a8-4997-####-###########
The net-dvs output on the ESXi host shows the port as blocked.
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
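The log messages listed above can be searched for in collected log text. The following Python sketch is illustrative only: the function name find_symptoms and the two regular expressions are assumptions derived from the example excerpts in this article, and the exact message wording may differ between ESXi builds.

```python
import re

# Patterns derived from the example excerpts in this article (assumption:
# real messages may vary by ESXi version and environment).
SYMPTOM_PATTERNS = {
    "badconnect": re.compile(r"Failed to connect virtual device '(Ethernet\d+)'"),
    "port_blocked": re.compile(r"NetPort: \d+: blocking traffic on DV port (\S+)"),
}


def find_symptoms(log_text):
    """Return (symptom name, captured value) tuples, one per matching line and pattern."""
    hits = []
    for line in log_text.splitlines():
        for name, pattern in SYMPTOM_PATTERNS.items():
            match = pattern.search(line)
            if match:
                # group(1) is the vNIC label or the DV port ID, depending on the pattern
                hits.append((name, match.group(1)))
    return hits
```

A match only indicates that the symptom text is present. It does not by itself confirm that a duplicate VIF is the cause.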
Environment
VMware NSX-T Data Center
Cause
The vmx file, which is the configuration file of a VM, contains information about how that VM connects to the network. In an NSX-T environment you will see entries similar to this sample output:
“opaqueNetwork.id” is the UUID of the NSX logical switch.
“externalId” is the UUID of the VIF that the VM is using to connect to the logical switch.
Whenever a VM connects to an NSX logical switch, the VIF and switch ID are stored in the vmx file. Each VM vNIC should have a unique VIF. There is a known issue in vSphere that results in a VM’s VIF not being removed during vNIC edits or clone operations.
This results in the new VM using the same VIF in its vmx file as an existing VM.
When VMs with the same externalId/VIF connect to the logical switch, the switch will think it is the same VM and attempt to connect them to the same switchport. Only one vNIC can be connected to a switchport and therefore only one VM will be able to connect.
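To illustrate the condition described above, the following Python sketch looks for the same externalId value in more than one vmx file. It is a minimal sketch under stated assumptions: the vmx text is assumed to use lines of the form key = "value", the function names external_ids and find_duplicate_vifs are hypothetical, and the UUIDs in the usage example are made up.

```python
import re
from collections import defaultdict

# Assumed line format: ethernetN.externalId = "<VIF UUID>"
EXTERNAL_ID_RE = re.compile(
    r'^\s*(ethernet\d+)\.externalId\s*=\s*"([^"]*)"', re.IGNORECASE
)


def external_ids(vmx_text):
    """Return {vnic: externalId} for every ethernetN.externalId line in vmx_text."""
    ids = {}
    for line in vmx_text.splitlines():
        match = EXTERNAL_ID_RE.match(line)
        if match:
            ids[match.group(1).lower()] = match.group(2)
    return ids


def find_duplicate_vifs(vmx_by_vm):
    """Map each externalId used more than once to the (vm, vnic) pairs that use it."""
    users = defaultdict(list)
    for vm_name, vmx_text in vmx_by_vm.items():
        for vnic, vif in external_ids(vmx_text).items():
            users[vif].append((vm_name, vnic))
    return {vif: who for vif, who in users.items() if len(who) > 1}


if __name__ == "__main__":
    # Made-up vmx fragments: the clone carries the VIF of its source VM.
    source_vmx = 'ethernet0.externalId = "11111111-2222-3333-4444-555555555555"\n'
    clone_vmx = 'ethernet0.externalId = "11111111-2222-3333-4444-555555555555"\n'
    print(find_duplicate_vifs({"source-vm": source_vmx, "clone-vm": clone_vmx}))
```

Any externalId reported by find_duplicate_vifs identifies VMs that would compete for the same switchport.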
Resolution
The issue is resolved in:
ESXi 6.5, Patch ESXi650-201811002
ESXi 6.7, Patch ESXi670-201901001
Workaround: To work around the problem:
Before using any VM or VM template as a source for cloning:
Power off the VM.
Edit the vmx config file.
Delete the line “ethernetN.externalId”, where N is the vNIC number.
This prevents any cloned VMs from getting the same VIF.
For existing VMs that are having the problem:
Power off the VM.
Edit its vmx file and delete the line “ethernetN.externalId”, where N is the vNIC number.
Note: It is safe to remove all the lines that have “ethernetN.externalId” when a VM is powered off. When the VM is powered on again, a new unique VIF will be created for any vNIC that connects to an NSX logical switch. The new VIF will then be written back to the vmx file.
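The edit described in the workaround steps can be sketched in Python as a text filter that drops every ethernetN.externalId line. This is an illustrative sketch, not a VMware-provided tool: the function name strip_external_ids is hypothetical, the vmx text is assumed to hold one key = "value" setting per line, and the filter should only be applied to the vmx file of a powered-off VM, as the steps above require.

```python
import re

# Matches the start of any ethernetN.externalId setting (assumed line format).
EXTERNAL_ID_LINE = re.compile(r'^\s*ethernet\d+\.externalId\s*=', re.IGNORECASE)


def strip_external_ids(vmx_text):
    """Return vmx_text with every ethernetN.externalId line removed."""
    kept = [
        line
        for line in vmx_text.splitlines(keepends=True)  # keep original line endings
        if not EXTERNAL_ID_LINE.match(line)
    ]
    return "".join(kept)
```

The function works on text rather than on a file path so that the result can be reviewed before it is written back. Working on a copy of the vmx file first is a cautious choice.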
Additional Information
The same behavior occurs if third-party software that copies a vmx file during a clone or backup does not remove the pre-existing “ethernetN.externalId” information, which is unique to the original VM.