Northbound connectivity through Hardware VTEP lost on VMs after vMotion

Article ID: 327300

Updated On:

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • Virtual Machines lose Northbound (North/South) connectivity after vMotion.
  • The affected North/South VM traffic flows through a Hardware VTEP.
  • East/West traffic is not affected.
  • NSX Controller ToR Agent logs (show log cloudnet/toragent.log) show errors similar to:
2019-06-14 12:40:32,689 | DEBUG | nioEventLoopGroup-3-1 | JsonRpcEndpoint           | Response : {"id":"34c38828-5a11-4310-bf0a-dd15194cb5cc","result":[{"count":1},{},{"details":"Table Ucast_Macs_Remote column locator row d2e2f641-b925-42d7-90a8-6a875d75eb52 references nonexistent row 22054b92-7c1c-4622-a21c-d81dcebb77d3 in table Physical_Locator.","error":"referential integrity violation"}],"error":null}
2019-06-14 12:40:32,690 | ERROR | pool-6-thread-9 | TorInorderMessageProcessor | Error when processing a QueueElement
com.vmware.toragent.tormgr.util.TransactionException: ExceptionMessage: Error updating rowOperationsRequested: 2OperationsExecuted: 3
        at com.vmware.toragent.tormgr.southbound.TorClient.executeTransaction(TorClient.java:2012)
        at com.vmware.toragent.tormgr.southbound.TorClient.executeTransaction(TorClient.java:1984)
        at com.vmware.toragent.tormgr.southbound.UcastMacEvent.updateUcastMacRemoteRow(UcastMacEvent.java:132)
        at com.vmware.toragent.tormgr.southbound.UcastMacEvent.processUpdate(UcastMacEvent.java:73)
        at com.vmware.toragent.tormgr.southbound.SouthboundEvent.processElement(SouthboundEvent.java:79)
        at com.vmware.toragent.tormgr.lib.TorInorderMessageProcessor$TorInorderQueueElement.processElement(TorInorderMessageProcessor.java:261)
        at com.vmware.toragent.tormgr.lib.QueueElementRunner.run(TorQueueElementProcessor.java:78)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
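
To check a saved copy of the ToR Agent log for this signature, a short script along the following lines may help. This is a minimal sketch, not a VMware tool: the function name find_integrity_violations is hypothetical, and the pattern assumes the "details" text has the same wording as the excerpt above.

```python
import re

# Pattern for the "details" text seen in the log excerpt above.
# Assumption: other occurrences use the same wording.
DETAILS_RE = re.compile(
    r"Table (?P<table>\w+) column (?P<column>\w+) "
    r"row (?P<row>[0-9a-f-]{36}) "
    r"references nonexistent row (?P<missing_row>[0-9a-f-]{36}) "
    r"in table (?P<ref_table>\w+)\."
)


def find_integrity_violations(lines):
    """Return one dict per referential integrity violation found in lines."""
    hits = []
    for line in lines:
        # Cheap substring check before running the regular expression.
        if "referential integrity violation" not in line:
            continue
        match = DETAILS_RE.search(line)
        if match:
            hits.append(match.groupdict())
    return hits
```

Any iterable of text lines can be passed in, for example an open file object for a copy of toragent.log. Each returned dict names the referencing table and column, the row being updated, and the missing row it points to.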

Environment

VMware NSX for vSphere 6.4.x
VMware NSX for vSphere 6.3.x

Cause

During normal operations, NSX Controllers update the ToR tables with the IP address of the VTEP where the VM now resides after a vMotion.
In this issue, due to an inconsistency between the NSX Controller and the ToR OVSDB, the NSX Controller tries to update the VTEP IP for the VM MAC address using a reference to a row that does not exist in the ToR OVSDB (in the log excerpt above, a missing row in the Physical_Locator table). Because the update fails, the old entry is retained and the VTEP information for the affected Virtual Machine is no longer valid. This affects North/South connectivity via the Hardware VTEP for this VM MAC address.
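
The failure mode can be illustrated with a toy model. The sketch below is not the OVSDB API or the real hardware VTEP schema: the class ToyHardwareVtepDb, its methods, and the sample MAC, UUID, and IP values are all hypothetical, and only the table names Ucast_Macs_Remote and Physical_Locator are taken from the log excerpt. It shows how a rejected update leaves the MAC pointing at the old VTEP.

```python
class ReferentialIntegrityError(Exception):
    """Raised when a row would reference a locator row that does not exist."""


class ToyHardwareVtepDb:
    """Toy stand-in for two ToR OVSDB tables (hypothetical, heavily simplified)."""

    def __init__(self):
        self.physical_locator = {}   # locator row UUID -> VTEP IP
        self.ucast_macs_remote = {}  # VM MAC -> locator row UUID

    def set_mac_locator(self, mac, locator_uuid):
        # Reject the update if the referenced locator row is missing,
        # leaving any existing entry for this MAC unchanged.
        if locator_uuid not in self.physical_locator:
            raise ReferentialIntegrityError(
                f"Ucast_Macs_Remote row for {mac} references nonexistent "
                f"row {locator_uuid} in table Physical_Locator"
            )
        self.ucast_macs_remote[mac] = locator_uuid

    def vtep_for(self, mac):
        """Return the VTEP IP the ToR would use for this MAC."""
        return self.physical_locator[self.ucast_macs_remote[mac]]


# The ToR knows only the locator for the source host's VTEP.
tor = ToyHardwareVtepDb()
tor.physical_locator["uuid-old"] = "192.0.2.10"
tor.set_mac_locator("00:50:56:aa:bb:cc", "uuid-old")

# After vMotion, the controller refers to a locator row the ToR does not have.
try:
    tor.set_mac_locator("00:50:56:aa:bb:cc", "uuid-new")
except ReferentialIntegrityError:
    pass  # the update fails, as in the toragent.log excerpt

stale_vtep = tor.vtep_for("00:50:56:aa:bb:cc")  # still the old VTEP IP
```

In this model the ToR keeps forwarding traffic for the VM to the VTEP of the source host, which matches the symptom of lost North/South connectivity after vMotion.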

Resolution

This issue is resolved in NSX for vSphere 6.4.2.

Workaround:
To work around the issue if you are unable to upgrade:
  • Detach the NSX Controller from the affected ToR and re-attach the NSX Controller.
Note: Because this impacts all network traffic that goes through the ToR, perform the operation during a maintenance window.

Additional Information

Impact/Risks:
This issue affects Virtual Machines connected to Logical Switches bound to the ToR that the NSX Controller is unable to update.
If there are multiple hardware VTEPs, only the Virtual Machines connected to Logical Switches bound to the affected ToR may be affected after vMotion.