TCP retransmissions are observed during vMotion of time-sensitive database (DB) VMs. These VMs host applications that are highly latency sensitive.
These symptoms occur under the following conditions:
1. The host's BFD tunnels to the NSX Edge go to a different subnet.
2. The VM is the first one to be placed on the host or moved to the host by vMotion.
VMware NSX
VMware NSX Data Center
When a VM is moved by vMotion to a host that has no workloads (no BFD tunnels UP), the VM starts sending traffic immediately. This traffic has no path until the BFD tunnels come UP.
The Edge is usually located in a different subnet than the host, and MTEP (hierarchical two-tier) replication is used. For every remote L2 domain, the source Transport Node elects a remote MTEP and forwards BUM (broadcast, unknown unicast, and multicast) traffic to the MTEP in each remote L2 domain.
A host normally elects an MTEP based on whether the BFD sessions to the remote VTEPs in a subnet are UP or DOWN. For each subnet, one remote VTEP whose BFD session is UP is elected at random as the MTEP. If a subnet has no remote hosts or Edges with a BFD session that is UP, the host cannot perform hierarchical two-tier replication to that subnet.
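The election rule described above can be sketched as follows. This is only an illustrative model, not NSX code: the function name `elect_mteps`, the tuple layout, and the state strings "UP" and "DOWN" are assumptions made for the example.

```python
import random
from collections import defaultdict


def elect_mteps(remote_vteps, rng=random):
    """Pick one MTEP per remote subnet from VTEPs whose BFD session is UP.

    remote_vteps: iterable of (vtep_ip, subnet, bfd_state) tuples
    (hypothetical layout). Returns {subnet: vtep_ip or None}; None means
    two-tier replication to that subnet is not possible yet.
    """
    up_by_subnet = defaultdict(list)
    subnets = set()
    for vtep_ip, subnet, bfd_state in remote_vteps:
        subnets.add(subnet)
        if bfd_state == "UP":
            up_by_subnet[subnet].append(vtep_ip)

    elected = {}
    for subnet in subnets:
        candidates = up_by_subnet.get(subnet, [])
        # Random choice among UP VTEPs; no UP VTEP means no MTEP.
        elected[subnet] = rng.choice(candidates) if candidates else None
    return elected


# A freshly populated host: sessions to the Edge subnet are still DOWN.
vteps = [
    ("10.0.2.11", "10.0.2.0/24", "UP"),
    ("10.0.2.12", "10.0.2.0/24", "DOWN"),
    ("10.0.3.21", "10.0.3.0/24", "DOWN"),  # Edge subnet in this example
]
result = elect_mteps(vteps)
```

In this example the Edge subnet maps to `None`, which corresponds to the window in which BUM traffic toward the Edge has no path.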
BFD tunnels to TEPs in a different subnet (the subnet where the Edges reside) can take up to 2-3 seconds to come UP after session creation.
This issue is fixed in NSX 4.2.1.
The fix enhances the MTEP election algorithm to choose a random MTEP while BFD sessions are still coming up. As a result, latency is expected to drop by 1 to 1.5 seconds when the first VM is moved to the host by vMotion.
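A minimal sketch of the enhanced election for a single remote subnet is shown below. It is a model under stated assumptions, not the actual NSX implementation: the function name `elect_mtep`, the state label "COMING_UP" (a session that has been created but is not yet UP), and the preference for UP sessions over sessions that are coming up are all assumptions for illustration.

```python
import random


def elect_mtep(vteps_in_subnet, rng=random):
    """Elect an MTEP for one remote subnet.

    vteps_in_subnet: list of (vtep_ip, bfd_state) tuples, where bfd_state
    is one of the hypothetical labels "UP", "COMING_UP", or "DOWN".
    """
    up = [ip for ip, state in vteps_in_subnet if state == "UP"]
    if up:
        # Assumed preference: a VTEP with an established session wins.
        return rng.choice(up)

    coming_up = [ip for ip, state in vteps_in_subnet if state == "COMING_UP"]
    if coming_up:
        # Enhanced behavior: do not wait for a session to reach UP.
        return rng.choice(coming_up)

    return None  # nothing to elect in this subnet


only_pending = elect_mtep([("10.0.3.21", "COMING_UP")])
prefers_up = elect_mtep([("10.0.3.21", "COMING_UP"), ("10.0.3.22", "UP")])
nothing = elect_mtep([("10.0.3.21", "DOWN")])
```

With only pending sessions available, an MTEP is still returned, so the first VM on the host does not have to wait for the tunnels to finish coming up.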
For comparison with the vMotion latency when the tunnels are already established on the destination host:
Expected latency before NSX 4.2.1:
Latency = MAX("2 seconds for tunnels to come up", "< 500 msec for Logical Switch/Routing Domain span to be pushed down from Controller") = ~2 seconds
Expected latency with NSX 4.2.1:
Latency = MAX(0, "< 500 msec for Logical Switch/Routing Domain span to be pushed down from Controller") = ~500 msec
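The arithmetic can be checked with a short sketch. The function name `first_vm_latency_ms` is hypothetical, and 500 ms is used as the upper bound of the "< 500 msec" span push, so the results are approximate worst-case figures rather than measurements.

```python
def first_vm_latency_ms(tunnel_wait_ms, span_push_ms):
    """Both delays run in parallel, so the slower one dominates."""
    return max(tunnel_wait_ms, span_push_ms)


SPAN_PUSH_MS = 500  # upper bound for the span push from the Controller

before_fix = first_vm_latency_ms(2000, SPAN_PUSH_MS)  # wait ~2 s for tunnels
after_fix = first_vm_latency_ms(0, SPAN_PUSH_MS)      # no tunnel wait
reduction = before_fix - after_fix

print(before_fix, after_fix, reduction)  # prints: 2000 500 1500
```

The 1500 ms difference corresponds to the upper end of the 1 to 1.5 second reduction stated for the fix.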
Workaround:
Keep a dummy VM on the host so that the BFD tunnels remain UP.