Duplicate IP alarm when using edge nodes with multiple TEP interfaces



Article ID: 322415


Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:

  • The edge node is configured to use multiple TEP interfaces for overlay traffic.
  • Physical networking devices may drop packets as rogue packets on the physical network.
    • That is, the physical network appliance has seen a packet leave a port on the ESXi host with an IP address that belongs to a different MAC address.
  • The packet leaving the ESXi host vmnic has the MAC address assigned to the second interface on the edge node but the IP address assigned to the first interface on the edge node, or vice versa.
  • For example, the edge node may have two TEP interfaces, which appear as eth1 and eth2 on the ESXi host:
[root@esx1:~] net-stats -l | grep edge
33554440 5 9 DvsPortset-0 ##:##:##:##:##:d5 edge1.eth0 - This edge interface has IP address 10.1.1.10
50331658 5 9 DvsPortset-1 ##:##:##:##:##:f8 edge1.eth2 - This edge interface has IP address 192.168.1.12/24
50331659 5 9 DvsPortset-1 ##:##:##:##:##:49 edge1.eth1 - This edge interface has IP address 192.168.1.13/24


A packet capture or a review of the physical security logs shows packets with the following:

Source MAC address: ##:##:##:##:##:49 - MAC address of edge node eth1
Source IP address: 192.168.1.12 - IP address of edge node eth2
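
The mismatch described above can be checked mechanically once source MAC and source IP pairs have been extracted from a capture or from security logs. The following is a minimal Python sketch, not a VMware tool. The function name `find_mismatches` and the table `EXPECTED_TEPS` are hypothetical, and the full MAC addresses are placeholders, because the example output masks all but the last octet.

```python
# Expected TEP bindings, based on the net-stats example above.
# ASSUMPTION: the full MAC addresses are placeholders; the source
# masks everything except the last octet.
EXPECTED_TEPS = {
    "00:00:00:00:00:f8": "192.168.1.12",  # edge1.eth2
    "00:00:00:00:00:49": "192.168.1.13",  # edge1.eth1
}


def find_mismatches(observed, expected=EXPECTED_TEPS):
    """Return (src_mac, src_ip, expected_ip) for every observed frame whose
    source IP differs from the IP bound to its source MAC."""
    mismatches = []
    for src_mac, src_ip in observed:
        expected_ip = expected.get(src_mac.lower())
        # MACs that are not known TEP interfaces are skipped.
        if expected_ip is not None and src_ip != expected_ip:
            mismatches.append((src_mac, src_ip, expected_ip))
    return mismatches


if __name__ == "__main__":
    frames = [
        ("00:00:00:00:00:49", "192.168.1.12"),  # symptom: eth1 MAC with eth2 IP
        ("00:00:00:00:00:f8", "192.168.1.12"),  # healthy frame
    ]
    for mac, ip, want in find_mismatches(frames):
        print(f"{mac} sent source IP {ip}, expected {want}")
```

Any frame reported by this check corresponds to the symptom in this article: a TEP MAC address paired with the IP address of the other TEP interface.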



Environment

VMware NSX-T

Cause

This is a known issue affecting NSX-T edge node traffic egressing the edge node interface.
The edge node runs a flow cache process, which is used to increase performance on the edge node.
An issue with this process causes it to choose the wrong interface for traffic egressing the edge node.

Resolution

This issue is resolved in NSX-T Data Center 3.0.2, available at VMware Downloads.

Workaround:
Disable flow cache on the edge node. This ensures that packets egress the correct interface.
Disabling flow cache has a performance impact on the throughput of the edge nodes.

Log in to each edge node as admin and run the following command to check the current status of flow cache. It should read Enabled = true:
get dataplane flow-cache config

Then disable flow cache and restart the dataplane service for the change to take effect:
set debug
set dataplane flow-cache disabled
restart service dataplane
The restart service dataplane command may cause a brief interruption to the dataplane, on the order of seconds.

Check the new status of flow cache. It should read Enabled = false:
get dataplane flow-cache config
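
When the workaround is applied to many edge nodes, the before and after checks can be scripted against saved command output. The following is a minimal Python sketch under one assumption: that the output contains the text "Enabled = true" or "Enabled = false", as described in the steps above. The rest of the output layout is not shown in this article, so the sample text in the sketch is illustrative only, and the function name `flow_cache_enabled` is hypothetical.

```python
import re
from typing import Optional

# ASSUMPTION: the status appears as "Enabled = true" or "Enabled = false".
# Only that fragment comes from this article; other output lines are unknown.
_ENABLED_RE = re.compile(r"Enabled\s*=\s*(true|false)", re.IGNORECASE)


def flow_cache_enabled(cli_output: str) -> Optional[bool]:
    """Return True or False for the reported flow cache state,
    or None if no status line is found in the text."""
    match = _ENABLED_RE.search(cli_output)
    if match is None:
        return None
    return match.group(1).lower() == "true"


if __name__ == "__main__":
    # Illustrative text only, not captured from a real edge node.
    sample = "Enabled = false"
    state = flow_cache_enabled(sample)
    if state is None:
        print("flow cache status not found")
    else:
        print("flow cache enabled" if state else "flow cache disabled")
```

A result of None should be treated as "unknown" rather than "disabled", so that a changed output format does not go unnoticed.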