Intermittent Packet loss on VMs on NSX overlay segments

Article ID: 419283

Products

VMware NSX

Issue/Introduction

  • NSX overlay segments are connected to a Tier-1 gateway, which is connected to a Tier-0 gateway that peers over BGP with the physical network

  • VMs lose packets when communicating with the physical network or beyond

  • ICMP requests are getting lost (note the gap in sequence numbers):

    ping 10.###.###.1
    64 bytes from 10.###.###.1: icmp_seq=3 ttl=253 time=1.11 ms
    64 bytes from 10.###.###.1: icmp_seq=4 ttl=253 time=1.15 ms
    64 bytes from 10.###.###.1: icmp_seq=5 ttl=253 time=1.20 ms
    64 bytes from 10.###.###.1: icmp_seq=6 ttl=253 time=1.55 ms
    64 bytes from 10.###.###.1: icmp_seq=7 ttl=253 time=1.10 ms
    64 bytes from 10.###.###.1: icmp_seq=8 ttl=253 time=1.36 ms
    64 bytes from 10.###.###.1: icmp_seq=9 ttl=253 time=0.898 ms
    64 bytes from 10.###.###.1: icmp_seq=10 ttl=253 time=1.07 ms
    64 bytes from 10.###.###.1: icmp_seq=11 ttl=253 time=1.22 ms
    64 bytes from 10.###.###.1: icmp_seq=25 ttl=253 time=1.06 ms
    64 bytes from 10.###.###.1: icmp_seq=26 ttl=253 time=0.902 ms
    64 bytes from 10.###.###.1: icmp_seq=27 ttl=253 time=1.03 ms
    64 bytes from 10.###.###.1: icmp_seq=28 ttl=253 time=0.983 ms
    64 bytes from 10.###.###.1: icmp_seq=29 ttl=253 time=0.968 ms
    64 bytes from 10.###.###.1: icmp_seq=30 ttl=253 time=0.995 ms
    64 bytes from 10.###.###.1: icmp_seq=40 ttl=253 time=0.974 ms
    64 bytes from 10.###.###.1: icmp_seq=41 ttl=253 time=0.884 ms
    64 bytes from 10.###.###.1: icmp_seq=42 ttl=253 time=1.10 ms
    64 bytes from 10.###.###.1: icmp_seq=43 ttl=253 time=1.04 ms
    64 bytes from 10.###.###.1: icmp_seq=44 ttl=253 time=1.08 ms
    64 bytes from 10.###.###.1: icmp_seq=45 ttl=253 time=1.05 ms
    64 bytes from 10.###.###.1: icmp_seq=46 ttl=253 time=1.28 ms
    64 bytes from 10.###.###.1: icmp_seq=47 ttl=253 time=1.14 ms
    64 bytes from 10.###.###.1: icmp_seq=48 ttl=253 time=1.08 ms
    --- 10.###.###.1 ping statistics ---
    52 packets transmitted, 24 received, 53.8462% packet loss, time 51736ms
    rtt min/avg/max/mdev = 0.884/1.091/1.547/0.148 ms
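
The gaps in the output above can also be located programmatically. The following is a minimal sketch, assuming Linux-style ping output in which every reply line carries an `icmp_seq=` field. The function names and the sample target 192.0.2.1 (a documentation address standing in for the masked IP) are illustrative and not part of any NSX tooling.

```python
import re

# Matches the sequence number in each reply line of Linux-style ping output.
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_seq_gaps(ping_output):
    """Return (first_missing, last_missing) for each gap between replies."""
    seqs = [int(m.group(1)) for m in SEQ_RE.finditer(ping_output)]
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


def loss_percent(transmitted, received):
    """Packet loss as a percentage of transmitted packets."""
    return 100.0 * (transmitted - received) / transmitted


def build_sample():
    """Rebuild reply lines with the same sequence numbers as the output above.

    192.0.2.1 is a placeholder for the masked target address (assumption).
    """
    seqs = list(range(3, 12)) + list(range(25, 31)) + list(range(40, 49))
    return "\n".join(
        f"64 bytes from 192.0.2.1: icmp_seq={s} ttl=253 time=1.00 ms" for s in seqs
    )


if __name__ == "__main__":
    for first, last in find_seq_gaps(build_sample()):
        print(f"missing icmp_seq {first}-{last} ({last - first + 1} packets)")
    print(f"loss: {loss_percent(52, 24):.4f}%")
```

This sketch only reports gaps between two received replies. Packets lost before the first reply or after the last one show up only in the transmitted/received totals of the summary line.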

Environment

VMware NSX

Cause

Physical network causes:

  • Routing Loops
  • Asymmetric routing
  • MAC flaps

In this particular scenario:

  • Redundant VLAN paths caused duplicate IP behavior, which the physical switch reported with the following log message:

    [physical-switch-name] %ARP-2-DUP_SRC_IP: arp [15165] Source address of packet received from <mac-address> on Vlan[###] (port-channel###) is duplicate of local, <ip-address>
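
When many of these messages have been collected from the physical switches, it can help to group them by address and ingress port to see which path the duplicates arrive on. The following is a rough sketch, assuming the log lines follow the shape of the message quoted above. The regular expression, the function name, and every value in the sample lines (switch name, MAC address, VLAN, port-channel, IP address) are made up for illustration and would need adjusting to the actual syslog format in use.

```python
import re
from collections import Counter

# Pattern modelled on the message shape quoted in this article (assumption:
# real switch logs may differ in spacing or surrounding fields).
DUP_SRC_IP_RE = re.compile(
    r"%ARP-2-DUP_SRC_IP:.*?received from (?P<mac>\S+) on "
    r"(?P<vlan>Vlan\s*\d+)\s*\((?P<port>[^)]+)\) "
    r"is duplicate of local, (?P<ip>\S+)"
)


def count_duplicate_sources(log_lines):
    """Count duplicate-source messages per (ip, mac, vlan, port) tuple."""
    counts = Counter()
    for line in log_lines:
        match = DUP_SRC_IP_RE.search(line)
        if match:
            key = (match["ip"], match["mac"], match["vlan"], match["port"])
            counts[key] += 1
    return counts


# Hypothetical sample lines; none of these values come from the article.
SAMPLE_LOG = [
    "sw-example %ARP-2-DUP_SRC_IP: arp [15165] Source address of packet "
    "received from 0050.56aa.bbcc on Vlan100(port-channel10) "
    "is duplicate of local, 192.0.2.1",
    "sw-example %ARP-2-DUP_SRC_IP: arp [15165] Source address of packet "
    "received from 0050.56aa.bbcc on Vlan100 (port-channel10) "
    "is duplicate of local, 192.0.2.1",
    "sw-example some unrelated message",
]

if __name__ == "__main__":
    for key, n in count_duplicate_sources(SAMPLE_LOG).most_common():
        print(n, *key)
```

A port-channel that shows up repeatedly for the same local IP address is a reasonable starting point when tracing the redundant VLAN path with the physical networking team.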

Resolution

Investigate the physical switching fabric for the causes listed above, and work with your physical networking team to remove the configuration that is causing the issue.

Additional Information

Packet Capturing for ESXi and NSX Testing and Troubleshooting