Fragmented ICMP echo requests are occasionally dropped by Edge uplink interface

Article ID: 426674

Products

VMware NSX

Issue/Introduction

  • You ping an Edge uplink interface with a packet size larger than the MTU on the datapath, so the ICMP echo requests are fragmented.
  • Some of the ICMP echo requests are dropped by the Edge's uplink interface.
  • Statistics on the uplink interface show the counters "rx_frag_error" and "rx_frag_timeout" increasing:
                    "ifuuid": "########-####-####-####-########3b28",
                    "ifuid": 869,
                    "type": "lif",
                    "ptype": "uplink",
                    "punt-policy": "SRC",
                    "name": "#############",
                    "internal_name": "uplink-869",
                    ..
                    "stats": {
                        "rx_pkts": 148871318685,
                        "rx_bytes": 125325683200160,
                        ..
                        "rx_frag_ok": 92048358,
                        "rx_frag_error": 146916639,          <<< this counter keeps increasing
                        "rx_frag_timeout": 236546724,        <<< this counter keeps increasing
  • ECMP hashing on the routing device is set to 5-tuple (Source IP address, Destination IP address, Source Port, Destination Port, Protocol).
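
To confirm that the fragment counters are still increasing, two samples of the uplink "stats" object can be compared. The following is a minimal sketch: the helper name `frag_counter_deltas` is hypothetical, the first sample reuses the values shown above, and the second sample is made up for illustration.

```python
# Hypothetical helper (not part of NSX): compare two samples of the uplink
# "stats" object and report how much each fragment counter grew.
FRAG_COUNTERS = ("rx_frag_ok", "rx_frag_error", "rx_frag_timeout")


def frag_counter_deltas(before, after):
    """Return {counter: increase} for fragment counters present in both samples."""
    return {
        name: after[name] - before[name]
        for name in FRAG_COUNTERS
        if name in before and name in after
    }


# First sample: values from the output above. Second sample: made-up values.
sample_1 = {"rx_frag_ok": 92048358, "rx_frag_error": 146916639, "rx_frag_timeout": 236546724}
sample_2 = {"rx_frag_ok": 92048400, "rx_frag_error": 146916700, "rx_frag_timeout": 236546800}

deltas = frag_counter_deltas(sample_1, sample_2)
for name, delta in deltas.items():
    print(f"{name}: +{delta}")
```

A non-zero increase in "rx_frag_error" or "rx_frag_timeout" between samples matches the symptom described in this article.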

Environment

VMware NSX
VMware NSX-T Data Center

Cause

  • Because the Gateway Firewall is enabled on the Tier-0 gateway, the Edge must inspect every packet. If a fragmentation error is detected (for example, fragments that are missing or delayed), the packet is dropped.
  • A high volume of fragmented packets awaiting inspection by the Gateway Firewall can overload the Edge, leading to occasional drops.
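
The number of fragments that the Edge has to handle grows with the ping size. As a rough sketch, the minimum number of IPv4 fragments for one ICMP echo request can be estimated as below. The function name is hypothetical, and the calculation assumes IPv4 with a 20-byte header (no IP options) and an 8-byte ICMP header.

```python
import math


def ipv4_fragment_count(icmp_payload_bytes, mtu, ip_header_bytes=20, icmp_header_bytes=8):
    """Minimum number of IPv4 fragments for one ICMP echo request.

    Assumes no IP options. Hypothetical helper for illustration only.
    """
    ip_payload = icmp_header_bytes + icmp_payload_bytes
    max_payload = mtu - ip_header_bytes          # most data a single packet can carry
    if ip_payload <= max_payload:
        return 1                                 # fits in one packet, no fragmentation
    # Every fragment except the last must carry a multiple of 8 payload bytes;
    # the last fragment may carry up to max_payload bytes.
    per_fragment = max_payload // 8 * 8
    remaining = ip_payload - max_payload
    return 1 + math.ceil(remaining / per_fragment)


print(ipv4_fragment_count(1472, 1500))  # 1
print(ipv4_fragment_count(2000, 1500))  # 2
print(ipv4_fragment_count(8000, 1500))  # 6
```

With a 1500-byte MTU, for example, a ping with an 8000-byte payload arrives as at least six fragments, all of which must reach the same Edge node.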

Resolution

This is a condition that may occur in a VMware NSX environment.

There are two approaches to mitigate this behaviour:

  1. If the Gateway Firewall is enabled on the Tier-0 instance, disable it.
    While the Gateway Firewall is enabled, every fragmented packet must be inspected, and fragments are dropped if some of them are missing or delayed.

  2. Reconfigure the ECMP hashing on the routing device as follows:
    If the traffic is fragmented,
        then use a 3-tuple hash (Source IP address, Destination IP address, Protocol).
    Else, use a 5-tuple hash (for non-fragmented traffic).
    With a 3-tuple hash, all fragments of a packet belong to the same IP flow and are sent to the same Edge node, which can then reassemble all fragments and forward them.
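
The hashing rule in approach 2 can be sketched as follows. This is an illustrative model only, not the configuration syntax or hash function of any particular routing device: the `Packet` class, the CRC32-based hash, and the next-hop names are all assumptions. The example uses a UDP datagram because its ports make the effect easy to see; how a given device fills the port fields for ICMP is device-specific and is not modeled here.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: Optional[int] = None   # None when the fragment carries no L4 header
    dst_port: Optional[int] = None
    is_fragment: bool = False


def five_tuple_key(p):
    return (p.src_ip, p.dst_ip, p.protocol, p.src_port or 0, p.dst_port or 0)


def three_tuple_key(p):
    return (p.src_ip, p.dst_ip, p.protocol)


def pick_next_hop(p, next_hops, fragment_aware):
    """Pick an ECMP next hop; use the 3-tuple for fragments when fragment_aware."""
    if fragment_aware and p.is_fragment:
        key = three_tuple_key(p)
    else:
        key = five_tuple_key(p)
    digest = zlib.crc32(repr(key).encode())  # stand-in for the device's hash
    return next_hops[digest % len(next_hops)]


# Hypothetical Edge nodes and one UDP datagram split into two fragments.
edges = ["edge-node-1", "edge-node-2"]
first = Packet("192.0.2.10", "198.51.100.20", 17, 40000, 5000, is_fragment=True)
later = Packet("192.0.2.10", "198.51.100.20", 17, is_fragment=True)  # no ports

# Plain 5-tuple hashing sees different keys, so the fragments may be split.
print(five_tuple_key(first) != five_tuple_key(later))
# Fragment-aware hashing always sends both fragments to the same Edge node.
print(pick_next_hop(first, edges, True) == pick_next_hop(later, edges, True))
```

Because only the first fragment carries the Layer 4 header, a plain 5-tuple hash can compute different keys for fragments of the same packet, while the 3-tuple key is identical for all of them.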

Additional Information

Also see KB article: NSX-T Edge ECMP behavior.