Packet Drops on Edge Virtual Machine (EVM) Are Causing Application Issues

Article ID: 393652

Products

VMware NSX

Issue/Introduction

  • Applications are experiencing connectivity issues such as:
    • Initial connections are not getting established
    • Established connections are getting disconnected/dropped seemingly at random
    • Symptoms affect only some virtual machines on the same subnet/segment, while others can communicate without issues
    • Symptoms affect virtual machines distributed across different subnets/segments, while others on those same subnets/segments can communicate without issues
  • Tier-0 High Availability (HA) Mode is configured as Active/Active with Stateful Services disabled.
  • Tier-0 Gateway Firewall has stateful rules configured.
  • Tier-0 BGP configuration has ECMP enabled.
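
The three configuration conditions above must all hold for this issue to apply. As a rough illustration only, the check can be sketched in Python. The `Tier0Summary` class and its field names are hypothetical placeholders for values you have already read from your own environment. They are not NSX API attribute names.

```python
from dataclasses import dataclass


@dataclass
class Tier0Summary:
    # Hypothetical summary of a Tier-0 Gateway; field names are
    # illustrative assumptions, not NSX API attributes.
    ha_mode: str
    stateful_services_enabled: bool
    has_stateful_firewall_rules: bool
    bgp_ecmp_enabled: bool


def matches_article_conditions(t0: Tier0Summary) -> bool:
    """Return True only when all three conditions from this article hold."""
    return (
        t0.ha_mode == "ACTIVE_ACTIVE"
        and not t0.stateful_services_enabled
        and t0.has_stateful_firewall_rules
        and t0.bgp_ecmp_enabled
    )
```

A gateway summarized as `Tier0Summary("ACTIVE_ACTIVE", False, True, True)` matches. Changing any one of the three conditions makes the check return `False`.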

Environment

VMware NSX

Cause

Tier-0 Gateways with this configuration can route traffic asymmetrically: outbound traffic may exit through Edge Node 1 while the return traffic enters through Edge Node 2. With stateful firewall rules enabled at the Tier-0, each Edge Node maintains its own state table, covering only the traffic that enters and exits that Edge Node. Under asymmetric routing, the state for a connection is recorded on Edge Node 1 when the outbound traffic leaves, but the return traffic arrives at Edge Node 2, which has no matching state. Edge Node 2 therefore drops the return traffic. This is how a stateful rule is designed to behave, even when the rule's action is Allow.

For example, the state of a TCP connection through a firewall is established by the initial SYN packet. In this configuration, the SYN leaves through Edge Node 1, which creates a state entry and forwards the packet. The returning SYN-ACK arrives at Edge Node 2 and is dropped by the firewall, because Edge Node 2 never saw the SYN leave and therefore has no state for the connection.
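
The TCP example can be reproduced with a small toy model. This is a simplified sketch, not NSX code. It assumes each Edge Node tracks connections by a 4-tuple in its own private table, and the `EdgeFirewall` class, the `simulate` helper, and the IP addresses are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeFirewall:
    """Toy model of one Edge Node's stateful firewall (illustrative only)."""
    name: str
    states: set = field(default_factory=set)  # private to this Edge Node

    def outbound(self, src, sport, dst, dport, flags):
        flow = (src, sport, dst, dport)
        if flags == "SYN":
            # An allowed outbound SYN creates the state entry on this node.
            self.states.add(flow)
            return "forwarded"
        return "forwarded" if flow in self.states else "dropped"

    def inbound(self, src, sport, dst, dport, flags):
        # Return traffic is accepted only if this node holds the state
        # for the reverse direction of the flow.
        reverse = (dst, dport, src, sport)
        return "forwarded" if reverse in self.states else "dropped"


def simulate(return_edge_name):
    """Send a SYN out through edge-1 and return the fate of the SYN-ACK
    arriving at the named Edge Node."""
    edges = {name: EdgeFirewall(name) for name in ("edge-1", "edge-2")}
    client, server = ("10.0.0.5", 40000), ("203.0.113.10", 443)
    edges["edge-1"].outbound(*client, *server, "SYN")
    return edges[return_edge_name].inbound(*server, *client, "SYN-ACK")
```

In this model, `simulate("edge-1")` returns `"forwarded"` because the SYN-ACK reaches the node that recorded the SYN. `simulate("edge-2")` returns `"dropped"`, which matches the asymmetric case described above.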

Resolution

Choose one of the following options:

  • Disable the Tier-0 stateful firewall.
  • Disable all stateful firewall rules on the Tier-0 Gateway Firewall.
  • Enable Stateful Services at the Tier-0, if your network design can support it.

Additional Information