NSX Gateway Firewall on T0/T1 Causing TCP Drops with Load Balancer

Article ID: 381564

Updated On:

Products

VMware NSX
VMware NSX-T Data Center

Issue/Introduction

  • Requests from a client (source) to an application hosted on a pool server behind the load balancer fail intermittently.

  • A packet capture on the load balancer service interface shows the TCP port being reused for the dropped connection:



  • The connection is never established: the client (the source IP in the above screenshot) never receives a SYN-ACK from the load balancer VIP (the destination IP in the above screenshot), so the client retransmits the SYN and the connection eventually fails.

Environment

VMware NSX
VMware NSX-T Data Center

Cause

The Gateway Firewall drops the new connection because there is still a half-open TCP connection with the same 5-tuple (protocol number, source address, destination address, source port, and destination port).
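When reviewing a packet capture for this pattern, it can help to list which connection attempts reuse a 5-tuple shortly after an earlier one. The following is a minimal Python sketch, not an NSX tool. The `FiveTuple` type, the `find_reused_syns` helper, and the example addresses are hypothetical names chosen for illustration, and the input is assumed to contain only the first SYN of each connection attempt (retransmissions already filtered out). The 900-second default window mirrors the 15-minute half-open period described in this article.

```python
from typing import Dict, List, NamedTuple, Tuple


class FiveTuple(NamedTuple):
    """The five fields that identify a flow (illustrative type, not an NSX object)."""
    protocol: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def find_reused_syns(
    syns: List[Tuple[float, FiveTuple]], window_s: float = 900.0
) -> List[Tuple[float, float, FiveTuple]]:
    """Return (earlier_time, later_time, flow) for each SYN that reuses
    a 5-tuple less than window_s seconds after the previous SYN for it."""
    last_seen: Dict[FiveTuple, float] = {}
    reused: List[Tuple[float, float, FiveTuple]] = []
    for ts, flow in sorted(syns):
        prev = last_seen.get(flow)
        if prev is not None and ts - prev < window_s:
            reused.append((prev, ts, flow))
        last_seen[flow] = ts
    return reused


# Example data (hypothetical addresses and timestamps, in seconds).
flow_a = FiveTuple(6, "10.0.0.5", "192.0.2.10", 50000, 443)
flow_b = FiveTuple(6, "10.0.0.5", "192.0.2.10", 50001, 443)
syns = [(0.0, flow_a), (100.0, flow_b), (400.0, flow_a), (2000.0, flow_a)]

for earlier, later, flow in find_reused_syns(syns):
    print(f"source port {flow.src_port} reused after {later - earlier:.0f}s")
```

A reuse reported here is only a candidate for the drop: the firewall rejects the new SYN only if the earlier connection was also left half-open.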

  • The packets below are seen on the load balancer service interface for the new connection. The client sends a SYN to the LB VIP; because the VIP does not respond with a SYN-ACK, the client retransmits the SYN, and the TCP handshake eventually fails:



  • Minutes before this new connection, another connection with the same 5-tuple was successfully established. At the end of that connection, however, only the LB VIP sent a FIN-ACK; no FIN-ACK was seen from the client. The gateway firewall therefore treats that connection as half-open for the following 15 minutes, and it drops any new connection request with the same 5-tuple that arrives within that period. Below is the capture of the previous half-open connection showing there is no FIN-ACK received from the client:

Resolution

If the client needs to aggressively reuse TCP ports (with the same 5-tuple) and does not close its connections cleanly (that is, it does not send a FIN-ACK to the LB VIP), there are two workarounds:

  • If no gateway firewall rules are configured, disable the gateway firewall so that it does not track half-open connections.
  • Following the Create a session timer procedure, create a new session timer profile with the TCP Closing timer adjusted to match how quickly the client is expected to reuse TCP ports. For example, if the timer is set to 2 minutes (instead of the default 15 minutes), the firewall purges the half-open connection after 2 minutes, and any new connection with the same 5-tuple that arrives after that is allowed by the firewall.
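
The effect of the TCP Closing timer can be illustrated with a toy model. The sketch below is a simplified Python simulation, not NSX code and not a description of the firewall's internal implementation. The `ToyGatewayFirewall` class, its method names, and the example addresses are hypothetical. It assumes only what this article states: a connection closed by the VIP alone stays half-open until the timer expires, and a SYN with the same 5-tuple is dropped while that entry exists.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# protocol number, source IP, destination IP, source port, destination port
Flow = Tuple[int, str, str, int, int]


@dataclass
class ToyGatewayFirewall:
    """Toy state table keyed by 5-tuple (illustration only)."""
    closing_timeout_s: float = 900.0  # 15-minute default cited in the article
    half_open: Dict[Flow, float] = field(default_factory=dict)

    def vip_sent_fin_only(self, flow: Flow, now: float) -> None:
        # Only the VIP sent a FIN-ACK: record when the flow became half-open.
        self.half_open[flow] = now

    def allow_syn(self, flow: Flow, now: float) -> bool:
        started = self.half_open.get(flow)
        if started is not None and now - started < self.closing_timeout_s:
            return False  # stale half-open entry still present: SYN is dropped
        self.half_open.pop(flow, None)  # entry expired or never existed
        return True


flow = (6, "10.0.0.5", "192.0.2.10", 50000, 443)  # hypothetical example flow

# Default timer: the client reuses the port 5 minutes after the half-close.
default_fw = ToyGatewayFirewall()
default_fw.vip_sent_fin_only(flow, now=0.0)
print(default_fw.allow_syn(flow, now=300.0))  # False

# Timer reduced to 2 minutes: the same reuse is now allowed.
tuned_fw = ToyGatewayFirewall(closing_timeout_s=120.0)
tuned_fw.vip_sent_fin_only(flow, now=0.0)
print(tuned_fw.allow_syn(flow, now=300.0))  # True
```

In this model the timer should be set no longer than the shortest interval at which the client is expected to reuse a 5-tuple; a reuse that arrives before the timer expires is still dropped.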