Intermittent 502 Bad Gateway response codes generated by NSX Load Balancer module for hosted virtual-server

Article ID: 322035

Products

VMware NSX

Issue/Introduction

This article describes a known behavior seen when traffic flows through the NSX Load Balancer (LB) under high traffic volume in the following configuration: a one-arm LB, or a pool member connected via a service interface, combined with a single pool member and Auto-map SNAT. The issue is characterized by the following:

  1. The NSX-T LB virtual server on which you are facing the intermittent issue is handling a high volume of traffic.

  2. If the virtual server hosts an HTTP/HTTPS service, clients observe intermittent HTTP 502 response codes coming from the virtual server.

  3. The SNAT setting used on this NSX-T LB virtual server is "Auto-map", meaning the LB uses the single interface IP (the interface towards the pool member) as the source of its server-side connections.

  4. The server-side connections generated by the NSX-T LB exit via an "uplink" or "service" type interface of the T1 Gateway (GW) associated with the LB. Ex: a one-arm LB setup, or a backend pool member connected via a service interface over a VLAN segment.

  5. If "Access-log" is enabled for the respective virtual server, the NSX-T edge syslog may show the HTTP 502 status returned to the client (when the virtual server hosts HTTP/HTTPS):

    2024-01-30T09:08:44.014438+00:00 Edge-##.corp.local NSX 24387 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="access" level="INFO"] [########-####-####-####-############][########-####-####-####-############] Operation.Category: 'LbAccessLog', Operation.Type: 'Http', Lb.UUID: '########-####-####-####-############', Lb.Name: 'NSX-LB-##', Vs.UUID: '########-####-####-####-############', Vs.Name: 'Test-VIP-HTTPS-443', Vs.Ip: '###.###.###.###', Vs.Port: '443', Pool.UUID: '########-####-####-####-############', Pool.Name: 'Test-Pool-HTTPS', PoolMember.Ip: '###.###.###.###', PoolMember.Port: '443', Client.Ip: '###.###.###.###', Client.Port: '12502', Snat.Ip: '###.###.###.###', Snat.Port: '63069', HttpRequest.Method: 'POST', HttpRequest.UserAgent: 'Mozilla/5.0 (Linux; Android 8.0.0; Pixel Build/OPR3.170623.008) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.98 Mobile Safari/537.36', HttpRequest.X-Fwd-For: '-', HttpRequest.Uri: '/auth/access_token', HttpRequest.Host: 'auth.testsite.corp.local', HttpResponse.Status: '502', HttpResponse.StatusCategory: '5xx', HttpResponse.Size: '0', HttpResponse.ServerTime: '31.624', HttpResponse.TotalTime: '31.628', Error.Reason: '-'

  6. Alternatively, the NSX-T LB may report that the backend pool member "prematurely closed" the server-side connection, after which the LB sent an HTTP 502 response to the client:

    2024-01-30T09:08:44.014438+00:00 Edge-##.corp.local NSX 17005 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="ERROR" errorCode="EDG9999999"] [########-####-####-####-############] upstream prematurely closed connection while reading response header from upstream, client: ###.###.###.###, server: , request: "POST /auth/access_token HTTP/1.1", upstream: "https://###.###.###.###:443/auth/access_token", host: "test.####.local"

     

  7. If you capture packets simultaneously on both sides of the server-side connection, i.e. on the NSX-T side (the T1 interface towards the pool member, or the NSX-T edge fp-eth interfaces) and on the pool-member side, you see the NSX-T edge T1 interface IP (the Auto-map IP) sending an ICMP Type 3 (Destination Unreachable), Code 3 (Port Unreachable) message to the pool-member IP in response to a SYN-ACK packet sent by the pool member, indicating that the TCP source port used by the LB is unreachable. Refer below:

    09:08:44.122948 IP 10.###.###.1.4000 > 10.###.###.3.443: Flags [S], seq 2921455552, win 64240, options [mss 1460,sackOK,TS val 3495508048 ecr 0,nop,wscale 8], length 0
    09:08:44.122985 IP 10.###.###.3.443 > 10.###.###.1.4000: Flags [S.], seq 4187293146, ack 2921455553, win 65160, options [mss 1460,sackOK,TS val 1416696892 ecr 3495476677,nop,wscale 7], length 0
    09:08:44.124528 IP 10.###.###.1 > 10.###.###.3: ICMP 10.###.###.1 tcp port 4000 unreachable, length 36

  8. At the same time, the Gateway Firewall connection table on the edge node, for the uplink/service interface through which this server-side connection flows, contains an entry in CLOSING-CLOSED state that uses the same SRC-IP + SRC-PORT + DST-IP + DST-PORT as the connection for which the ICMP Port Unreachable was seen.

    Ex: 0x0000004278001207: 10.###.###.3:443 -> 10.###.###.1:4000  dir in protocol tcp state CLOSING:CLOSED fn 1002:0
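
To check whether the 502 responses line up with particular SNAT source ports, the access-log lines described in step 5 can be parsed and grouped. The following is a minimal sketch, assuming the log keeps the Key: 'value' layout of the sample above; the function names and the shortened sample line (with made-up addresses) are hypothetical.

```python
import re
from collections import Counter

# Matches the "Key: 'value'" pairs that follow the syslog header in an
# NSX LB access-log line (layout assumed from the sample in step 5).
FIELD_RE = re.compile(r"([A-Za-z][\w.\-]*): '([^']*)'")


def parse_access_log(line):
    """Return the Key: 'value' pairs of one access-log line as a dict."""
    return dict(FIELD_RE.findall(line))


def count_502_by_snat_port(lines):
    """Count HTTP 502 responses per (pool-member IP, SNAT port)."""
    counts = Counter()
    for line in lines:
        fields = parse_access_log(line)
        if fields.get("HttpResponse.Status") == "502":
            key = (fields.get("PoolMember.Ip"), fields.get("Snat.Port"))
            counts[key] += 1
    return counts


# Shortened, hypothetical sample line; the addresses are made up.
sample = (
    "Operation.Category: 'LbAccessLog', Operation.Type: 'Http', "
    "Vs.Name: 'Test-VIP-HTTPS-443', PoolMember.Ip: '10.0.0.3', "
    "PoolMember.Port: '443', Snat.Ip: '10.0.0.1', Snat.Port: '63069', "
    "HttpResponse.Status: '502', HttpResponse.TotalTime: '31.628'"
)
print(count_502_by_snat_port([sample]))
```

The SNAT ports reported for the 502 responses can then be compared with the ports seen in the ICMP Port Unreachable messages and in the CLOSING-CLOSED connection table entries.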

Environment

  • VMware NSX

Cause

To understand the cause, first consider the following:

  1. When a backend server-side connection flows via an uplink or service interface on the T1 GW associated with the NSX-T LB, that connection also passes through the GW firewall module associated with that uplink/service interface.

  2. With a single backend pool-member IP and the "Auto-map" configuration, which uses the single IP of the respective T1 GW interface, the only parameter that differs between server-side connections is the source port. The parameters of every server-side connection towards a pool member are as follows:

    • Source IP = Auto-map IP (constant)

    • Destination IP = Pool-member IP (constant)

    • Destination Port = Port mapped with service on pool-member (constant)

    • Source Port = Randomly selected SNAT port (variable)

  3. So three of the four parameters are constant across all server-side connections from this virtual server towards a pool member.

  4. Hence, under high traffic volume on this virtual server, the LB frequently has to reuse the source port of an older server-side connection (which has been removed) for a new server-side connection.
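
The effect of the constant parameters can be put into rough numbers. The sketch below is only back-of-the-envelope arithmetic: the port count and connection rate are hypothetical, and it assumes source ports are drawn evenly across the range, which simplifies whatever selection the LB really performs.

```python
def distinct_tuples(snat_ips, ports_per_ip, pool_members=1):
    """Number of distinct server-side 4-tuples towards one service port."""
    return snat_ips * ports_per_ip * pool_members


def mean_reuse_interval(new_conns_per_sec, snat_ips, ports_per_ip):
    """Average seconds before the same 4-tuple is used again.

    Assumes an even spread of connections over all available tuples.
    """
    return distinct_tuples(snat_ips, ports_per_ip) / new_conns_per_sec


# Hypothetical numbers: 60,000 usable source ports, 2,000 new server-side
# connections per second, one Auto-map IP, one pool member.
print(mean_reuse_interval(2000, snat_ips=1, ports_per_ip=60000))  # 30.0
```

With these assumed figures a given source port comes back into use about every 30 seconds, so a FIN delayed by a comparable amount has a realistic chance of colliding with a new connection on the same port.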

Now consider the following scenario:

  1. At Time-1, for a client request, the NSX-T LB uses a source port (ex: 4000) for its server-side connection for the respective virtual server (10.202.2.1.4000 > 10.202.2.3.443). After that request completes, the NSX-T edge removes the connection from the Gateway Firewall (GW FW) connection table. The port is now free to be used for another server-side connection towards the same pool member.

  2. At Time-2, for another client request, the NSX-T LB decides to use the same source port (ex: 4000) towards that pool member and sends a SYN packet towards it (10.202.2.1.4000 > 10.202.2.3.443: Flags [S]).

  3. At Time-3, before the pool member has responded to this new SYN packet, a delayed/retransmitted TCP FIN packet for the older Time-1 connection is sent by the pool member towards the LB. This can happen because of heavy congestion in the backend network, a slow-responding pool member, and similar conditions. The T1 GW receives this delayed FIN via its uplink/service interface, which has the GW FW module enabled, and creates a connection table entry in the GW FW module in CLOSING-CLOSED state (because it is a FIN packet).

    Connection table entry at Time-3:

      0x0000004278001207: 10.###.###.3:443 -> 10.###.###.1:4000  dir in protocol tcp state CLOSING:CLOSED fn 1002:0

  4. At Time-4, the pool member sends the SYN-ACK for the new connection whose SYN the NSX-T LB sent at Time-2. When this SYN-ACK arrives on the T1 GW uplink/service interface, it is inspected by the GW FW module associated with that interface. Because it matches the connection table entry created at Time-3, whose state is CLOSING-CLOSED, the FW module drops the packet and sends an ICMP Type 3, Code 3 response to the pool member, informing it that this source port is unreachable. The NSX-T LB is unaware that the SYN-ACK was dropped by the FW module; it still expects the SYN-ACK in order to complete the TCP handshake for the new connection, and keeps retransmitting the SYN packet.

    Packet capture at Time-4:

      09:08:44.122985 IP 10.###.###.3.443 > 10.###.###.1.4000: Flags [S.], seq 4187293146, ack 2921455553, win 65160, options [mss 1460,sackOK,TS val 1416696892 ecr 3495476677,nop,wscale 7], length 0
      09:08:44.124528 IP 10.###.###.1 > 10.###.###.3: ICMP 10.###.###.1 tcp port 4000 unreachable, length 36
      09:08:44.129938    10.###.###.1 > 10.###.###.3 TCP 78 [TCP Retransmission] 4000 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM TSval=2088599461 TSecr=0 WS=256

  5. The pool member, unaware of this situation, retransmits the SYN-ACK packet several times. Each retransmission is dropped by the FW module, and the pool member finally terminates the connection with the LB on this port.

    Continuation of Time-4:

      09:08:44.140640    10.###.###.3 > 10.###.###.1 TCP 74 [TCP Retransmission] 443 → 4000 [SYN, ACK] Seq=0 Ack=1 Win=64308 Len=0 MSS=1410 SACK_PERM TSval=2122155525 TSecr=2088598450 WS=128
      09:08:44.140646    10.###.###.1 > 10.###.###.3 ICMP 70 Destination unreachable (Port unreachable)
      09:08:44.146635    10.###.###.3 > 10.###.###.1 TCP 74 [TCP Retransmission] 443 → 4000 [SYN, ACK] Seq=0 Ack=1 Win=64308 Len=0 MSS=1410 SACK_PERM TSval=2122156531 TSecr=2088598450 WS=128


  6. After multiple SYN-ACK retransmissions, the pool member finally terminates the new connection with a RESET. When this RESET reaches the NSX-T LB module, the LB reports to the client, with an HTTP 502, that the backend server/pool member "prematurely closed" the connection (when the virtual server hosts HTTP/HTTPS traffic).

As explained above, from the point of view of each individual component (GW FW and NSX-T LB module), both performed their expected operations:

  • The GW FW module created a connection table entry upon receiving a delayed FIN packet for the older connection, and matched the newer connection's SYN-ACK packets against this entry because SRC-IP + SRC-PORT + DST-IP + DST-PORT are identical between the old and new connections.

  • The NSX-T LB module never received the SYN-ACK packet for the newer connection. Upon receiving the final RESET from the pool member, it informed the client that the connection was "prematurely closed" by the pool member.
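
The sequence from Time-3 to Time-4 can be reproduced with a toy model. The sketch below is not the NSX Gateway Firewall implementation; it is a hypothetical, heavily simplified connection table (the class name, return strings, and addresses are made up) that only encodes the two behaviors described above: a late FIN creates a CLOSING:CLOSED entry, and any inbound packet matching such an entry is rejected.

```python
from dataclasses import dataclass, field

CLOSING = "CLOSING:CLOSED"


@dataclass
class ToyFirewall:
    """Toy inbound connection table keyed by (src, sport, dst, dport)."""
    table: dict = field(default_factory=dict)

    def inbound(self, src, sport, dst, dport, flags):
        key = (src, sport, dst, dport)
        if self.table.get(key) == CLOSING:
            # A packet matching a closing entry is rejected, which is
            # how the SYN-ACK at Time-4 is treated in the scenario.
            return "drop, ICMP port unreachable"
        if "FIN" in flags:
            # A late FIN with no live entry creates a closing entry (Time-3).
            self.table[key] = CLOSING
            return "entry created"
        return "pass"

    def expire(self, src, sport, dst, dport):
        """Remove an entry, as a timer expiry would."""
        self.table.pop((src, sport, dst, dport), None)


member, lb = "10.0.0.3", "10.0.0.1"  # hypothetical addresses
fw = ToyFirewall()

# Time-3: delayed FIN of the old connection that used source port 4000.
t3 = fw.inbound(member, 443, lb, 4000, {"FIN", "ACK"})
# Time-4: SYN-ACK of the new connection reusing source port 4000.
t4 = fw.inbound(member, 443, lb, 4000, {"SYN", "ACK"})
# A SYN-ACK towards a different source port is unaffected.
other = fw.inbound(member, 443, lb, 4001, {"SYN", "ACK"})
# Once the closing entry expires, port 4000 can be used again.
fw.expire(member, 443, lb, 4000)
after = fw.inbound(member, 443, lb, 4000, {"SYN", "ACK"})

print(t3, t4, other, after, sep="\n")
```

In this model the collision only exists while the closing entry is alive and only for the exact 4-tuple, which is why a larger tuple space or a shorter entry lifetime both reduce the chance of hitting it.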

Resolution

The resolution is to fine-tune the NSX-T LB/virtual-server configuration, the Gateway Firewall connection timers, or both, to reduce the probability of the situation described above. If the issue is resolved by tuning the LB/virtual-server configuration alone, that is preferred over changing the Gateway Firewall timers.

Note: Modifying the GW FW timers can affect all traffic flowing through the respective T1 GW's firewall module, because these timers are applied at the GW FW level.

NSX-T LB/VIP fine-tuning:

  1. To increase the number of distinct server-side connection tuples, use an SNAT pool containing multiple IPs instead of "Auto-map".
  2. Use TCP multiplexing so that the NSX-T LB can reuse existing server-side connections for new requests.
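
As an illustration of the two LB settings, the sketch below builds a JSON body for a pool update. It is an assumption-laden example: the field names (snat_translation, LBSnatIpPool, ip_addresses, tcp_multiplexing_enabled, tcp_multiplexing_number) are taken from my reading of the NSX Policy API LBPool schema and should be verified against the API reference for your release, as should whether both settings can be combined on one pool. The helper name and IP addresses are hypothetical, and nothing is sent to an NSX Manager.

```python
import json


def build_pool_patch(snat_ips, multiplexing_connections=6):
    """Build a pool-update body with an SNAT IP pool and TCP multiplexing.

    Field names are assumed from the NSX Policy API LBPool schema and
    must be checked against the API reference before use.
    """
    if len(snat_ips) < 2:
        raise ValueError("use more than one SNAT IP to widen the tuple space")
    return {
        "snat_translation": {
            "type": "LBSnatIpPool",
            "ip_addresses": [{"ip_address": ip} for ip in snat_ips],
        },
        "tcp_multiplexing_enabled": True,
        "tcp_multiplexing_number": multiplexing_connections,
    }


# Hypothetical SNAT addresses from a documentation range.
body = build_pool_patch(["192.0.2.10", "192.0.2.11"])
print(json.dumps(body, indent=2))
```

The same changes can be made from the NSX UI on the server pool; the payload form is shown only to make the two settings concrete.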

NSX-T Gateway Firewall Module timers fine-tuning:

  1. Configure the "Security Profiles" associated with the Tier-1 or Tier-0 GW firewall with a shorter "First Packet" timer, based on the needs and observations of the environment. Refer to: Default Session Timer Values and Create a Session Timer

Additional Information

Impact/Risks:
Intermittent packet drops and client connection issues on the virtual server hosted on the NSX-T LB.