NSX-T Load Balanced Pool Member Becomes Inaccessible Randomly

Article ID: 316673

Updated On:

Products

VMware NSX

Issue/Introduction

This article identifies one cause of intermittent unavailability of NSX-T load balanced pool members.

Symptoms:

- Intermittently, access to load balanced pool members is unavailable.
- The load balancer generally services a high rate of traffic.
- In /var/log/lb/<Loadbalancer-UUID>/logs/error.log on the Edge Node hosting the active instance of the T1 Service Router to which the affected load balancer is attached, you observe the following errors:

2020/12/16 20:15:04 [debug] 9744#0: SNAT: alloc SNAT from Pool: nat_1681932289_1
2020/12/16 20:15:04 [debug] 9744#0: SNAT: alloc poot from IP: nat4_100.64.64.1, next port: 65535.
2020/12/16 20:15:04 [error] 9744#0: SNAT: alloc SNAT from Pool: nat_1681932289_1 failed
2020/12/16 20:15:04 [error] 9744#0: l4lb failed to get snat resource

*** NOTE: The exact date, time, IP addresses, and NAT pool values are examples.

- The next port to be allocated is 65535, which is the highest TCP port number. 
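To check how often this symptom occurs, the error.log can be scanned for the failure line shown above. The following is a minimal sketch, assuming the log format matches the example in this article; the function name `count_snat_failures` and the inline sample lines are illustrative and not part of any NSX-T tooling.

```python
import re
from collections import Counter

# Matches the [error] line quoted in this article and captures the SNAT pool name.
SNAT_FAIL = re.compile(r"\[error\].*SNAT: alloc SNAT from Pool: (\S+) failed")


def count_snat_failures(lines):
    """Return a Counter mapping SNAT pool name to number of allocation failures."""
    failures = Counter()
    for line in lines:
        match = SNAT_FAIL.search(line)
        if match:
            failures[match.group(1)] += 1
    return failures


# Sample lines copied from the example log in this article.
sample_log = [
    "2020/12/16 20:15:04 [debug] 9744#0: SNAT: alloc SNAT from Pool: nat_1681932289_1",
    "2020/12/16 20:15:04 [error] 9744#0: SNAT: alloc SNAT from Pool: nat_1681932289_1 failed",
    "2020/12/16 20:15:04 [error] 9744#0: l4lb failed to get snat resource",
]

for pool, count in count_snat_failures(sample_log).most_common():
    print(f"{pool}: {count} SNAT allocation failure(s)")
```

To run it against a real log, pass an open file handle for error.log in place of `sample_log`; the function accepts any iterable of lines.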


Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center
VMware NSX-T Data Center 2.5.x

Cause

This error message indicates that all available SNAT source ports are in use by existing TCP sessions. Subsequent connection attempts time out until a TCP source port becomes available again.

This symptom is more frequent in NSX-T versions prior to 2.5.3, 3.0.3, 3.1.3, and 3.2.0.

The algorithm that allocates NSX-T TCP source ports has been optimized in the versions listed above and in all subsequent releases, making this issue less likely to occur. Note that even with the optimized algorithm, high traffic volume can still lead to TCP source port saturation.
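The saturation arithmetic can be illustrated with a rough capacity estimate. This is only a sketch under stated assumptions: it assumes each SNAT IP address offers source ports 1024 through 65535 and that every concurrent session consumes one port. The actual port range and allocation behavior of NSX-T are not stated in this article, so `ASSUMED_PORTS_PER_IP`, `headroom`, and the function name are hypothetical.

```python
import math

# Assumption: usable source ports per SNAT IP are 1024..65535 inclusive.
ASSUMED_PORTS_PER_IP = 65535 - 1024 + 1  # 64512


def snat_ips_needed(peak_concurrent_sessions,
                    ports_per_ip=ASSUMED_PORTS_PER_IP,
                    headroom=0.25):
    """Estimate how many SNAT IPs keep peak usage below the port ceiling.

    headroom is the fraction of extra capacity reserved above the observed peak.
    """
    if peak_concurrent_sessions < 0 or ports_per_ip <= 0 or headroom < 0:
        raise ValueError("inputs must be non-negative and ports_per_ip positive")
    target = peak_concurrent_sessions * (1 + headroom)
    # At least one SNAT IP is always required.
    return max(1, math.ceil(target / ports_per_ip))


print(snat_ips_needed(50_000))   # fits on a single SNAT IP under these assumptions
print(snat_ips_needed(200_000))  # needs several SNAT IPs under these assumptions
```

Under this simplified model, a single SNAT IP address saturates at roughly 64,000 concurrent sessions, which is why a busy load balancer can exhaust its source ports.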

Resolution

The behavior above results from the volume of traffic traversing the load balancer relative to the number of TCP source ports available for use. By default, NSX-T Server Pools use "SNAT Automap Mode", which provides a single IP address for all SNAT connections to backend LB pool members.

If this issue is encountered, the recommended resolution is to increase the number of IP addresses used for those SNAT connections, which in turn increases the number of TCP source ports available for SNAT. To do this, change the Server Pool SNAT method to "SNAT IP Pool" and provide a range of multiple IP addresses to SNAT from. Any IP addresses may be used in the SNAT IP Pool, provided that they are not duplicates and that no firewall prevents them from reaching the backend servers in the pool over the specified port.
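For environments managed through the API rather than the UI, the change to "SNAT IP Pool" might look like the request body built below. This is a hedged sketch, not a verified procedure: the `snat_translation` field, the `LBSnatIpPool` type, and the `PATCH /policy/api/v1/infra/lb-pools/<pool-id>` endpoint are assumptions based on the NSX-T Policy API and should be confirmed against the API reference for your version. The function name and the 192.0.2.x documentation addresses are illustrative only.

```python
import ipaddress
import json


def build_snat_ip_pool_patch(ip_addresses):
    """Build a request body that switches a server pool to SNAT IP Pool mode.

    Rejects invalid and duplicate addresses, since the SNAT IP Pool must not
    contain duplicates.
    """
    # ipaddress.ip_address raises ValueError for malformed input.
    parsed = [ipaddress.ip_address(ip) for ip in ip_addresses]
    if not parsed:
        raise ValueError("at least one SNAT IP address is required")
    if len(set(parsed)) != len(parsed):
        raise ValueError("SNAT IP addresses must not contain duplicates")
    return {
        # Assumed Policy API field and type names; verify before use.
        "snat_translation": {
            "type": "LBSnatIpPool",
            "ip_addresses": [{"ip_address": str(ip)} for ip in parsed],
        }
    }


body = build_snat_ip_pool_patch(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
print(json.dumps(body, indent=2))
```

The sketch only builds and prints the JSON; it deliberately makes no network call. Firewall reachability from each SNAT IP address to the backend servers still has to be verified separately.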

Additional information on SNAT allocation modes can be found in the NSX-T Admin Guide: https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/administration/GUID-626ED203-8D24-4053-BA74-1912C774F9AC.html