Pool members on a different subnet to the virtual server IP never come UP on a newly deployed NSX native load balancer
Article ID: 420321
Products
VMware NSX
Issue/Introduction
The virtual server is in a DOWN state because its pool members are not in an UP state.
The load balancer always shows as degraded.
The virtual server IP address is on a different subnet from the pool members.
This is a newly built environment and the pool members have never been up before.
There is an HTTP, HTTPS or TCP active health monitor on the virtual server.
In /var/log/syslog on the active LB Edge, log entries similar to the following are reported:
NSX 3748888 LB [nsx@6876 comp="nsx-edge" subcomp="nsx-edge-lb.lb" level="ERROR"] "ngx stats pool uuid is not found in config: {"uuid": "<UUID>", "status": "down", "members": [{"id": "<POOL MEMBER IP ADDRESS:PORT>", "status": "down", "last_change_time": "1763939127218", "monitors": [{"uuid": "<MONITOR UUID>", "status": "down", "last_check_time": "1764091498751", "last_change_time": "1763939127218", "failure_code": "24361", "failure_reason": "TCP Handshake Timeout"}]}, {"id": "POOL MEMBER IP ADDRESS:PORT", "status": "down", "last_change_time": "1763939127218", "monitors": [{"uuid": "<MONITOR UUID>", "status": "down", "last_check_time": "1764091498751", "last_change_time": "1763936776174", "failure_code": "24361", "failure_reason": "TCP Handshake Timeout"}]}]}"
The virtual server is configured with the default SNAT translation mode of 'Automap Mode'.
Packet captures gathered on the Edge uplink vNIC show that TCP packets to the pool members egress with a source IP address in the 100.64.0.X/31 subnet. This is the expected behaviour with 'Automap Mode'. An example packet capture command would be the following:
pktcap-uw --switchport <PORT ID> --capture VnicRx,VnicTx --rcf "geneve and host <IP ADDRESS OF POOL MEMBER> and port <TCP PORT configured on the health monitor>" -o - | tcpdump-uw -ner -
The same packet capture on the pool member vNIC shows that the packets are not being delivered to it.
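When many such entries are logged, it can help to pull the per-member failure reason out of the JSON payload embedded in the log line. The following is a minimal Python sketch, assuming the payload always follows the text "ngx stats pool uuid is not found in config: " and is valid JSON, as in the example above. The function name and the member address 192.0.2.10:80 are hypothetical and used only for illustration.

```python
import json

# Text that precedes the JSON payload in the example log entry.
MARKER = "ngx stats pool uuid is not found in config: "


def summarize_pool_status(log_line):
    """Return (pool_status, [(member_id, member_status, [failure reasons])])."""
    start = log_line.index(MARKER) + len(MARKER)
    # The JSON object runs from just after the marker to the last "}" on the line.
    payload = log_line[start:log_line.rindex("}") + 1]
    pool = json.loads(payload)
    members = []
    for member in pool.get("members", []):
        reasons = [m.get("failure_reason", "") for m in member.get("monitors", [])]
        members.append((member["id"], member["status"], reasons))
    return pool["status"], members


# Sample line modelled on the log entry above; the member address is a
# hypothetical documentation-range value, not taken from a real system.
sample = (
    'NSX 3748888 LB [nsx@6876 comp="nsx-edge" subcomp="nsx-edge-lb.lb" level="ERROR"] '
    '"ngx stats pool uuid is not found in config: '
    '{"uuid": "<UUID>", "status": "down", "members": [{"id": "192.0.2.10:80", '
    '"status": "down", "last_change_time": "1763939127218", "monitors": '
    '[{"uuid": "<MONITOR UUID>", "status": "down", "last_check_time": "1764091498751", '
    '"last_change_time": "1763939127218", "failure_code": "24361", '
    '"failure_reason": "TCP Handshake Timeout"}]}]}"'
)

status, members = summarize_pool_status(sample)
print(status)
for member_id, member_status, reasons in members:
    print(member_id, member_status, ", ".join(reasons))
```

A failure reason of "TCP Handshake Timeout" on every member is consistent with the health check packets never reaching the pool members.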
Environment
VMware NSX
Cause
Because the virtual server IP is on a different subnet from the destination pool members, the packets will be transmitted from the Edge with a NATted source IP, which will be the VS IP. As a result, the packets will be forwarded to a gateway or router to be routed.
If the required routes are not configured on that router, the packets will be dropped.
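The routing requirement can be reasoned about offline before contacting the router administrator. The sketch below, using Python's standard ipaddress module, checks whether a list of routes covers both the pool member and the SNAT source address. All addresses, prefix lengths and the route list are hypothetical values chosen for illustration, and the helper names are not part of any NSX tooling.

```python
import ipaddress


def same_subnet(ip_a, ip_b, prefix_len):
    """True if ip_b falls inside the subnet that ip_a belongs to."""
    net = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip_b) in net


def has_route(routes, ip):
    """True if any route prefix in the list contains the address."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(route) for route in routes)


def missing_routes(routes, snat_source_ip, pool_member_ip):
    """List the directions for which the router has no matching route."""
    missing = []
    if not has_route(routes, pool_member_ip):
        missing.append(f"forward route to pool member {pool_member_ip}")
    if not has_route(routes, snat_source_ip):
        missing.append(f"return route to SNAT source {snat_source_ip}")
    return missing


# Hypothetical example: VS IP 10.10.10.10/24, pool member 172.16.20.15,
# Automap source 100.64.0.1, and a router that only knows the pool subnet.
routed = not same_subnet("10.10.10.10", "172.16.20.15", 24)
gaps = missing_routes(["172.16.20.0/24"], "100.64.0.1", "172.16.20.15")
print(routed)  # True: traffic between the two subnets must be routed
print(gaps)    # ['return route to SNAT source 100.64.0.1']
```

In this hypothetical case the health check could reach the pool member, but the reply would have no path back to the SNAT source, so the TCP handshake would still time out.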
Resolution
Confirm with the administrator of the router that routes are configured to the destination pool member subnet, and that return routes exist to the subnet being used for the source NAT.
If a route has been configured that expects the packets to be source NATted from the virtual server IP, then reconfigure the server pool to use the SNAT Translation Mode 'IP Pool'.
As an example, if your virtual server IP is configured as 10.10.10.10, then configure your pool the following way:
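For readers who manage pools through the API rather than the UI, the same setting might be expressed as a request body along the following lines. This is a sketch only: it assumes the SNAT IP pool should contain the virtual server IP itself, and the field names (`snat_translation`, `LBSnatIpPool`, `ip_addresses`) are taken from memory of the NSX Policy API's LBPool schema. Verify them against the API reference for your NSX version before use. The helper name is hypothetical.

```python
import json


def snat_ip_pool_body(snat_ip):
    """Build a partial LBPool body selecting SNAT mode 'IP Pool'.

    Field names are assumptions based on the NSX Policy API LBPool schema;
    confirm them in the API reference for the deployed NSX version.
    """
    return {
        "snat_translation": {
            "type": "LBSnatIpPool",
            "ip_addresses": [{"ip_address": snat_ip}],
        }
    }


# 10.10.10.10 is the example virtual server IP used in this article.
body = snat_ip_pool_body("10.10.10.10")
print(json.dumps(body, indent=2))
```

The sketch only prints the body. How it is submitted (for example, as a partial update to the existing pool object) depends on the environment and is deliberately left out.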
NSX Native Load Balancer Administration Guide. Please note the following regarding SNAT translation mode 'Automap' using a service router uplink or service interface as the SNAT source: