This issue affects NSX-T versions earlier than 2.5.2, and the symptoms can take different forms. For example:
- All TCP/UDP traffic is dropped at Tier-0/Tier-1
or
- The NSX-T load balancer stops processing all traffic
or
- The HA state on the Edge shows as Unknown
ICMP traffic continues to work.
The issue is seen only after a traffic burst.
Upgrade NSX-T to version 2.5.2 or later.
NSX-T 3.2.0 and later include some additional enhancements.
Workaround:
- On a live setup, "TCP Half Opened" is at the maximum value of 4294967295 (2^32 - 1) for some or all of the interfaces. To check, list the interfaces and then query the session count for each interface UUID:
root@edge-x:~# su admin -c "get firewall interfaces" | grep Interface
Wed Feb 23 xxxx UTC xx:xx:xx:xx
Interface : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx
root@edge-1:~# edge-appctl -t /var/run/vmware/edge/dpd.ctl fw/get_sessioncount xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx
{"TCP Half Opened":4294967295,"UDP Active":0,"ICMP Active":0,"Other Active":0,"TCP Half Opened MAX":1000000,"UDP Active MAX":100000,"ICMP Active MAX":10000,"Other Active MAX":10000}
NOTE: In newer versions, the command "edge-appctl -t /var/run/vmware/edge/dpd.ctl fw/get_sessioncount xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx" may not work. Use the command "edge-appctl -t /var/run/vmware/edge/dpd.ctl fw/show fw-lr-connections | python -mjson.tool" instead.
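The check above can be scripted when many interfaces need to be inspected. The following is a minimal sketch, not a VMware-provided tool: it assumes the JSON has the same shape as the fw/get_sessioncount output shown above (each counter paired with a "<name> MAX" key), and the function name find_overflowed_counters is hypothetical. The output format of fw/show fw-lr-connections may differ and is not handled here.

```python
import json

# 2**32 - 1, the value reported for "TCP Half Opened" in the affected state.
UINT32_MAX = 4294967295


def find_overflowed_counters(raw_json):
    """Return the counter names whose value is 4294967295 or exceeds "<name> MAX".

    Assumes the key layout of the fw/get_sessioncount output shown above.
    """
    counts = json.loads(raw_json)
    suspicious = []
    for name, value in counts.items():
        if name.endswith(" MAX"):
            continue  # skip the limit entries themselves
        limit = counts.get(name + " MAX")
        if value == UINT32_MAX or (limit is not None and value > limit):
            suspicious.append(name)
    return suspicious


# Sample taken from the command output shown above.
sample = (
    '{"TCP Half Opened":4294967295,"UDP Active":0,"ICMP Active":0,'
    '"Other Active":0,"TCP Half Opened MAX":1000000,"UDP Active MAX":100000,'
    '"ICMP Active MAX":10000,"Other Active MAX":10000}'
)
print(find_overflowed_counters(sample))  # ['TCP Half Opened']
```

An empty list means that none of the counters on that interface are at the overflow value or above their configured maximum.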
To clear the "TCP Half Opened Active/Max" entries, do one of the following:
- Reboot the Edge
or
- Restart the Edge dataplane service:
restart service dataplane