Packets are intermittently dropped after a connection is established to a virtual IP (VIP) on a load balancer.
VMware NSX-T Data Center 3.x
VMware NSX
This issue occurs because the connection between the SNAT IP address and the back-end server IP address unexpectedly matches the any-any drop rule when firewall rules are changed, for example when a rule is added or modified.
The following example shows what can happen and what you can see on an edge node.
> get firewall <interface uuid> ruleset rules
Firewall rule count: 4
Rule ID : 1024
Rule : inout protocol any stateless from ip <client ip address> to ip <unused ip address> interface uuid <interface uuid> drop
Rule ID : 1022
Rule : inout protocol any from any to ip <virtual server ip address> interface uuid <interface uuid> accept
Rule ID : 1021
Rule : inout protocol any stateless from any to any interface uuid <interface uuid> drop
Rule ID : 1020
Rule : inout protocol any from any to any accept
Before the issue occurs, the firewall rule ID for the connection between the SNAT IP address and the back-end server IP address is 0.
> get firewall <interface uuid> connection
<connection id1>: <client ip address>:64446 -> <virtual server ip address>:80 dir in protocol tcp state ESTABLISHED:ESTABLISHED fn 1022:0
<connection id2>: <snat ip address>:4102 -> <back-end server ip address>:80 (<snat ip address>:4102) dir out protocol tcp state ESTABLISHED:ESTABLISHED fn 0:3
When you add a firewall rule, such as the rule with ID 1024, the firewall rule ID for the same connection can unexpectedly change to 1021. Rule ID 1021 is the any-any drop rule, so the connection can lose network connectivity.
> get firewall <interface uuid> connection
<connection id1>: <client ip address>:64446 -> <virtual server ip address>:80 dir in protocol tcp state ESTABLISHED:ESTABLISHED fn 1022:0
<connection id2>: <snat ip address>:4102 !-> <back-end server ip address>:80 (<snat ip address>:4102) dir out protocol tcp state ESTABLISHED:ESTABLISHED fn 1021:3
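The check described above (compare the rule ID in each connection's fn field against the rules whose action is drop) can be sketched in Python. This is a minimal sketch, not a VMware tool: the function names drop_rule_ids and connections_on_drop_rules are hypothetical, and the parsing assumes the CLI output is formatted exactly like the samples in this article, saved as text.

```python
import re

# Assumption: each rule is printed as a "Rule ID : N" line followed by a "Rule : ..." line.
RULE_RE = re.compile(r"Rule ID\s*:\s*(\d+)\s*\n\s*Rule\s*:\s*(.+)")
# Assumption: each connection line ends with "fn <rule id>:<number>".
FN_RE = re.compile(r"\bfn\s+(\d+):(\d+)\s*$")

# Sample text copied from the outputs shown in this article.
RULES_OUTPUT = """
Firewall rule count: 4
Rule ID : 1024
Rule : inout protocol any stateless from ip <client ip address> to ip <unused ip address> interface uuid <interface uuid> drop
Rule ID : 1022
Rule : inout protocol any from any to ip <virtual server ip address> interface uuid <interface uuid> accept
Rule ID : 1021
Rule : inout protocol any stateless from any to any interface uuid <interface uuid> drop
Rule ID : 1020
Rule : inout protocol any from any to any accept
"""

CONNECTIONS_AFTER = """
<connection id1>: <client ip address>:64446 -> <virtual server ip address>:80 dir in protocol tcp state ESTABLISHED:ESTABLISHED fn 1022:0
<connection id2>: <snat ip address>:4102 !-> <back-end server ip address>:80 (<snat ip address>:4102) dir out protocol tcp state ESTABLISHED:ESTABLISHED fn 1021:3
"""


def drop_rule_ids(rules_output):
    """Return the set of rule IDs whose rule text ends with the action 'drop'."""
    return {
        int(rule_id)
        for rule_id, text in RULE_RE.findall(rules_output)
        if text.split()[-1] == "drop"
    }


def connections_on_drop_rules(connection_output, drop_ids):
    """Return (rule_id, line) for each connection whose fn rule ID is a drop rule."""
    hits = []
    for line in connection_output.splitlines():
        line = line.strip()
        match = FN_RE.search(line)
        if match and int(match.group(1)) in drop_ids:
            hits.append((int(match.group(1)), line))
    return hits


if __name__ == "__main__":
    for rule_id, line in connections_on_drop_rules(
        CONNECTIONS_AFTER, drop_rule_ids(RULES_OUTPUT)
    ):
        print(f"rule {rule_id}: {line}")
```

With the sample text above, only the connection from the SNAT IP address to the back-end server is reported, against rule ID 1021. A connection with rule ID 0, as in the output captured before the issue, is not reported.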
This issue is resolved in VMware NSX 3.2.4 and VMware NSX 4.2.0.
Workaround:
Add an allow firewall rule above the any-any drop rule, with the SNAT IP address as the source and the back-end server IP address as the destination:
Source : <snat ip address>
Destination : <back-end server ip address>
Services : <service port>
Applied to : Service Interface
Action : Allow
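Why the position of the allow rule matters can be illustrated with a small first-match model in Python. This is a simplified sketch, not the edge datapath: it assumes rules are evaluated top to bottom with the first match winning, it ignores services and ports, and the addresses and rule ID 2000 are hypothetical values chosen for illustration.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    rule_id: int
    source: str       # an IP address, a CIDR prefix, or "any"
    destination: str  # an IP address, a CIDR prefix, or "any"
    action: str       # "accept" or "drop"


def _matches(spec: str, address: str) -> bool:
    """True if the address falls inside spec, or spec is 'any'."""
    return spec == "any" or ip_address(address) in ip_network(spec)


def first_match(rules: Sequence[Rule], src: str, dst: str) -> Optional[Rule]:
    """Return the first rule, in list order, that matches the source and destination."""
    for rule in rules:
        if _matches(rule.source, src) and _matches(rule.destination, dst):
            return rule
    return None


# Hypothetical addresses from the documentation ranges.
SNAT_IP = "192.0.2.10"
BACKEND_IP = "198.51.100.20"

ANY_ANY_DROP = Rule(1021, "any", "any", "drop")
# Hypothetical workaround rule; 2000 is an illustrative rule ID.
WORKAROUND = Rule(2000, SNAT_IP, BACKEND_IP, "accept")

without_workaround = [ANY_ANY_DROP]
with_workaround = [WORKAROUND, ANY_ANY_DROP]

if __name__ == "__main__":
    print(first_match(without_workaround, SNAT_IP, BACKEND_IP))
    print(first_match(with_workaround, SNAT_IP, BACKEND_IP))
```

In this model, traffic from the SNAT IP address to the back-end server hits the any-any drop rule unless the allow rule is listed above it. If the allow rule is placed below the drop rule, it is never reached.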