You notice, especially during peak traffic, that some DNS queries take a long time to resolve. In some cases, they time out completely. Connections are eventually established, but they take longer than expected.
DFW packet logs reveal the following drops, each with a distinct set of TCP flags at the end of the line (R=Reset, F=Fin, P=Push, A=Ack):
[ESXi:~] cat /var/run/log/dfwpktlogs.* | grep DROP
2025-12-23T14:32:15.810Z No(13) FIREWALL-PKTLOG[49638402]: 73e2a374 INET tcp strict DROP 1001 IN 52 TCP x.x.x.x/63421->y.y.y.y/53 RA
2025-12-23T14:32:16.398Z No(13) FIREWALL-PKTLOG[49638402]: 73e2a374 INET tcp strict DROP 1001 IN 90 TCP x.x.x.x/65162->y.y.y.y/53 FPA
2025-12-23T14:32:17.068Z No(13) FIREWALL-PKTLOG[49638402]: 73e2a374 INET tcp strict DROP 1001 IN 52 TCP x.x.x.x/61303->y.y.y.y/53 FA
No dropped SYN packets are observed in these logs.
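To check a larger log for the same pattern, the DROP lines can be tallied by TCP flags. The following is a minimal Python sketch, assuming the line layout shown in the sample above. The function names are hypothetical, and the letter S for SYN is an assumption, since the legend above lists only R, F, P and A.

```python
import re
from collections import Counter

# R, F, P, A come from the legend above; S (SYN) is an assumption.
FLAG_NAMES = {"S": "SYN", "R": "RST", "F": "FIN", "P": "PSH", "A": "ACK"}

# Matches the tail of a TCP DROP line, for example:
#   "... DROP 1001 IN 52 TCP x.x.x.x/63421->y.y.y.y/53 RA"
DROP_RE = re.compile(
    r"\bDROP\s+(?P<rule>\d+)\s+(?P<dir>IN|OUT)\s+(?P<len>\d+)\s+TCP\s+"
    r"(?P<src>[^\s/]+)/(?P<sport>\d+)->(?P<dst>[^\s/]+)/(?P<dport>\d+)\s+"
    r"(?P<flags>[A-Z]+)\s*$"
)


def tally_drop_flags(lines):
    """Count dropped TCP packets per flag combination (e.g. 'RA', 'FPA')."""
    counts = Counter()
    for line in lines:
        match = DROP_RE.search(line)
        if match:
            counts[match.group("flags")] += 1
    return counts


def decode_flags(flags):
    """Expand a flag string such as 'FPA' into readable names."""
    return [FLAG_NAMES.get(letter, letter) for letter in flags]


def has_handshake_drops(counts):
    """True if any dropped packet carried the (assumed) SYN letter 'S'."""
    return any("S" in flags for flags in counts)


SAMPLE = [
    "2025-12-23T14:32:15.810Z No(13) FIREWALL-PKTLOG[49638402]: 73e2a374 INET tcp strict DROP 1001 IN 52 TCP x.x.x.x/63421->y.y.y.y/53 RA",
    "2025-12-23T14:32:16.398Z No(13) FIREWALL-PKTLOG[49638402]: 73e2a374 INET tcp strict DROP 1001 IN 90 TCP x.x.x.x/65162->y.y.y.y/53 FPA",
    "2025-12-23T14:32:17.068Z No(13) FIREWALL-PKTLOG[49638402]: 73e2a374 INET tcp strict DROP 1001 IN 52 TCP x.x.x.x/61303->y.y.y.y/53 FA",
]

counts = tally_drop_flags(SAMPLE)
for flags, n in sorted(counts.items()):
    print(flags, decode_flags(flags), n)
print("handshake drops seen:", has_handshake_drops(counts))
```

On the three sample lines, each flag combination (RA, FPA, FA) is counted once and no handshake drops are reported, which matches the observation that no SYN packets are dropped.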
NSX 4.x
vDefend Firewall
DNS over TCP
DFW DNS rules configured with or without L7 Context Profiles
TCP Strict enabled on those DNS policies
There are two policy settings available for DFW policies: TCP State and TCP Strict
TCP State will:
1. Track connections, including TCP's three-way handshake (SYN, SYN-ACK, ACK) and subsequent data transfer.
2. Implicitly allow return traffic through a given rule. This simplifies rule management.
3. Enforce session timers. These timers define how long an idle session stays tracked before it is removed; they maintain a cache for traffic flows.
Since none of the dropped packets carry the flags of the 3-way handshake itself, we can say that this part of the policy configuration is not the issue.
TCP Strict is an add-on to TCP State and applies only to stateful TCP policies. It will:
1. Enforce the TCP three-way handshake.
2. Prevent mid-session pick-up.
Mid-session pick-up in the policy refers to the DFW behavior of accepting network sessions that are already established.
The logs show that the DFW is dropping some packets as mid-session pick-up after a TCP session timer has expired. For example, a TCP packet with flags Reset/Ack (RA) reaches the DFW for a session that is no longer being tracked because the DFW timer for established sessions has expired. With TCP Strict enabled, a packet that matches no tracked session and does not start a new handshake is dropped.
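The interaction between an expired session entry and TCP Strict can be illustrated with a toy model. This is a simplified sketch and not the NSX implementation: the class `ToyStrictFirewall` is hypothetical, it uses a single idle timer instead of the per-state timers the DFW keeps, it treats a flow as one directional key, and it assumes a bare `S` flag string marks a SYN.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStrictFirewall:
    """Toy stateful firewall with strict handling (illustration only)."""

    idle_timeout: float  # seconds a session may stay idle before removal
    sessions: dict = field(default_factory=dict)  # flow -> last-seen time

    def _expire(self, now):
        # Remove every session that has been idle longer than the timeout.
        for flow, last_seen in list(self.sessions.items()):
            if now - last_seen > self.idle_timeout:
                del self.sessions[flow]

    def handle(self, flow, flags, now):
        """Return 'PASS' or 'DROP' for a packet seen at time `now`."""
        self._expire(now)
        if flow in self.sessions:
            # Known session: refresh its idle timer and allow the packet.
            self.sessions[flow] = now
            return "PASS"
        if flags == "S":
            # A SYN starts a new handshake, so a new session is tracked.
            self.sessions[flow] = now
            return "PASS"
        # Strict behaviour: no tracked session and no SYN means
        # mid-session pick-up is refused.
        return "DROP"


fw = ToyStrictFirewall(idle_timeout=60)
flow = ("x.x.x.x", 63421, "y.y.y.y", 53)

print(fw.handle(flow, "S", now=0))     # handshake start
print(fw.handle(flow, "PA", now=10))   # data on a tracked session
print(fw.handle(flow, "RA", now=100))  # arrives after 90 s of idle time
print(fw.handle(flow, "S", now=101))   # a fresh handshake is accepted again
```

In this model the RA packet is dropped because the session entry was removed after 90 seconds of inactivity against a 60 second timer, while a new SYN for the same flow is accepted. That mirrors the pattern in the logs: closing packets are dropped, SYN packets are not.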
This issue does not apply to DNS over UDP.
DFW session timers can be seen with vsipioctl commands on a per-vNIC basis. The defaults are shown below (in seconds):
[ESX:~] vsipioctl gettimeout -f nic-#######-eth0-vmware-sfw.2
Connection Timeouts:
dfw.tcp.first_packet : 120
dfw.tcp.opening : 30
dfw.tcp.established : 43200
dfw.tcp.closing : 120
dfw.tcp.fin_wait : 45
dfw.tcp.closed : 20
dfw.udp.first_packet : 60
dfw.udp.single : 30
dfw.udp.multiple : 60
dfw.icmp.first_packet : 20
dfw.icmp.error_reply : 10
dfw.other.first_packet: 60
dfw.other.single : 30
dfw.other.multiple : 60
dfw.ip.frag : 30
dfw.interval : 10
dfw.adaptive_start : 200000
dfw.adaptive_end : 270000
dfw.src_node : 0
dfw.ts_diff : 30
dfw.invalid_state : 600
dfw.invalid_l7 : 600
dfw.TBR_revalidation : 120
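When timers from several vNICs need to be compared, the output above can be turned into a lookup table. The following is a minimal Python sketch, assuming the `name : value` layout shown above; the function name `parse_timeouts` is hypothetical and the sample text is an excerpt of the listing.

```python
def parse_timeouts(text):
    """Parse 'vsipioctl gettimeout' output into {timer_name: seconds}."""
    timers = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Keep only 'dfw.<name> : <integer>' lines; this skips the
        # 'Connection Timeouts:' header and any blank lines.
        if sep and name.startswith("dfw.") and value.isdigit():
            timers[name] = int(value)
    return timers


SAMPLE_OUTPUT = """\
Connection Timeouts:
dfw.tcp.first_packet : 120
dfw.tcp.opening : 30
dfw.tcp.established : 43200
dfw.tcp.closing : 120
dfw.tcp.fin_wait : 45
dfw.tcp.closed : 20
dfw.other.first_packet: 60
"""

timers = parse_timeouts(SAMPLE_OUTPUT)
established = timers["dfw.tcp.established"]
print(f"dfw.tcp.established = {established} s ({established / 3600:g} h)")
```

With the default values, `dfw.tcp.established` is 43200 seconds, which is 12 hours.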
These are configurable from the UI: Security>General Settings>Firewall>Session Timer
Either decrease the DNS idle session timeout on the server so that it does not exceed the value configured in the NSX Session Timer, or increase the NSX Session Timer.
References:
https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/vdefend/vdefend-firewall/4-2/vdefend-distributed-firewall/configuring-distributed-firewall/firewall-general-settings/create-a-session-timer.html
https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/vdefend/vdefend-firewall/4-2/vdefend-distributed-firewall/managing-distributed-firewall-across-multiple-locations/create-dfw-rules-from-gm.html