"Failed expected state" due to Half-Closed Connections in NSX Firewall
The application experiences 'Connect timed out' failures.
Connections remain in the FIN_WAIT state because the client is not sending a TCP FIN packet to actively close the connection. Under the current Firewall framework, the timeout for a half-closed TCP connection is 900 seconds. Refer to Default Session Timer Values.
Edge> get firewall <T1_SR_Uplink_Interface> connection | find <Scheduler_App_IP>
172.##.##.20:48378 -> 172.##.##.25:3306 dir in protocol tcp state ESTABLISHED:ESTABLISHED f-2060 n-0
172.##.##.22:40654 -> 172.##.##.24:3306 dir in protocol tcp state ESTABLISHED:ESTABLISHED f-2060 n-0
100.##.##.3:25342 (172.##.##.23:34420) -> 172.##.##.13:443 (10.##.##.129:443) dir in protocol tcp state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 56
100.##.##.3:25436 (172.##.##.23:52728) -> 172.##.##.11:443 (10.##.##.129:443) dir in protocol tcp state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 97
100.##.##.3:25529 (172.##.##.23:42440) -> 172.##.##.13:443 (10.##.##.129:443) dir in protocol tcp state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 97
100.##.##.3:25600 (172.##.##.23:60564) -> 172.##.##.11:443 (10.##.##.129:443) dir in protocol tcp state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 97
100.##.##.3:25590 (172.##.##.23:60580) -> 172.##.##.13:443 (10.##.##.129:443) dir in protocol tcp state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 97
100.##.##.3:25507 (172.##.##.23:60594) -> 172.##.##.12:443 (10.##.##.129:443) dir in protocol tcp state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 97
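For longer outputs, the half-closed entries can be picked out programmatically. The following Python sketch filters connection-table lines for entries where either side is in a FIN_WAIT state. The line layout is assumed from the sample output above only, and the name half_closed_entries is hypothetical; it is not part of any NSX tooling.

```python
import re

# Field patterns assumed from the sample output above ("state A:B", "expire N").
STATE_RE = re.compile(r"state (\S+):(\S+)")
EXPIRE_RE = re.compile(r"expire (\d+)")


def half_closed_entries(lines):
    """Return (line, expire_seconds_or_None) for entries with a FIN_WAIT side."""
    found = []
    for line in lines:
        state = STATE_RE.search(line)
        if not state:
            continue  # not a connection-table entry
        if any(side.startswith("FIN_WAIT") for side in state.groups()):
            expire = EXPIRE_RE.search(line)
            found.append((line, int(expire.group(1)) if expire else None))
    return found


# Two lines copied from the sample output (addresses masked as in the source).
SAMPLE = [
    "172.##.##.20:48378 -> 172.##.##.25:3306 dir in protocol tcp "
    "state ESTABLISHED:ESTABLISHED f-2060 n-0",
    "100.##.##.3:25342 (172.##.##.23:34420) -> 172.##.##.13:443 "
    "(10.##.##.129:443) dir in protocol tcp "
    "state ESTABLISHED:FIN_WAIT_2 f-2060 n-0 expire 56",
]

for entry, expire in half_closed_entries(SAMPLE):
    print(expire, entry)
```

Only the second sample line is reported, together with its remaining expiry time in seconds.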
Edge> get firewall <T1_SR_Uplink_Interface> ruleset rules
<output omitted for brevity>
Firewall rule count: 1
Rule ID : 2060
Rule : inout protocol any from any to any accept
Packet captures show the client sending new TCP SYN packets without first sending a TCP FIN packet.
13:55:56.144240 IP 172.##.##.2.43106 > 10.##.##.129.https: Flags [S], seq 3763703824, win 64240, options [mss 1460,sackOK,TS val 2926178674 ecr 0,nop,wscale 7], length 0
13:55:56.188793 IP 172.##.##.2.43114 > 10.##.##.129.https: Flags [S], seq 110398818, win 64240, options [mss 1460,sackOK,TS val 2926178718 ecr 0,nop,wscale 7], length 0
Outputs of the get firewall <T1_SR_Uplink_Interface> interface stats command, collected at different time intervals, show a consistent increase in packet drops. The 'Input packets dropped' and 'Failed expected state' counters increment together, which confirms that the packets are being dropped due to 'Failed expected state'.
Edge> get firewall <T1_SR_Uplink_Interface> interface stats
<output omitted for brevity>
Connections per second : 93
Drop by IPsec policy : 0
Drop by LB : 0
.
.
Failed NAT connection limit : 0
Failed NAT translation : 0
.
.
Failed expected state : 26319 <======================
.
.
Input bytes allowed : 31956750139598
Input bytes dropped : 1638102
Input dropped packets copied : 12630921
Input encrypted packets : 0
Input fastforwarded : 14978202844
Input fragments dequeued : 0
Input fragments queued : 0
Input fragments released : 0
Input of inactive context : 0
Input packets allowed : 31042192627
Input packets dropped : 26102 <======================
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
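Comparing two samples of the interface stats by hand is error-prone, so the comparison can be scripted. The sketch below parses "Name : value" lines and reports how much each counter grew between two samples. The LATER values are taken from the example output above; the EARLIER values are hypothetical and exist only to illustrate the calculation. The function names are likewise illustrative.

```python
import re

# "Counter name : integer", optionally followed by a marker such as "<====".
LINE_RE = re.compile(r"^(.+?)\s*:\s*(\d+)")


def parse_stats(text):
    """Turn interface-stats text into a {counter_name: value} dict."""
    counters = {}
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def counter_deltas(earlier, later):
    """Growth of every counter present in both samples."""
    return {name: later[name] - earlier[name] for name in later if name in earlier}


# Hypothetical earlier sample (values invented for illustration).
EARLIER = """
Failed expected state : 26000
Input packets dropped : 25783
"""

# Later sample, values from the example output above.
LATER = """
Failed expected state : 26319 <======================
Input packets dropped : 26102 <======================
"""

for name, delta in counter_deltas(parse_stats(EARLIER), parse_stats(LATER)).items():
    print(f"{name}: +{delta}")
```

If both counters grow by the same amount between samples, the drops are consistent with 'Failed expected state' being the cause.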
VMware NSX
This issue occurs because the client application does not transmit a TCP FIN packet to explicitly terminate the connection. Under the default Firewall configuration, a half-closed TCP connection is kept alive for 15 minutes (900 seconds). As the client sequentially increments its ephemeral source ports for new connections, a new TCP SYN packet overlaps with a residual half-closed TCP connection that still exists in the Firewall's state table. The Firewall subsequently drops the new SYN packet due to a state mismatch. This is an expected Firewall behavior.
Create a new session timer profile with the TCP Closing timer set lower than the 15-minute default (for example, 2 minutes). The Firewall then purges half-closed connections sooner, so that new connections reusing the same source ports are allowed.
The steps are documented in the Broadcom Administration Guide under Create a Session Timer.
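The interaction between port reuse and the half-closed timeout can be illustrated with a toy model. The Python sketch below is not the NSX implementation: it keeps a simplified state table keyed by flow, assumes the server side closes each connection while the client never sends a FIN, and drops any SYN that matches an unexpired entry. The port range, connection interval, and class and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    state: str
    expires_at: float


class ToyStateTable:
    """Simplified stateful table; a SYN matching a live entry is dropped."""

    def __init__(self, half_closed_timeout):
        self.timeout = half_closed_timeout
        self.entries = {}
        self.failed_expected_state = 0

    def _purge(self, now):
        # Remove entries whose timer has run out.
        self.entries = {k: e for k, e in self.entries.items() if e.expires_at > now}

    def syn(self, flow, now):
        self._purge(now)
        if flow in self.entries:
            self.failed_expected_state += 1  # state mismatch: drop the SYN
            return False
        self.entries[flow] = Entry("ESTABLISHED", float("inf"))
        return True

    def server_fin(self, flow, now):
        # Only one side closes, so the entry becomes half-closed and starts ageing.
        entry = self.entries.get(flow)
        if entry:
            entry.state = "HALF_CLOSED"
            entry.expires_at = now + self.timeout


def simulate(timeout, ports, interval, connections):
    """Open `connections` flows, one every `interval` seconds, cycling through `ports`."""
    table = ToyStateTable(timeout)
    for i in range(connections):
        now = i * interval
        flow = ("client", ports[i % len(ports)], "server", 443)
        if table.syn(flow, now):
            table.server_fin(flow, now)
    return table.failed_expected_state


PORTS = list(range(50000, 50010))  # hypothetical 10-port range, wraps every 300 s
print("timeout 900 s:", simulate(900, PORTS, interval=30, connections=40), "SYNs dropped")
print("timeout 120 s:", simulate(120, PORTS, interval=30, connections=40), "SYNs dropped")
```

With these assumed numbers, the client wraps around to an already used source port well within the 900-second timeout, so SYNs are dropped. With a 120-second timeout the old entries have expired before the port is reused, and no SYNs are dropped.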