Users complain of poor performance and connection errors when browsing web sites.
All users are impacted, and the issue is not specific to whichever on-premises proxy is hit.
Users accessing secure web sites appear to be impacted; this generates traffic over TCP 8080 and 8084.
Users accessing HTTP sites do not appear to be impacted; this generates traffic over TCP 8443.
No changes were made on the client network, and no apparent changes were made on the WSS side that would trigger this.
In the early morning and late evening, performance improves to an acceptable level.
Proxy forwarding access method into WSS over TCP ports 8080, 8443 and 8084
Users hit a local load balancer that fronts two on-premises proxies
Each on-premises proxy is configured to egress from one IP address to proxy.threatpulse.net
Fortigate NAT firewall is responsible for NATing requests into WSS
WSS sees requests coming from two different IP addresses and sends them to multiple pods per TCP port.
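A basic TCP reachability check of the three forwarding ports can help separate a hard block from the intermittent failures described here. The following is a minimal sketch, not a vendor tool: the function names and the 3 second timeout are assumptions chosen for illustration.

```python
import socket

# Ports used by the proxy forwarding access method, as listed above.
WSS_FORWARDING_PORTS = (8080, 8443, 8084)


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_forwarding_ports(host, ports=WSS_FORWARDING_PORTS, timeout=3.0):
    """Map each port to whether a TCP connection could be established."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Running `check_forwarding_ports("proxy.threatpulse.net")` from one of the on-premises proxies returns one boolean per port. A single successful handshake does not rule out the problem, since the failures in this case were intermittent and load dependent, so repeated checks during busy hours are more informative.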
Mismatch in TCP state table between Fortigate NAT firewall and WSS
1. Instead of configuring the on-premises proxies to forward traffic into WSS on proxy.threatpulse.net, we had them forward traffic into a forwarding group that contained 4 IP addresses / DNS names
- https://knowledge.broadcom.com/external/article/208150 includes an example of how this is done. We added ggblo1-vip.threatpulse.net, ggblo2-vip.threatpulse.net, ggblo3-vip.threatpulse.net, ggblo4-vip.threatpulse.net
- use round robin as the load-balancing algorithm
- make sure session affinity is enabled based on client IP address
2. Increased the Fortigate TCP half-close timer to 300 seconds
- https://community.fortinet.com/t5/No-tags-TKBs/Technical-Tip-How-to-extend-the-TCP-Half-Close-timer-for/ta-p/190553?externalID=FD36021 outlines this
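The behaviour of the forwarding group in step 1 (round robin combined with client IP session affinity) can be illustrated with a toy model. This is only a sketch of the selection logic, not the proxy's actual implementation; the class name and the client IP addresses are hypothetical, and real affinity tables also expire entries, which is omitted here.

```python
from itertools import cycle


class ForwardingGroup:
    """Toy model: round-robin member selection with client-IP affinity."""

    def __init__(self, members):
        self._next_member = cycle(members)  # round-robin iterator over members
        self._affinity = {}                 # client IP -> pinned member

    def pick(self, client_ip):
        # A client seen before stays on the member it was first assigned;
        # a new client takes the next member in round-robin order.
        if client_ip not in self._affinity:
            self._affinity[client_ip] = next(self._next_member)
        return self._affinity[client_ip]


group = ForwardingGroup([
    "ggblo1-vip.threatpulse.net",
    "ggblo2-vip.threatpulse.net",
    "ggblo3-vip.threatpulse.net",
    "ggblo4-vip.threatpulse.net",
])

first = group.pick("192.0.2.10")    # new client: first member
second = group.pick("192.0.2.11")   # new client: second member
repeat = group.pick("192.0.2.10")   # known client: same member as before
```

With four members instead of one name, requests from different clients are spread across several destinations, while affinity keeps each client on a single destination.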
When the issue happened, the Fortigate would report that a number of connections were getting nothing back from the WSS VIP.
Taking the source ports of these connections and filtering the WSS captures on them, we could see that the TCP SYN requests were failing to get a response.
- the blanked-out IP address is the customer request into WSS
- 10.230.0.133 is the NATed IP address used to send the request upstream to the proxy
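The capture analysis described above (find SYNs that never received a SYN-ACK) can be sketched as follows. This is a simplified illustration that works on already-parsed packet records rather than a real capture file; the `Packet` type, its field names and the flag strings are assumptions, and the addresses in the sample are documentation-range placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int
    flags: str  # "S" = SYN, "SA" = SYN-ACK, "A" = ACK, "F" = FIN


def unanswered_syns(packets):
    """Return SYN packets with no SYN-ACK captured in the reverse direction."""
    # Key each SYN-ACK by the client-side tuple it answers.
    answered = {
        (p.dst, p.dport, p.src, p.sport)
        for p in packets
        if p.flags == "SA"
    }
    return [
        p for p in packets
        if p.flags == "S" and (p.src, p.sport, p.dst, p.dport) not in answered
    ]


capture = [
    Packet("192.0.2.10", 40001, "198.51.100.5", 8443, "S"),
    Packet("198.51.100.5", 8443, "192.0.2.10", 40001, "SA"),
    Packet("192.0.2.10", 40002, "198.51.100.5", 8443, "S"),  # never answered
]
stuck = unanswered_syns(capture)
```

The source ports of the packets in `stuck` are the ones to look up in the state table on the other side of the connection.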
Tracking the state of this connection on the WSS proxy, we can see that it is still in use and in the LAST_ACK TCP state: we are awaiting the ACK for the TCP FIN that was sent out.