Reduced TCP throughput and TCP connection resets on EdgeSWG versions prior to 7.3.14.2

Article ID: 367244

Products

Advanced Secure Gateway Software - ASG, ISG Proxy, ProxySG Software - SGOS, SG-300, SG-600, SG-S200, SG-S200-40, SG-S200-RP, SG-S400, SG-S400-RP, SG-S500, SG-S500-RP, SG-VA, SGVA, SG-VA-DEMO, ASG-S200, ASG-S400, ASG-S500

Issue/Introduction

Edge SWG devices running an SGOS 7.3 version prior to 7.3.14.2 that have been up for 49 days or longer may experience throughput or connection issues. Access log delivery in continuous mode over persistent TCP connections may slow down, or connections to the logging server may be reset. Other applications may also experience TCP disconnects or slow performance under the same circumstances.

Cause

The underlying network stack in SGOS relies on a timing counter with a resolution of 1 ms. This counter is used primarily for TCP/IP timing calculations such as round-trip time and idle flow timeouts.

In SGOS version 7.3.x, these timing variables were permitted to be stored as a 64-bit data type, but in some cases the counters were compared against 32-bit values, which wrap around after about 49.7 days (2^32 ms).
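
The 49.7-day figure follows from the size of an unsigned 32-bit counter that advances once per millisecond. The Python sketch below works through that arithmetic and shows one way a mixed-width comparison can go wrong. The function names and the specific calculation are illustrative assumptions, not the actual SGOS code.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 ms in a day
WRAP_MS = 2 ** 32                  # number of values an unsigned 32-bit counter can hold
MASK32 = WRAP_MS - 1


def days_until_wrap() -> float:
    """Days before a 32-bit counter with 1 ms ticks wraps back to zero."""
    return WRAP_MS / MS_PER_DAY


def idle_ms_mixed(now64: int, last_seen64: int) -> int:
    """Hypothetical faulty calculation: the stored timestamp was truncated
    to 32 bits, but the current time is still a full 64-bit value."""
    last_seen32 = last_seen64 & MASK32
    return now64 - last_seen32


def idle_ms_consistent(now64: int, last_seen64: int) -> int:
    """The same calculation with both operands kept at 64 bits."""
    return now64 - last_seen64


if __name__ == "__main__":
    print(f"wrap after {days_until_wrap():.1f} days")
    now = WRAP_MS + 5_000          # uptime just past the wrap point
    last_seen = now - 1_000        # flow was active one second ago
    print(idle_ms_consistent(now, last_seen))  # true idle time in ms
    print(idle_ms_mixed(now, last_seen))       # inflated by 2^32 ms
```

In this sketch the two functions agree until uptime passes 2^32 ms. After that point the mixed-width version reports an idle time roughly 49.7 days too long, which is the kind of result that could trip an idle timeout early.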

This issue can surface in several ways:

  1. The SG may terminate idle TCP flows early by sending a TCP RST packet. This typically impacts long-lived idle connections such as those used for access log upload in continuous mode. The SG may also terminate flows immediately after the TCP handshake; however, in most cases end users will not notice this because the Edge SWG quickly reconnects.
  2. TCP flows that experience dropped or reordered packets due to network congestion may see reduced throughput. When this occurs, the Edge SWG reduces the TCP congestion window to 4 KB, which can reduce throughput to as low as 1 Mbps for that flow in typical network conditions.
  3. An increase in internal timer processing can result in a short-term increase in CPU utilization of the Edge SWG device.
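
The throughput figure in item 2 is consistent with the common approximation that a TCP flow can send at most one congestion window per round trip. The sketch below applies that approximation. The 32 ms round-trip time and the reading of 4 KB as 4096 bytes are assumptions chosen for illustration, not values taken from SGOS.

```python
def max_throughput_mbps(cwnd_bytes: int, rtt_seconds: float) -> float:
    """Approximate upper bound on throughput: one congestion window per RTT."""
    bits_per_rtt = cwnd_bytes * 8
    return bits_per_rtt / rtt_seconds / 1_000_000


if __name__ == "__main__":
    # Assumed values: 4096-byte window, 32 ms round-trip time.
    print(f"{max_throughput_mbps(4096, 0.032):.3f} Mbps")
```

Under this approximation, a longer round-trip time lowers the ceiling further, so the impact on a given flow depends on the network path.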

Resolution

SGOS 6.7.x versions are not affected.

This issue affects all versions of SGOS 7.3.x up to and including version 7.3.14.2.
The issue only occurs after the Edge SWG device has been running for 49 days.
The issue can cause idle TCP persistent connections to be reset or a reduction of throughput for some TCP connections.
In most cases, end users will not notice any issues but there may be exceptions depending on the application.

Broadcom has corrected this issue in the following versions:
7.3.14.x where x is equal to or greater than 3.
7.3.x where x is equal to or greater than 15.
7.4.x where x is equal to or greater than 2.
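
The version rules above can be expressed as a small check, which may be useful when auditing an inventory of devices. This is a minimal sketch: the function name `listed_as_fixed` is hypothetical, it assumes dotted numeric version strings such as `7.3.14.3`, and it only reports whether a version matches one of the three rules listed here.

```python
def listed_as_fixed(version: str) -> bool:
    """Return True if the version matches one of the fixed releases listed
    in this article (7.3.14.x with x >= 3, 7.3.x with x >= 15, 7.4.x with x >= 2)."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (4 - len(parts))          # pad short strings such as "7.4.2"
    major, minor, patch, build = parts[:4]   # ignore any further components

    if (major, minor) == (7, 3):
        return patch >= 15 or (patch == 14 and build >= 3)
    if (major, minor) == (7, 4):
        return patch >= 2
    return False                             # not covered by the rules above


if __name__ == "__main__":
    for v in ("7.3.14.2", "7.3.14.3", "7.3.15.1", "7.4.2.1"):
        print(v, listed_as_fixed(v))
```

A `False` result means only that the version is not on the fixed list; for example, SGOS 6.7.x returns `False` here even though it is not affected.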

Workaround

To prevent this issue, reboot Edge SWG devices before they reach 49 days of uptime.
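
A scheduled check can flag devices that are approaching the threshold. The sketch below assumes the uptime in days has already been collected by whatever monitoring is in place; the function name `reboot_due` and the 7-day safety margin are hypothetical choices, not Broadcom recommendations.

```python
UPTIME_LIMIT_DAYS = 49.0   # threshold stated in this article


def reboot_due(uptime_days: float, margin_days: float = 7.0) -> bool:
    """Return True once uptime is within margin_days of the 49-day limit."""
    return uptime_days >= UPTIME_LIMIT_DAYS - margin_days


if __name__ == "__main__":
    for days in (10.0, 41.5, 42.0, 48.0):
        print(days, reboot_due(days))
```

The margin leaves time to schedule the reboot in a maintenance window before the limit is reached.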