
ProxySG/ASG hangs or becomes unresponsive intermittently after 6.7.x upgrade


Article ID: 173767



Products

Advanced Secure Gateway Software - ASG
ProxySG Software - SGOS

Issue/Introduction

After an upgrade to SGOS 6.7.x, the ProxySG/ASG intermittently hangs or becomes unresponsive. The appliance may appear hung because it loses ARP for the default gateway, fails to initialize WCCP, or has an interface that no longer answers ping. While the issue is occurring, the ProxySG/ASG does not respond over SSH or the web UI, but it responds properly on the serial console. If the appliance is downgraded to the previous SGOS version, it works without showing this behavior.

Cause

SGOS 6.7.x introduces two new features: TSO (TCP segmentation offload) and hardware checksum offload (transmit checksum). Both features are enabled by default after upgrading from SGOS 6.5.x or 6.6.x. More information on TSO can be found here, and on hardware checksum offload here.

These features improve the performance of the SGOS TCP/IP stack by offloading these tasks to the NIC (the SG's network adapter). In some deployments, however, the NIC's transmit (TX) queue has been observed to fill up, so that packets are dropped or not processed in a timely manner. In other words, packets do not leave the SG. When packets such as ARP requests do not leave the SG's NIC, the device loses its connection to the default gateway. The SG then becomes unreachable from outside networks, so it may appear hung or unresponsive over the network while still responding on the serial console. Downgrading to SGOS 6.5.x or 6.6.x, with no other change, resolves the problem.
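The failure mode described above can be pictured with a toy model. The sketch below is not SGOS or NIC driver code: `TxQueue`, `simulate_stalled_queue`, the capacity of 4, and the frame names are all hypothetical and chosen only for illustration. It assumes a bounded queue that drops new frames once it is full, and shows that when nothing is drained, an ARP request queued behind data traffic never leaves.

```python
from collections import deque


class TxQueue:
    """Toy bounded transmit queue with tail drop (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.frames = deque()   # frames waiting to be transmitted
        self.dropped = []       # frames rejected because the queue was full

    def enqueue(self, frame: str) -> bool:
        """Queue a frame; return False if it was dropped."""
        if len(self.frames) >= self.capacity:
            self.dropped.append(frame)
            return False
        self.frames.append(frame)
        return True

    def drain(self, count: int) -> list:
        """Transmit up to `count` frames and return them in order."""
        sent = []
        for _ in range(min(count, len(self.frames))):
            sent.append(self.frames.popleft())
        return sent


def simulate_stalled_queue(capacity: int = 4):
    """Fill the queue with data frames, never drain it, then try to send ARP."""
    queue = TxQueue(capacity)
    for i in range(capacity):
        queue.enqueue(f"data-{i}")
    # Assumption: the NIC is stalled, so drain() is never called here.
    arp_left = queue.enqueue("arp-request default-gateway")
    return arp_left, queue.dropped


if __name__ == "__main__":
    arp_left, dropped = simulate_stalled_queue()
    print("ARP request queued:", arp_left)
    print("Dropped frames:", dropped)
```

In this model the ARP request is the frame that gets dropped, which corresponds to the appliance losing its default gateway while remaining healthy on the serial console.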

The ProxySG/ASG is more likely to encounter this problem when the following conditions are true:

  1. The device has an active 10G fiber or copper NIC.
  2. The deployment carries a high volume of intercepted and/or bypassed packets on that 10G NIC.

Note 1 - If the ProxySG/ASG has another active interface besides the 10G interface (e.g. interface 0:0 used as the management interface), it remains reachable through that interface while this issue occurs.

Note 2 - No logs (e.g. sysinfo, event log, snapshots) indicate this problem; only a full memory core does. The full core must be obtained while the device or the 10G NIC is in the hung or unresponsive state.

Resolution

This issue is known bug 260654. Please refer to the latest SGOS 6.7.x release notes. A fix is available in SGOS 6.7.3.11 and later 6.7.3.x versions, and in SGOS 6.7.4.1 and later 6.7.4.x versions. These versions add CLI commands that make TSO and hardware checksum offload configurable. After upgrading to one of these versions to obtain the fix, apply the CLI commands below:

#conf t
#(config)tcp-ip tcp-tso disable
#(config)tcp-ip transmit-checksum disable
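Because the commands above exist only on fixed releases, it can help to check a version string against the ranges this article names before planning the change. The following is a minimal sketch, not a vendor tool: `has_tso_fix` is a hypothetical helper, and it assumes four-part dotted version strings such as `6.7.3.11`. For any release train the article does not mention, it returns `None` rather than guessing.

```python
from typing import Optional


def has_tso_fix(version: str) -> Optional[bool]:
    """Check a version against the fixed ranges stated in this article.

    Returns True or False for 6.7.3.x and 6.7.4.x, and None for any other
    release train, since the article makes no statement about those.
    """
    parts = version.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a version like 6.7.3.11, got {version!r}")
    major, minor, maintenance, patch = (int(p) for p in parts)

    if (major, minor) != (6, 7):
        return None  # outside the 6.7.x train covered here
    if maintenance == 3:
        return patch >= 11  # fixed in 6.7.3.11 and later 6.7.3.x
    if maintenance == 4:
        return patch >= 1   # fixed in 6.7.4.1 and later 6.7.4.x
    return None  # other 6.7.x trains are not mentioned in the article


if __name__ == "__main__":
    for candidate in ("6.7.3.10", "6.7.3.11", "6.7.4.1", "6.7.5.1"):
        print(candidate, has_tso_fix(candidate))
```

A `None` result means the release notes for that version should be consulted directly.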

 

Note 1 - While these features are disabled, the same tasks are still performed, but by the SGOS TCP/IP stack instead of the ProxySG/ASG's NIC.

Note 2 - These are hidden CLI commands: they are not listed among the available commands when '?' is entered, and they are not auto-completed when Tab is pressed. Once made, these changes are stored permanently in the SG's configuration and are preserved across reboots and upgrades to higher SGOS versions.

Please also run the command found in article 173782.