Starting with SGOS 6.7.x, two new features were introduced: TSO (TCP segmentation offload, also known as transmit segmentation offload) and hardware checksum offload (transmit checksum). Both features are enabled by default. More information on TSO can be found here, and more on hardware checksum offload can be found in the resource document at the URL below.
These features improve the performance of the SGOS TCP/IP stack by offloading segmentation and checksum calculation to the NIC (the SG's network adapter). In some deployments, however, the NIC's transmit (TX) queue has been observed to fill up, so that packets are dropped or not processed in a timely manner. In other words, the packets do not leave the SG/ASG. When this happens and packets such as ARP requests cannot leave the SG's NIC, the device loses its connection to the default gateway. The SG then becomes unreachable from outside the network: it may appear hung or unresponsive over the network, but it still responds on the serial console. Downgrading to the previous SGOS version, with no other change, resolves the problem. A cold boot also appears to resolve the issue.
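The failure mode described above can be pictured with a toy model of a bounded transmit queue. This is only an illustrative sketch: the `TxQueue` class, its capacity, and the frame names are hypothetical and do not reflect how the SGOS driver or the NIC is actually implemented. It shows only that once a fixed-size queue is full and not draining, a small control frame such as an ARP request is dropped just like bulk data.

```python
from collections import deque


class TxQueue:
    """Toy bounded transmit queue with tail drop (hypothetical model, not SGOS code)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.frames = deque()
        self.dropped = []

    def enqueue(self, frame: str) -> bool:
        # A full queue cannot accept the frame, so it never leaves the device.
        if len(self.frames) >= self.capacity:
            self.dropped.append(frame)
            return False
        self.frames.append(frame)
        return True

    def drain(self, count: int) -> list:
        # Simulates the NIC completing up to `count` transmissions.
        sent = []
        while self.frames and len(sent) < count:
            sent.append(self.frames.popleft())
        return sent


if __name__ == "__main__":
    q = TxQueue(capacity=4)
    for n in range(4):
        q.enqueue(f"data-{n}")        # bulk traffic fills the queue
    print(q.enqueue("arp-request"))   # queue is full and nothing has drained
    print(q.dropped)
```

In this model the ARP request is accepted again only after `drain` frees a slot, which mirrors the observation that the device regains its gateway once the queue is cleared, for example by a cold boot.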
The ProxySG/ASG is more likely to encounter this problem when the following conditions are true:
- The device has an active 10G Fiber/copper NIC
- Deployment with a high volume of intercepted and/or bypassed packets on that 10G NIC.
Note 1: If the ProxySG/ASG has another active interface besides the 10G interface (e.g., interface 0:0 used as the management interface), the device remains reachable through that interface while the issue occurs.
Note 2: No logs (e.g., sysinfo file/snapshot, event log) indicate this problem; the only evidence is a full memory core. The full core must be obtained while the device or the 10G NIC is in the hung or unresponsive state.
In SGOS 18.104.22.168, we recommend running the CLI commands below to disable these features.
#(config) tcp-ip tcp-tso disable
#(config) tcp-ip transmit-checksum disable
Note 3: With these features disabled, segmentation and checksum calculation are still performed, but by the SGOS TCP/IP stack rather than by the ProxySG/ASG's NIC.
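To make Note 3 concrete, the sketch below shows in simplified form the two tasks that move back into software when the offloads are disabled: splitting a large send buffer into segments, and computing a 16-bit one's-complement checksum in the style of RFC 1071. It is an illustration only, not SGOS code. The function names and the 1460-byte segment size are assumptions, and a real TCP checksum also covers a pseudo-header and the TCP header, which are omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def segment(payload: bytes, mss: int) -> list:
    """Split one large send buffer into chunks of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


if __name__ == "__main__":
    buf = bytes(range(256)) * 20        # 5120-byte application write
    for seg in segment(buf, 1460):      # 1460 is a typical Ethernet MSS, assumed here
        print(len(seg), hex(internet_checksum(seg)))
```

With offload enabled, the stack would hand the NIC one large buffer and let the hardware do this per-segment work. With offload disabled, the same work costs CPU time on the appliance but no longer depends on the NIC's offload path.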
Note 4: These are hidden CLI commands. They are not listed among the available commands when you enter '?', and they are not auto-completed when you press the Tab key. Once made, the changes are stored permanently in the SG's configuration and are preserved across reboots and upgrades to later SGOS versions.
In addition, to prevent the ASG from responding slowly to user traffic, we recommend disabling LRO (large receive offload) with the CLI command below.
#(config) tcp-ip tcp-lro disable
Note 5: The observed issue is not a bug. Based on the logs and on the heartbeats that reached the backend, we have also confirmed that it was not a crash.
We expect the recommended changes to resolve the issue permanently. Monitor the device for a few days after making the changes, and contact Technical Support if you have further related questions. Note that this issue is expected to recur rarely.