A Service Engine crash can occur during flow deletion and GRO table cleanup when a trailing packet arrives for an IPv6 Virtual Service (VS) flow that is already marked for deletion.
Symptoms:
In this scenario, the Service Engine (SE) can crash with a segmentation fault in rte_pktmbuf_free.
To investigate further, review the latest stack traces from the Controller or SE using either of the following methods:
CLI:
Log in to the Controller via SSH and run the following command, replacing <se_dp.timestamp> with the name of the actual se_dp stack trace file.
root@<Controller ip>:# cat /opt/avi/archive/stack_traces/<se_dp.timestamp>.stack_trace
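When several stack traces are present, a small shell helper can pick the most recent file and check it for the crash signature. This is a minimal sketch, not part of the product: the function names and the `*.stack_trace` glob are assumptions made for illustration, and the default directory is the path shown above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the "*.stack_trace" glob are assumptions.

# Print the most recently modified stack trace file in the given directory.
# Defaults to the archive path used in the command above.
latest_stack_trace() {
    local dir="${1:-/opt/avi/archive/stack_traces}"
    # ls -t sorts newest first; head keeps only the first entry.
    ls -t "$dir"/*.stack_trace 2>/dev/null | head -n 1
}

# Return success (exit status 0) if the file mentions rte_pktmbuf_free.
has_gro_crash_signature() {
    local file="$1"
    grep -q 'rte_pktmbuf_free' "$file"
}
```

For example, `trace="$(latest_stack_trace)"` followed by `has_gro_crash_signature "$trace" && echo "signature found in $trace"` reports whether the newest trace contains the symptom described above. A match only indicates that rte_pktmbuf_free appears in the trace; it does not by itself confirm this specific issue.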
UI:
Navigate to Administration > Support > Crash Reports and expand the latest crash file.
Avi Version: 30.2.2
IPv6 Virtual Service
The core issue stems from a race condition and stale memory access during the cleanup of IPv6 Virtual Service flows.
Trailing Packet Anomaly: A trailing packet arrives for an IPv6 VS flow that has already been marked for deletion. This packet puts the Generic Receive Offload (GRO) layer in an inconsistent state.
Stale Memory Access during Cleanup: Subsequently, during the process of deleting this flow and cleaning up its associated GRO table entries, the system attempts to access memory that has either been deallocated or contains outdated information. This stale memory access leads to the system crash.
Workaround: Disable GRO in the service engine group.
Use the commands below. The disable_gro command sets the disable_gro flag to True, which disables GRO.
Log in to the CLI of the Avi Controller:
shell
configure serviceenginegroup <se_group_name>
disable_gro
save
Verify the changes:
show serviceenginegroup <se_group_name> | grep disable_gro
The above command should return True, as shown below.
| disable_gro | True |
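If the output of the show command is saved to a file, for example when checking several SE groups, the flag value can be extracted with a short shell helper. This is a sketch under the assumption that the output keeps the `| field | value |` table layout shown above; the function name `gro_flag_value` is hypothetical.

```shell
# Sketch only: gro_flag_value is a hypothetical helper name.
# Reads table rows on stdin and prints the value of the disable_gro field.
gro_flag_value() {
    # Split each row on "|": column 2 is the field name, column 3 its value.
    # Trim surrounding whitespace from the value before printing it.
    awk -F'|' '$2 ~ /^[ \t]*disable_gro[ \t]*$/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $3)
        print $3
    }'
}
```

For example, `gro_flag_value < se_group_output.txt` prints the current value, where `se_group_output.txt` is a placeholder for the saved output.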
Note: A reboot of the Service Engines is not required.
The issue is fixed in Avi version 30.2.2-2p6.