Low network performance is observed when using NICs with the bnxtnet driver for GENEVE traffic, together with a high "LRO aborts rx" count in the NIC's private statistics:
[rxqN] LRO byte rx: 362072308778
[rxqN] LRO events rx: 32360186
[rxqN] LRO aborts rx: 33009523 <-----
NIC private stats can be found in nicinfo.sh.txt in the ESXi support bundle, or by running the following command:
localcli --plugin-dir /usr/lib/vmware/esxcli/int networkinternal nic privstats get -n <vmnicX>
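To check only the LRO counters for a given uplink, the private stats output can be filtered with grep (a sketch; vmnic0 is an example uplink name, and counter labels may vary by driver release):

```shell
# Show only LRO-related private stats for one uplink; vmnic0 is an example name
localcli --plugin-dir /usr/lib/vmware/esxcli/int networkinternal nic privstats get -n vmnic0 | grep -i "LRO"
```

A high and growing "LRO aborts rx" count relative to "LRO events rx", as in the sample above, indicates the condition described in this article.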
LRO (large receive offload) improves network performance by coalescing multiple received packets into one, reducing per-packet overhead as traffic traverses the network stack. The BCM5741x adapters do not support hardware LRO for GENEVE packets. The BCM5750x adapters support hardware LRO for GENEVE packets only when the C-bit (critical options present) in the GENEVE header is not set. For unsupported packets, the adapter attempts hardware LRO but aborts it, so the software stack must process a larger number of individual packets; this additional overhead results in lower performance.
Enable software LRO by disabling hardware LRO in bnxtnet, using the disable_tpa bnxtnet driver parameter. The following command disables hardware LRO on a host with 4 bnxtnet NIC ports (one value per port). Reboot the host for the change to take effect.
esxcli system module parameters set -m bnxtnet -p 'disable_tpa=1,1,1,1'
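After the reboot, the setting can be confirmed by listing the module parameters (a verification sketch; the exact output format may differ between ESXi releases):

```shell
# Confirm that disable_tpa is set for all bnxtnet ports
esxcli system module parameters list -m bnxtnet | grep disable_tpa
```

The "LRO aborts rx" counter in the NIC private stats should stop increasing once hardware LRO is disabled and software LRO takes over.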