TCP throughput is lower on an NVIDIA BlueField-2 (BF2) SmartNIC setup than on a performance NIC.
The maximum TCP window scale supported by the BF2 SmartNIC is 7. When the guest uses a larger scale, its TCP connections cannot be offloaded to hardware, which degrades performance.
This issue is resolved in ESXi 8.0b.
Workaround:
To work around the issue, follow the instructions below:
On the Linux virtual machine, reduce wmem_max and rmem_max:
echo 'net.core.wmem_max=4194304' >> /etc/sysctl.conf
echo 'net.core.rmem_max=4194304' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 87380 4194304' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 87380 4194304' >> /etc/sysctl.conf
sysctl -p
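
The 4194304-byte (4 MiB) cap in the commands above lines up with the scale limit of 7. As a rough sketch, assume the guest kernel picks its receive window scale as the smallest shift that brings the largest permitted receive buffer (the greater of net.core.rmem_max and the third tcp_rmem value) below 65536. Under that assumption, the expected scale can be estimated as follows. The function name wscale_for is hypothetical, and the exact selection logic may differ between kernel versions.

```shell
#!/bin/sh
# Estimate the TCP window scale a Linux guest would advertise for a given
# maximum receive buffer size in bytes.
# Assumption: the scale is the number of halvings needed to bring the buffer
# size to 65535 or less, capped at 14 (the protocol maximum).
wscale_for() {
    space=$1
    scale=0
    while [ "$space" -gt 65535 ] && [ "$scale" -lt 14 ]; do
        space=$((space / 2))
        scale=$((scale + 1))
    done
    echo "$scale"
}

wscale_for 4194304   # prints 7: within the BF2 limit
wscale_for 8388608   # prints 8: exceeds the BF2 limit
```

To check what a live connection actually negotiated, run ss -ti on the guest; it reports the scale factors in the wscale field.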