When a virtual machine on ESXi uses the VMXNET3 network adapter, significant packet loss may occur during periods of high network traffic or traffic bursts. Large numbers of packets can be dropped within the Guest OS, and this issue can persist even after the Guest OS network buffers have been increased to their maximum values, as illustrated below.
VM network stats:
client                dvPort  client           client                                                  pktsTx  pktsRx    dropped  dropped    1stRing  1stRing   2ndRing  2ndRing  OutOf
Name                  Id      Type             SubType         portset       port       pktsTx         M-cast  M-cast    Tx       Rx         Size     Full      Size     Full     Buffers
--------------------  ------  ---------------  --------------  ------------  ---------  -------------  ------  --------  -------  ---------  -------  --------  -------  -------  --------
Linux Test VM01.eth0  1336    VMM Virtual NIC  Vmxnet3 Client  DvsPortset-1  988777666  1055681761507  194543  32572796  8        590668256  4096     19249161  512      0        19249161
Note: For details on how to check and increase the VMXNET3 ring buffer values in the Guest OS, see KB 324556.
vSphere ESXi
These packet drops can occur when multiple pollWorlds deliver packets to the same vNIC receive queue. VMXNET3 has an upper limit (default: 256) on the number of packets that can be queued for processing before being delivered to the Guest OS. If the incoming packet rate exceeds this limit, any additional packets beyond the queue capacity will be dropped.
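The effect of the queue limit can be illustrated with simple arithmetic. The sketch below is a toy model, not ESXi code: `burst_drops` is a hypothetical helper, and it assumes the whole burst arrives before any packet is drained from the queue, so every packet beyond the bound is lost.

```shell
#!/bin/sh
# Toy model of a bounded receive queue (illustrative only, not ESXi code).
# Assumption: the whole burst arrives before any packet is drained,
# so every packet beyond the queue bound is dropped.
burst_drops() {
    burst=$1   # packets arriving in one burst
    bound=$2   # queue limit (256 is the default stated above)
    if [ "$burst" -gt "$bound" ]; then
        echo $((burst - bound))
    else
        echo 0
    fi
}

# The same 600-packet burst against three queue bounds.
for bound in 256 512 1024; do
    echo "bound=$bound dropped=$(burst_drops 600 "$bound")"
done
```

Under this simplified model, a 600-packet burst loses 344 packets at a bound of 256, 88 at 512, and none at 1024. Real behaviour also depends on how quickly the Guest OS drains the queue.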
Increase the queue size on the ESXi host running this VM from the default of 256 to 512 or 1024 using the commands below. The queue size and the processing batch size work together, so both need to be adjusted; the exact steps depend on the ESXi version.
The current value can be checked with either of the commands below:
OR
esxcfg-info -a
|----Option Name..................................Vmxnet3RxPollBound
|----Current Value................................256
|----Default Value................................256
|----Min Value....................................0
|----Max Value....................................4096
|----Hidden.......................................false
|----Parent......................................./Net/
|----Path........................................./Net/Vmxnet3RxPollBound
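When scripting this check, the current value can be pulled out of the listing with a small filter. The sketch below is only an illustration: `current_value` is a hypothetical helper, and it assumes the `|----Current Value....<number>` layout shown in the listing. For a single option, `esxcli system settings advanced list -o /Net/Vmxnet3RxPollBound` is the usual esxcli form, although its output layout differs from the one parsed here.

```shell
#!/bin/sh
# Extract the "Current Value" field from esxcfg-info style output on stdin.
# Assumption: lines look like "|----Current Value......256", as in the
# listing above. Only the first match is printed.
current_value() {
    sed -n 's/^.*Current Value\.*\([0-9][0-9]*\)[[:space:]]*$/\1/p' | head -n 1
}

# Sample input copied from the listing above; on a host, the function would
# instead be fed the relevant part of the real command output.
sample='|----Option Name..................................Vmxnet3RxPollBound
|----Current Value................................256
|----Default Value................................256'

printf '%s\n' "$sample" | current_value
```

With the sample input, this prints 256.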
On version 7.x
The value of the advanced configuration option Vmxnet3RxPollBound controls both the processing batch size (Poll) and the software queue size (Queue). To change the values, use this command:
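A minimal sketch of what such a change could look like, assuming the standard `esxcli system settings advanced set` syntax (`-o` for the option path, `-i` for an integer value). `build_set_cmd` is a hypothetical helper that only prints the command after checking the value against the 0 to 4096 range from the esxcfg-info listing; it does not change anything on the host.

```shell
#!/bin/sh
# Build (but do not run) an esxcli command that sets Vmxnet3RxPollBound.
# The 0..4096 range is taken from the Min/Max values in the esxcfg-info
# listing; the esxcli form itself is an assumption about the intended command.
build_set_cmd() {
    value=$1
    case "$value" in
        ''|*[!0-9]*)
            echo "value must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$value" -gt 4096 ]; then
        echo "value must not exceed 4096" >&2
        return 1
    fi
    echo "esxcli system settings advanced set -o /Net/Vmxnet3RxPollBound -i $value"
}

# Print the command for a new value of 512; review it before running it.
build_set_cmd 512
```

Printing the command first makes it easy to review the value before applying it on a production host.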
On versions 8.0+
On these versions, the Poll and Queue values are separate advanced configuration options and are modified individually. It is recommended to set the Queue size to twice the Poll size, keeping in mind that the maximum is 4096. To change the values, use these commands:
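The sizing rule above can be sketched as a small calculation. `queue_for_poll` is a hypothetical helper that doubles the Poll size and caps the result at 4096. It only computes values; it does not name or set the 8.0+ options.

```shell
#!/bin/sh
# Given a Poll (batch) size, suggest a Queue size of twice that value,
# capped at the 4096 maximum mentioned above.
queue_for_poll() {
    poll=$1
    queue=$((poll * 2))
    if [ "$queue" -gt 4096 ]; then
        queue=4096
    fi
    echo "$queue"
}

# Suggested pairs for a few Poll sizes.
for poll in 256 512 2048 4096; do
    echo "poll=$poll queue=$(queue_for_poll "$poll")"
done
```

This yields Queue sizes of 512, 1024, 4096, and 4096. Once the Poll size exceeds 2048, the 2:1 ratio can no longer be kept because of the cap.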
Once the changes have been made, the vNIC must be reset, or the VM must be powered off and back on, for the new values to take effect.
Disclaimer: Increasing the default queue size may lead to higher latency. If the issue persists after adjusting the queue size, or if latencies become longer than expected, a support case may be required.