Large Packet Loss Observed in VMs Using VMXNET3 on ESXi Even After Increasing Guest OS Network Buffers to Maximum

Article ID: 427211

Products

VMware vSphere ESXi

Issue/Introduction

When a virtual machine on ESXi uses the VMXNET3 network adapter, significant packet loss may occur during periods of high network traffic or traffic bursts. Large numbers of packets can be dropped within the Guest OS, and the drops can persist even after the Guest OS network buffers have been increased to their maximum values. The sample statistics below show this: the dropped Rx, 1stRing Full, and OutOf Buffers counters are all high although the first ring size is already 4096.

VM network stats:

client               dvPort    client            client                                                  pktsTx    pktsRx     dropped  dropped    1stRing  1stRing   2ndRing  2ndRing  OutOf
Name                 Id        Type              SubType         portset       port       pktsTx         M-cast    M-cast     Tx       Rx         Size     Full      Size     Full     Buffers
------               ------    ------            -------         -------       ----       ------         ------    ------     -------  -------    -------  -------   -------  -------  -------
Linux Test VM01.eth0 1336      VMM Virtual NIC   Vmxnet3 Client  DvsPortset-1  988777666  1055681761507  194543    32572796   8        590668256  4096     19249161  512      0        19249161

Note: For more information on how to determine and increase the VMXNET3 ring buffer values in the Guest OS, see KB 324556.

Environment

 

  • VMware vSphere ESXi 7.x

  • VMware vSphere ESXi 8.0 and later

Cause

These packet drops can occur when multiple pollWorlds deliver packets to the same vNIC receive queue. VMXNET3 places an upper limit (default: 256) on the number of packets that can be queued for processing before being delivered to the Guest OS. If packets arrive faster than the queue is drained and the queue reaches this limit, any additional packets are dropped.
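
The effect of a bounded queue on bursty traffic can be illustrated with a toy model. The sketch below is not how ESXi is implemented; it only assumes that each round a burst of packets arrives, anything that does not fit in the queue is dropped, and up to one poll batch is then delivered to the guest. The function name simulate_drops and the burst sizes are hypothetical.

```shell
#!/bin/sh
# Toy model of a bounded receive queue (illustrative assumption, not ESXi code).
# Usage: simulate_drops QUEUE_BOUND POLL_BOUND BURST [BURST ...]
# Prints the total number of packets dropped across all bursts.
simulate_drops() {
    queue_bound=$1
    poll_bound=$2
    shift 2
    queued=0
    dropped=0
    for arriving in "$@"; do
        # Enqueue what fits; packets beyond the free space are dropped.
        free=$((queue_bound - queued))
        if [ "$arriving" -gt "$free" ]; then
            dropped=$((dropped + arriving - free))
            queued=$queue_bound
        else
            queued=$((queued + arriving))
        fi
        # Deliver at most one poll batch to the guest per round.
        if [ "$queued" -gt "$poll_bound" ]; then
            queued=$((queued - poll_bound))
        else
            queued=0
        fi
    done
    echo "$dropped"
}

# Same three bursts against the default bound and against larger bounds.
simulate_drops 256 256 100 600 100     # prints 344
simulate_drops 1024 512 100 600 100    # prints 0
```

In this model the 600-packet burst overflows a 256-entry queue by 344 packets, while a 1024-entry queue absorbs it and drains over the following rounds.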

Resolution

To resolve this, you must increase the queue size on the ESXi host running the affected VM. Both the queue size and the processing batch size must be adjusted. The required steps vary depending on your ESXi version.

Step 1: Verify Current Configuration

You can verify your current queue and poll bounds using the following commands via SSH on the ESXi host:

Bash
 
esxcfg-advcfg --get /Net/Vmxnet3RxPollBound
esxcfg-advcfg --get /Net/Vmxnet3RxQueueBound
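
To record both values in one pass, the two commands can be wrapped in a small loop. This is a minimal sketch that assumes esxcfg-advcfg prints a line of the form "Value of <option> is <number>"; the helper names advcfg_value and report_rx_bounds are hypothetical. An option that cannot be read on a given host is reported as unavailable.

```shell
#!/bin/sh
# Hypothetical helper: extract the trailing integer from one line of
# `esxcfg-advcfg --get` output. Assumes "Value of <option> is <number>".
advcfg_value() {
    sed -n 's/^Value of .* is \([0-9][0-9]*\)$/\1/p'
}

# Print both bounds as NAME=VALUE, or NAME=unavailable if the read fails.
report_rx_bounds() {
    for opt in Vmxnet3RxPollBound Vmxnet3RxQueueBound; do
        value=$(esxcfg-advcfg --get "/Net/$opt" 2>/dev/null | advcfg_value)
        echo "$opt=${value:-unavailable}"
    done
}

# Only query the host when the tool is actually present.
if command -v esxcfg-advcfg >/dev/null 2>&1; then
    report_rx_bounds
fi
```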

Alternatively, running esxcfg-info -a will display the configuration tree:

Plaintext
 
|----Option Name..................................Vmxnet3RxPollBound
|----Current Value................................256
|----Default Value................................256
|----Min Value....................................0
|----Max Value....................................4096
|----Hidden.......................................false
|----Parent......................................./Net/
|----Path........................................./Net/Vmxnet3RxPollBound
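
Because esxcfg-info -a prints the whole configuration tree, it helps to filter the output down to one option. The sketch below assumes the tree uses the dotted "name....value" layout shown above and that awk is available in the shell; the function name show_option is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read an esxcfg-info style tree on stdin and print the
# current and default value of a single option.
# Usage: esxcfg-info -a | show_option Vmxnet3RxPollBound
show_option() {
    awk -F'[.][.]+' -v opt="$1" '
        # Fields are split on runs of two or more dots: label, then value.
        $1 ~ /Option Name$/              { in_opt = ($2 == opt) }
        in_opt && $1 ~ /Current Value$/  { cur = $2 }
        in_opt && $1 ~ /Default Value$/  { def = $2 }
        END { if (cur != "") print opt, "current=" cur, "default=" def }
    '
}

# Only query the host when the tool is actually present.
if command -v esxcfg-info >/dev/null 2>&1; then
    esxcfg-info -a | show_option Vmxnet3RxPollBound
fi
```

A current value that differs from the default indicates that the option has already been changed on this host.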

Step 2: Increase Queue and Poll Sizes

Apply the appropriate commands for your ESXi version.

For ESXi 7.x: The Vmxnet3RxPollBound option controls both the processing batch size (Poll) and the software queue size (Queue). Set this value to 512 (or 1024 if required):

Bash
 
esxcfg-advcfg --set 512 /Net/Vmxnet3RxPollBound

For ESXi 8.0 and later: The two values are configured separately. It is recommended to set the queue size (Vmxnet3RxQueueBound) to double the poll size (Vmxnet3RxPollBound), up to the maximum of 4096:

Bash
 
esxcfg-advcfg --set 1024 /Net/Vmxnet3RxQueueBound
esxcfg-advcfg --set 512 /Net/Vmxnet3RxPollBound
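
The version-dependent choice above can be captured in a dry-run helper that only prints the commands to run, so they can be reviewed before anything changes on the host. This is a sketch under assumptions: the function name plan_rx_bounds is hypothetical, the ESXi major version is passed in by hand, and rejecting poll sizes below 1 is a choice made here rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the esxcfg-advcfg commands for a given
# ESXi major version and poll size. Nothing is changed on the host.
# Usage: plan_rx_bounds MAJOR_VERSION POLL_SIZE
plan_rx_bounds() {
    major=$1
    poll=$2
    if [ "$poll" -lt 1 ] || [ "$poll" -gt 4096 ]; then
        echo "poll size must be between 1 and 4096" >&2
        return 1
    fi
    if [ "$major" -ge 8 ]; then
        # 8.0 and later: separate queue bound, double the poll size, capped at 4096.
        queue=$((poll * 2))
        if [ "$queue" -gt 4096 ]; then
            queue=4096
        fi
        echo "esxcfg-advcfg --set $queue /Net/Vmxnet3RxQueueBound"
    fi
    # 7.x: the poll bound alone controls both batch size and queue size.
    echo "esxcfg-advcfg --set $poll /Net/Vmxnet3RxPollBound"
}

plan_rx_bounds 7 512
plan_rx_bounds 8 512
```

For version 8 and a poll size of 512, the helper prints the same two commands listed above. The major version can be read from the output of vmware -v on the host.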

Step 3: Apply the Changes

For the new buffer settings to take effect, the vNIC must be reset. The simplest way to achieve this is to power cycle the virtual machine:

  1. Gracefully shut down the Guest OS.

  2. Power the Virtual Machine back on.
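
When the power cycle has to be done from the ESXi shell rather than the vSphere Client, the two steps above can be sketched with vim-cmd. The sketch assumes that VMware Tools is running in the guest (needed for a graceful shutdown) and that vim-cmd vmsvc/power.getstate reports "Powered off" once the guest has stopped; the function names, the VIM_CMD and POLL_SECONDS variables, and the retry count are hypothetical.

```shell
#!/bin/sh
# Command used to talk to the host; overridable so the logic can be exercised
# without an ESXi host (assumption for illustration).
VIM_CMD=${VIM_CMD:-vim-cmd}
POLL_SECONDS=${POLL_SECONDS:-5}

# Succeeds when the VM with the given ID reports "Powered off".
vm_is_off() {
    $VIM_CMD vmsvc/power.getstate "$1" | grep -q "Powered off"
}

# Gracefully shut down the guest, wait until it is off, then power it on.
# Usage: power_cycle_vm VMID [MAX_CHECKS]
power_cycle_vm() {
    vmid=$1
    tries=${2:-60}
    $VIM_CMD vmsvc/power.shutdown "$vmid" || return 1
    while ! vm_is_off "$vmid"; do
        tries=$((tries - 1))
        if [ "$tries" -le 0 ]; then
            echo "VM $vmid did not power off in time" >&2
            return 1
        fi
        sleep "$POLL_SECONDS"
    done
    $VIM_CMD vmsvc/power.on "$vmid"
}
```

The VM ID for power_cycle_vm can be looked up with vim-cmd vmsvc/getallvms. The script never forces a power-off; if the guest does not shut down in time, it stops and leaves the VM as it is.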

 

Disclaimer: Increasing the default queue size allocates more memory and may lead to higher baseline latency for network traffic. Monitor the VM's performance after making these changes. If packet loss persists or if network latency becomes unacceptably high, please open a support case for deeper analysis.
