NSX workload & edge throughput performance tuning

Article ID: 402459


Products

VMware NSX

Issue/Introduction

If a single workload VM or NSX Edge is experiencing packet loss, frequent retransmits, or increased packet latency due to a high packets-per-second (pps) rate, the cause may be the default configuration of the network card driver.

Environment

This tuning applies to environments with a small number of high-bandwidth VMs, such as NSX Edges or database VMs, whose send and receive pps is very high in proportion to the rest of the environment.

Cause

With the default driver configuration, the TX and RX queues for processing packets are spread evenly across multiple VMs. However, for NSX Edges and other high packet-processing VMs (databases, file servers, virtual firewalls, virtual routers, etc.), the queues do not need to be spread across multiple workloads. Instead, more queues should be allocated to these individual VMs, giving them the processing capacity to move their packets along faster.

Resolution

For latency-sensitive workloads, ensure the values below are configured in the VM's .vmx file to get the most bi-directional throughput out of the vNIC. It is also important that high-bandwidth workload VMs have an adequate number of vCPUs, since the threads for processing their queues are allocated from the VM's vCPU allotment.
Latency Tuning

Once the above values have been set, use the document below to optimize the RSS queues on the pNIC to better distribute the packet-processing load among the available cores for single high-bandwidth VMs. The physical NIC queue allotment is allocated from the host's own CPU threads, one per queue.
Driver Tuning

 

Queuing and Buffers at vNIC Layer

  • In the VM's .vmx file, edit or append the parameters below.
    • Transmit Queuing: ethernetX.ctxPerDev = "3"
    • Receive Queuing: ethernetX.pnicFeatures = "4"
    • Receive Queuing for UDP packets: ethernetX.udpRSS = "1"
  • On the ESXi host in question, edit the physical NIC ring buffer, or consult your hardware vendor for the recommended ring buffer values for your network card:
    • esxcli network nic ring current set -n vmnic# -r 4096 -t 4096

    Note: The options above relate to transmit queues, receive queues, and UDP, respectively. For the transmit queues:
    ethernetX.ctxPerDev = "1" - one TX thread per vNIC / port.
    ethernetX.ctxPerDev = "2" - one TX thread per VM (default).
    ethernetX.ctxPerDev = "3" - one TX thread per queue per vNIC.
    For receive queues:
    ethernetX.pnicFeatures = "4" - RSS enabled, or "5" - LRO+RSS enabled.
    ethernetX.udpRSS = "1" - enables UDP RSS.
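As a sketch, the vNIC parameters above can be appended from the ESXi shell while the VM is powered off. The datastore path, VM name, and VM ID below are placeholders for your environment; substitute your own values.

```shell
# Placeholder path -- substitute your datastore and VM directory.
VMX="/vmfs/volumes/datastore1/myvm/myvm.vmx"

# The VM must be powered off before its .vmx file is edited.
# Append the queuing parameters for the first vNIC (ethernet0);
# repeat for each additional ethernetX device as needed.
cat >> "$VMX" <<'EOF'
ethernet0.ctxPerDev = "3"
ethernet0.pnicFeatures = "4"
ethernet0.udpRSS = "1"
EOF

# Reload the VM configuration so the changes are picked up on next power-on.
# Find the VM ID first, then reload it (replace <vmid> with the ID returned).
vim-cmd vmsvc/getallvms | grep myvm
vim-cmd vmsvc/reload <vmid>
```

Editing the .vmx through the vSphere Client's advanced VM settings achieves the same result if shell access is not available.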

Queuing and Buffers at pNIC / ESXi Stack Layer

  • Buffers: 
    • esxcli system settings advanced set -o /Mem/ShareCOSBufSize -i 32
  • Receive Queuing: The command below is specific to bnxtnet network cards; the driver name will differ depending on your hardware vendor. The first command identifies the driver in use.
    • esxcli network nic list
    • esxcli system module parameters set -m bnxtnet -p 'DRSS=8'
  • Transmit Queuing:
    • esxcli system settings advanced set -i 1 -o /Net/NetSchedHClkMQ
    • esxcli system settings advanced set -i 4 -o /Net/NetSchedHClkMaxHwQueue
    • esxcli system settings advanced set -i 1 -o /Net/NetSchedHClkVnicMQ  
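As a sketch, the pNIC-layer settings above can be inspected before and after applying them; `bnxtnet` and `vmnic0` are examples only, so substitute the driver and NIC reported by `esxcli network nic list` in your environment.

```shell
# Inspect current values before changing them.
esxcli system settings advanced list -o /Mem/ShareCOSBufSize
esxcli system settings advanced list -o /Net/NetSchedHClkMQ
esxcli system settings advanced list -o /Net/NetSchedHClkMaxHwQueue
esxcli system settings advanced list -o /Net/NetSchedHClkVnicMQ

# Apply the buffer and transmit queuing settings from this article.
esxcli system settings advanced set -o /Mem/ShareCOSBufSize -i 32
esxcli system settings advanced set -i 1 -o /Net/NetSchedHClkMQ
esxcli system settings advanced set -i 4 -o /Net/NetSchedHClkMaxHwQueue
esxcli system settings advanced set -i 1 -o /Net/NetSchedHClkVnicMQ

# Confirm the DRSS module parameter on the driver (bnxtnet is an example;
# module parameter changes require a host reboot to take effect).
esxcli system module parameters list -m bnxtnet | grep DRSS

# Check the current ring buffer sizes on the NIC (vmnic0 is an example).
esxcli network nic ring current get -n vmnic0
```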

Ensure VM is not moved out from queueing

  • esxcli system settings advanced set -o /Net/NetNetqLoadAvgPktCountShift -i 30
  • esxcli system settings advanced set -o /Net/NetNetqLoadAvgByteCountShift -i 50
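A quick way to confirm the NetQ load-average thresholds above took effect is to read the values back; the Int Value column should show 30 and 50, respectively.

```shell
# Confirm the NetQ load-average shift values were applied.
esxcli system settings advanced list -o /Net/NetNetqLoadAvgPktCountShift
esxcli system settings advanced list -o /Net/NetNetqLoadAvgByteCountShift
```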

Refer to the KB below to validate the performance within the Edge appliance itself, or open a case with support so we can assist you with that validation.
Troubleshooting NSX Edge and Virtual Machine (VM) Performance

 

Additional Information

If you have any questions about performance tuning for NSX workloads or Edges, or about whether these changes are necessary in your environment, please open a case and we will assist with collecting the required data to determine how best to improve the performance of your workloads and Edges.
Uploading files to cases on the Broadcom Support Portal

Creating and managing Broadcom support cases