If a single workload VM or NSX Edge is experiencing packet loss, frequent retransmits, or increased latency under a high packets-per-second (pps) load, the cause may be the default configuration of the network card driver.
This tuning is needed in environments where a few high-bandwidth VMs, such as Edges or database VMs, generate very high send and receive pps in proportion to the rest of the environment.
With the default driver configuration, the TX and RX queues for processing packets are spread evenly across multiple VMs. However, in environments with NSX Edges and other high packet-processing VMs (databases, file servers, virtual firewalls, virtual routers, etc.), the queues do not need to be spread across multiple workloads. Instead, more queues should be allocated to those individual VMs, giving the high packet-processing VMs the resources they need to move their packets along faster.
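If you are unsure which VMs are driving the packet rate, the esxtop network view on the host shows per-port transmit and receive packet rates. The batch-mode capture below is a minimal sketch; the sample interval, iteration count, and output path are arbitrary values chosen for illustration:

# Capture esxtop counters (including per-port network packet rates) in batch mode:
# 10-second samples, 6 iterations, written to a CSV file for offline review.
esxtop -b -d 10 -n 6 > /tmp/esxtop-capture.csv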
For latency-sensitive workloads, make sure the values below are configured in the VM's .vmx file to get the most bi-directional throughput out of the vNIC. High-bandwidth workload VMs also need an adequate number of vCPUs, since the threads that process their queues are allocated from the VM's vCPU allotment.
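As a concrete example, for the first vNIC of a latency-sensitive VM the entries from the Latency Tuning section below would look like this; ethernet0 is an example index, so adjust it to match the vNIC being tuned:

ethernet0.ctxPerDev = "3"
ethernet0.pnicFeatures = "4"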
Latency Tuning

Set the following values in the VM's .vmx file (replace X with the vNIC index) and increase the ring sizes on the physical NIC carrying its traffic:

ethernetX.ctxPerDev = "3"
ethernetX.pnicFeatures = "4"
esxcli network nic ring current set -n vmnic# -r 4096 -t 4096

For reference, the available values are:

ethernetX.ctxPerDev = "1" - one TX thread per vNIC / port.
ethernetX.ctxPerDev = "2" - one TX thread per VM (default).
ethernetX.ctxPerDev = "3" - one TX thread per vNIC per queue.
ethernetX.pnicFeatures = "4" - enables RSS; "5" enables LRO + RSS.
ethernetX.udpRSS = "1" - enables UDP RSS.

Once the above values have been edited, you can use the Driver Tuning commands below to optimize the RSS queues on the pNIC so that the packet-processing load of a single high-bandwidth VM is better distributed among the available cores. The physical NIC's queue allotment is allocated from the host's own CPU threads for each queue.

Driver Tuning

esxcli system settings advanced set -o /Mem/ShareCOSBufSize -i 32
esxcli network nic list - identify the pNIC and the driver in use.
esxcli system module parameters set -m bnxtnet -p 'DRSS=8' - example shown for the bnxtnet driver module.
esxcli system settings advanced set -i 1 -o /Net/NetSchedHClkMQ
esxcli system settings advanced set -i 4 -o /Net/NetSchedHClkMaxHwQueue
esxcli system settings advanced set -i 1 -o /Net/NetSchedHClkVnicMQ
esxcli system settings advanced set -o /Net/NetNetqLoadAvgPktCountShift -i 30
esxcli system settings advanced set -o /Net/NetNetqLoadAvgByteCountShift -i 50

Please refer to the KB below to validate the performance within the edge appliance itself, or open a case with support so we can assist you with that validation.
Troubleshooting NSX Edge and Virtual Machine (VM) Performance
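At the host level you can also confirm that the changes took effect before re-testing. The commands below are a sketch: vmnic0 is an example adapter name, and the option and module names match the Driver Tuning examples above:

# Confirm the advanced network scheduler options hold the intended values.
esxcli system settings advanced list -o /Net/NetSchedHClkMQ
esxcli system settings advanced list -o /Net/NetSchedHClkMaxHwQueue
# Confirm the ring sizes on the physical NIC (vmnic0 is an example).
esxcli network nic ring current get -n vmnic0
# Confirm the module parameters set for the NIC driver (bnxtnet in the example above).
esxcli system module parameters list -m bnxtnet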
If you have any questions regarding performance tuning for NSX workloads or Edges, or whether these changes are necessary in your environment, please open a case and we will assist with collecting the data required to determine how best to improve the performance of your workloads and Edges.
Uploading files to cases on the Broadcom Support Portal
Creating and managing Broadcom support cases