High CPU Utilization on ENS lcores

Article ID: 418323

Updated On:

Products

VMware NSX

Issue/Introduction

  • ESXi hosts running NSX with Enhanced Network Stack (ENS) experience high CPU utilization on one or more ENS lcores.
  • Workloads on the affected host may see packet drops or degraded network performance.

Environment

  • VMware NSX 4.x, 4.2.1, 9.0

Cause

  • The most common cause is an unbalanced traffic load across the ENS lcores.
    • This often happens when the default lcore assignment mode, vNIC-count, is used with asymmetric traffic.
    • In this mode, each lcore is assigned an equal number of vNICs, regardless of how much traffic each vNIC carries.
    • If one vNIC (and its VM) handles significantly more traffic than others, the lcore assigned to it becomes a bottleneck, resulting in high CPU utilization for that specific core.
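The imbalance described above can be illustrated with a small Python sketch. It is a simplified model, not the actual ENS scheduler: the round-robin placement, the vNIC names, and the traffic figures (in kpps) are all hypothetical and chosen only to show how an equal vNIC count can still produce an unequal load.

```python
from collections import defaultdict


def assign_by_vnic_count(vnic_traffic, num_lcores):
    """Place vNICs on lcores by count only, ignoring per-vNIC traffic.

    Simplified model: round-robin placement gives every lcore the same
    number of vNICs (assumption, not the real ENS algorithm).
    """
    load = defaultdict(int)
    for index, (vnic, kpps) in enumerate(vnic_traffic.items()):
        lcore = index % num_lcores  # equal vNIC count per lcore
        load[lcore] += kpps         # accumulate traffic handled by that lcore
    return dict(load)


# Hypothetical traffic per vNIC in kpps; vm1-eth0 is the heavy talker.
traffic = {"vm1-eth0": 2000, "vm2-eth0": 100, "vm3-eth0": 100, "vm4-eth0": 100}

per_lcore = assign_by_vnic_count(traffic, num_lcores=2)
print(per_lcore)  # {0: 2100, 1: 200}
```

Both lcores own two vNICs, yet lcore 0 carries roughly ten times the traffic of lcore 1, which is the pattern that shows up as high CPU utilization on a single core.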

Resolution

Collect the following information, then follow the KB article "Creating and managing Broadcom support request (SR) cases" to engage Broadcom Support:

  1. Validate Host CPU utilization

    1. Log in to the affected ESXi host CLI.

    2. Run esxtop and press c to view the CPU utilization screen.

    3. Identify high utilization (for example, more than 80% PCPU USED) on worlds related to the ENS data plane.
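If the host is also sampled with esxtop in batch mode (for example, esxtop -b -n 1 redirected to a CSV file), a short Python sketch such as the following could flag busy physical CPUs. The column naming ("Physical Cpu(N)\% Util Time"), the host name esx01, and the sample values are assumptions for illustration; check them against the header of the real capture. The sketch only reports per-PCPU utilization, so mapping a busy PCPU to an ENS world still requires the interactive view described above.

```python
import csv
import io
import re

# Assumed batch-mode column suffix; verify against the real CSV header.
PCPU_COLUMN = re.compile(r"Physical Cpu\((\d+)\)\\% Util Time$")


def busy_pcpus(csv_text, threshold=80.0):
    """Return {pcpu_id: peak_util} for PCPUs whose peak exceeds threshold."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    busy = {}
    for col, name in enumerate(header):
        match = PCPU_COLUMN.search(name)
        if not match:
            continue  # not a per-PCPU utilization column
        peak = max(float(row[col]) for row in samples)
        if peak > threshold:
            busy[int(match.group(1))] = peak
    return busy


# Hypothetical two-PCPU capture with a single sample row.
sample = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Physical Cpu(0)\% Util Time",'
    r'"\\esx01\Physical Cpu(1)\% Util Time"',
    '"01/01/2025 00:00:00","92.5","12.0"',
])
print(busy_pcpus(sample))  # {0: 92.5}
```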

  2. Run the following commands from nsxcli on the affected ESXi host:

    • get ens dev affinity list

    • get ens switch list

    • get transport-node nsx-stats module host-fastpath-ens-lcore counters all

    • get ens flow-stats <switch-id-arg> <lcore-ID-arg>

    • get ens lcore-assignment-mode <hs-name-arg>

    • get ens port list <switch-id-arg>

    • get ens latency lcore config <switch-id-arg>

    • get ens latency lcore dump <switch-id-arg>

  3. Collect NSX logs for the Management Nodes
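The nsxcli commands in step 2 can be gathered into a single text file for the support request. The Python sketch below is one possible way to do that, not an official collection tool. It assumes that nsxcli accepts a single command through the -c option on the host. The switch ID, lcore ID, and host switch name are placeholders that must be replaced with values taken from the output of get ens switch list and get ens dev affinity list.

```python
import subprocess


def build_commands(switch_id, lcore_id, hs_name):
    """Return the nsxcli commands from step 2 with the arguments filled in."""
    return [
        "get ens dev affinity list",
        "get ens switch list",
        "get transport-node nsx-stats module host-fastpath-ens-lcore counters all",
        f"get ens flow-stats {switch_id} {lcore_id}",
        f"get ens lcore-assignment-mode {hs_name}",
        f"get ens port list {switch_id}",
        f"get ens latency lcore config {switch_id}",
        f"get ens latency lcore dump {switch_id}",
    ]


def run_nsxcli(command):
    """Run one command non-interactively (assumes `nsxcli -c` is available)."""
    result = subprocess.run(
        ["nsxcli", "-c", command], capture_output=True, text=True
    )
    return result.stdout + result.stderr


def collect(commands, runner=run_nsxcli):
    """Run each command and label its output with the command that produced it."""
    sections = [f"### {command}\n{runner(command)}" for command in commands]
    return "\n".join(sections)


# Example usage on the host (placeholder arguments):
# report = collect(build_commands(switch_id=0, lcore_id=0, hs_name="<hs-name>"))
# with open("/tmp/ens-diagnostics.txt", "w") as handle:
#     handle.write(report)
```

Passing the runner as a parameter keeps the command list testable off-host and makes it easy to swap in a different execution method if nsxcli -c behaves differently in a given release.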

Additional Information