HCX - Network Extension (NE) appliance high CPU condition

Article ID: 388224

Products

VMware HCX

Issue/Introduction

  • Network performance between VMs communicating over a network extended by the HCX Network Extension (NE) appliance is degraded.
  • You are experiencing packet drops when the NE appliance CPU reaches 100%.
  • In vCenter, the performance chart for the NE appliance shows high CPU usage.
  • In the output of the top command on the NE appliance (using Shift + C), the process consuming the most CPU is ksoftirqd, the kernel thread that performs the per-packet IPsec processing:
    SSH into the HCX Manager, then run:
    ccli > list > go <NE_ID#> > ssh > top
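
For reference, the same workflow is shown below broken into individual commands. The node ID placeholder is from the original chain; the '1' and 'P' keystrokes are standard Linux top options, mentioned here as a suggestion:

    # From an SSH session on the HCX Manager:
    ccli                 # enter the HCX central CLI
    list                 # list the deployed appliances and their node IDs
    go <NE_ID#>          # select the NE appliance by its node ID
    ssh                  # open a shell on the NE appliance
    top                  # press '1' for per-core view, 'P' to sort by %CPU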

 

Cause

High CPU usage by the software process ksoftirqd indicates that the traffic load is causing heavy per-packet IPsec encryption and decryption on the NE appliance VM. The following factors commonly contribute to this load:

  • Number of VLANs being extended
    The HCX NE appliance supports up to 8 extended VLANs, but large VLANs with many VMs generate more packet traffic. Each packet requires CPU-intensive IPsec encryption and decryption, so as the packet rate rises, CPU consumption increases and performance can suffer (see the packet-rate sketch after this list).

  • Migrated VMs communicating through the on-premises gateway
    When VMs are migrated to the cloud and need to communicate with VMs on a different segment, their packets are sent to the on-premises gateway even when the destination is back in the cloud. Packets travel from the cloud to on-premises and back to the cloud, creating unnecessary processing overhead. This round trip increases the number of packets that IPsec must encrypt and decrypt, raising CPU consumption on the HCX NE appliance.
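As a back-of-the-envelope illustration of why packet rate, not just throughput, drives the IPsec load, the shell arithmetic below estimates packets per second at 4 Gbps for two frame sizes. The frame sizes are illustrative assumptions; the point is that each packet must be individually encrypted or decrypted:

    # Approximate packets/sec at 4 Gbps for two frame sizes (illustrative):
    echo $(( 4000000000 / 8 / 1500 ))   # ~333,333 pps with 1500-byte frames
    echo $(( 4000000000 / 8 / 64 ))     # ~7,812,500 pps with 64-byte frames

The same throughput can therefore require over 20 times the per-packet IPsec work when traffic is dominated by small packets, which is why chatty east-west traffic can saturate the NE CPU well below the appliance's rated throughput.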

Resolution

To avoid these scenarios, evaluate which of the following recommendations can help address the issue (a quick way to re-check the CPU load afterwards is sketched after the list).

  1. The HCX NE appliance supports extending up to 8 networks on the same appliance. If you have multiple VLANs extended on the same appliance, deploy additional NE appliances and distribute the extended networks among them to balance the load and reduce CPU utilization.
  2. Identify the traffic load causing the high CPU utilization. If the traffic is Layer 2 (same subnet) between the sites, consider migrating the communicating VMs to the same site so that HCX no longer uses the NE appliance for this traffic. If the traffic is Layer 3 (routed), consider migrating the VMs to the same site and enabling Mobility Optimized Networking (MON). For more information: Understanding Network Extension with Mobility Optimized Networking
  3. If your underlay network is secure, you may consider disabling IPsec encryption on the NE appliance by selecting "The underlay is secure" in the network profile and Service Mesh. For more information, refer to: Create a Network Profile.
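
After applying any of these changes, you can re-check on the NE appliance whether ksoftirqd is still saturating a core. A minimal sketch follows; it assumes the standard Linux /proc/softirqs interface and that the watch utility is present on the appliance, which may vary by HCX release:

    # On the NE appliance shell (reached via ccli > go <NE_ID#> > ssh):
    top                                                   # press '1': no core should sit near 100% si (softirq)
    watch -n 2 'grep -E "NET_RX|NET_TX" /proc/softirqs'   # per-core network softirq counters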

Additional Information

VMware HCX 4.11.0 Configuration Limits
Expected throughput is 4-6+ Gbps per HCX Network Extension appliance. Observed performance in similar environments can vary depending on factors such as MTU, latency, environment traffic, network bandwidth, CPU capacity, and memory resources.