Service Engine on Azure may function with low performance if DPDK mode is enabled but necessary modules fail to load on SE startup

Article ID: 391856

Products

VMware Avi Load Balancer

Issue/Introduction

  • Traffic performance degrades after upgrading to version 22.1.6-2px.
  • In Azure environments with Mellanox NICs, DPDK does not function unless the required kernel modules are loaded.

Environment

  • Azure Cloud
  • Avi Load Balancer version 22.1.6

Cause

If the required kernel modules fail to load at startup, the Service Engine (SE) boots in DPDK-DTAP mode, which is more restrictive than PCAP mode and can severely impact datapath functionality.
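
As a quick supplementary check after logging in to the SE as root, the loaded-module list can be inspected directly. The following is a minimal sketch only: the article does not name the modules involved, so the list used here (mlx5_core, mlx5_ib, ib_uverbs) is an assumption based on the kernel modules the upstream DPDK mlx5 driver depends on, and the function name check_mlx5_modules is hypothetical. Modules compiled into the kernel do not appear in /proc/modules, so a "missing" result is a hint, not proof.

```shell
# Hypothetical helper: report which assumed mlx5-related modules are absent
# from a /proc/modules-style listing (default: /proc/modules).
check_mlx5_modules() {
    modules_file="${1:-/proc/modules}"
    missing=""
    # Assumed module list (not taken from the article).
    for mod in mlx5_core mlx5_ib ib_uverbs; do
        # Each line of /proc/modules starts with the module name and a space.
        if ! grep -q "^${mod} " "$modules_file"; then
            missing="${missing}${missing:+ }${mod}"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing: $missing"
        return 1
    fi
    echo "all loaded"
}
```

Calling check_mlx5_modules with no argument reads /proc/modules; a non-zero exit status means at least one of the assumed modules was not listed.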

To verify this, log in to the affected SE and inspect the following log:

 
[admin:] > attach serviceengine <SE_NAME>
admin@Avi-se:~$ sudo -i
[sudo] password for admin: <enter password>
 
root@Avi-se:~# vi /var/log/upstart/se_info.log

Look for errors related to net_mlx5, such as "Failed to attach device". For example:

Core:0 11/19/24 09:37:11.729162 UTC EAL: memzone_reserve_aligned_thread_unsafe mz->iova:000000040e7aad80, mz->addr:0x6001cb9aad80 mz->name:mlx5_pmd_shared_data mz:0x6000000147e0 len:0
Core:0 11/19/24 09:37:11.729447 UTC net_mlx5: mlx5.c:3347: mlx5_pci_probe(): no Verbs device matches PCI device eda7:00:02.0, are kernel drivers loaded?
Core:0 11/19/24 09:37:11.729474 UTC EAL: Driver cannot attach the device (eda7:00:02.0)
Core:0 11/19/24 09:37:11.729479 UTC EAL: Failed to attach device on primary process
Core:0 11/19/24 09:37:11.729485 UTC net_failsafe: sub_device 0 probe failed (No such file or directory)
Core:0 11/19/24 09:37:11.729494 UTC EAL: Inserting args remote=eth0 device net_tap_vsc0
Core:0 11/19/24 09:37:11.729916 UTC tun_alloc(): Rx trigger disabled: Device or resource busy 

If the above kernel-related log entries are not present, the issue is likely NOT related to kernel module loading.
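
The log inspection described above can also be done without opening the file in an editor. The following is a minimal sketch, assuming grep is available on the SE. The function name check_se_dpdk_log is hypothetical, and the patterns are copied from the sample log entries in this article rather than from an exhaustive list of failure messages.

```shell
# Hypothetical helper: report whether an SE log contains the module-load
# failure signatures shown in the sample log entries.
check_se_dpdk_log() {
    # Default path is the log file named in this article.
    log_file="${1:-/var/log/upstart/se_info.log}"
    # Patterns taken from the sample log lines (net_mlx5 probe and EAL attach errors).
    pattern='net_mlx5: .*are kernel drivers loaded\?|EAL: Failed to attach device|EAL: Driver cannot attach the device'
    if grep -qE "$pattern" "$log_file"; then
        echo "signatures found"
        return 0
    fi
    echo "signatures absent"
    return 1
}
```

Run check_se_dpdk_log as root on the SE with no argument to check the default log path. A non-zero exit status means none of the sample signatures were found.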

Resolution

Workaround:

To work around the issue, set the se_dpdk_pmd knob to 2 (PCAP mode) in the Service Engine (SE) group, and then reboot all Service Engines in that group.

Command:

[admin:ctrl]: > configure serviceenginegroup Default-Group 

[admin:ctrl]: serviceenginegroup> se_dpdk_pmd 2 

[admin:ctrl]: serviceenginegroup> save 

 


PCAP Mode:

With se_dpdk_pmd set to 2, every Service Engine in the group is forced to operate in PCAP mode. Additionally, after an upgrade from a previous release, se_dpdk_pmd is set to the new value for all existing Service Engine Groups.

Important:
If the mode is changed for a Service Engine Group containing SEs, you must reboot the SEs for the new setting to take effect.

Reference Documentation:
https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/avi-load-balancer/avi-load-balancer/22-1/vmware-avi-load-balancer-installation-guide/installing-nsx-alb-in-microsoft-azure/additional-deployment-options-microsoft-azure-/service-engine-group-configuration/dpdk-support-for-service-engines-in-azure.html

 

Issue Identification and Fix:

The issue has been identified as a bug, and a fix has been included in the following versions and patches:

Fix Versions:

  • 22.1.7-2p7
  • 31.2.1
  • 30.2.3
  • 31.1.2

Bug ID:

  • AV-223059

Release Notes:
For detailed information about the fix, refer to the Release Notes for 22.1.6.