Kubernetes pod restarts multiple times with ERROR_DPDK_DEV_START

Article ID: 345710

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
When a pod attached to multiple DPDK interfaces is instantiated, it crashes and restarts several times with the following error signature:
{"version":"0.2.0","timestamp":"2023-06-21T10:15:18.935+02:00","severity":"info","service_id":"pod-up-data-plane","metadata":{"proc_id":"8"},"message":"[pio] USER1: Port 2 flags - multicast 0x1, promisc 0x0!"}
{"version":"0.2.0","timestamp":"2023-06-21T10:15:18.936+02:00","severity":"info","service_id":"pod-up-data-plane","metadata":{"proc_id":"8"},"message":"[pio] vmxnet3_dev_start(): Device activation: UNSUCCESSFUL"}
{"version":"0.2.0","timestamp":"2023-06-21T10:15:18.936+02:00","severity":"error","service_id":"pod-up-data-plane","metadata":{"proc_id":"8"},"message":"[pktio_libpio_init] pio_init() => ERROR_DPDK_DEV_START"}
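To check whether a pod log contains this signature, the two failure messages can be filtered out of the data-plane log. The following is a minimal Python sketch, assuming the log is one JSON object per line as in the sample above. The function name `find_dpdk_start_failures` and the choice of marker strings are illustrative, not part of any VMware or DPDK tooling.

```python
import json

# Substrings taken from the sample log lines in this article.
MARKERS = ("ERROR_DPDK_DEV_START", "Device activation: UNSUCCESSFUL")


def find_dpdk_start_failures(lines):
    """Return (timestamp, service_id, message) for each matching log line.

    Hypothetical helper: assumes one JSON object per line with
    "timestamp", "service_id" and "message" keys, as in the sample.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON
        if not isinstance(record, dict):
            continue
        message = str(record.get("message", ""))
        if any(marker in message for marker in MARKERS):
            hits.append((record.get("timestamp"), record.get("service_id"), message))
    return hits


# The three sample lines from the Symptoms section.
SAMPLE_LOG = [
    '{"version":"0.2.0","timestamp":"2023-06-21T10:15:18.935+02:00","severity":"info","service_id":"pod-up-data-plane","metadata":{"proc_id":"8"},"message":"[pio] USER1: Port 2 flags - multicast 0x1, promisc 0x0!"}',
    '{"version":"0.2.0","timestamp":"2023-06-21T10:15:18.936+02:00","severity":"info","service_id":"pod-up-data-plane","metadata":{"proc_id":"8"},"message":"[pio] vmxnet3_dev_start(): Device activation: UNSUCCESSFUL"}',
    '{"version":"0.2.0","timestamp":"2023-06-21T10:15:18.936+02:00","severity":"error","service_id":"pod-up-data-plane","metadata":{"proc_id":"8"},"message":"[pktio_libpio_init] pio_init() => ERROR_DPDK_DEV_START"}',
]

if __name__ == "__main__":
    for timestamp, service_id, message in find_dpdk_start_failures(SAMPLE_LOG):
        print(timestamp, service_id, message)
```

Run against the sample, this reports the vmxnet3 activation failure and the ERROR_DPDK_DEV_START line and skips the informational port-flags line.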
Further analysis of vmkernel.log indicates a failure to reserve memory for the interface. Log snippet:

2023-06-27T02:11:46.482Z cpu68:33812799)VmMemCow: 1772: p2m update: cannot reserve - cur 0 0 rsvd 1029 req 257 avail 1279
2023-06-27T02:11:46.482Z cpu68:33812799)Vmxnet3: 11366: Failed to map the rx data ring for rq 0
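The numbers in the "cannot reserve" line can be extracted for a quick sanity check. The sketch below is a minimal Python example that parses the fields exactly as they appear in the sample. The article does not document what the fields mean or what units they use, so comparing `rsvd + req` against `avail` is an assumption made for illustration only. The function name `parse_reserve_failure` and the labels `cur_a` and `cur_b` for the two `cur` values are hypothetical.

```python
import re

# Pattern built from the sample vmkernel.log line in this article.
RESERVE_RE = re.compile(
    r"cannot reserve - cur (?P<cur_a>\d+) (?P<cur_b>\d+) "
    r"rsvd (?P<rsvd>\d+) req (?P<req>\d+) avail (?P<avail>\d+)"
)


def parse_reserve_failure(line):
    """Return the numeric fields of a 'cannot reserve' line, or None."""
    match = RESERVE_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


SAMPLE_LINE = (
    "2023-06-27T02:11:46.482Z cpu68:33812799)VmMemCow: 1772: p2m update: "
    "cannot reserve - cur 0 0 rsvd 1029 req 257 avail 1279"
)

fields = parse_reserve_failure(SAMPLE_LINE)

# Assumption: the request fails when rsvd + req exceeds avail.
# For the sample: 1029 + 257 - 1279 = 7.
shortfall = fields["rsvd"] + fields["req"] - fields["avail"]

if __name__ == "__main__":
    print(fields)
    print("shortfall under the stated assumption:", shortfall)
```

Under that assumption, the sample line shows a request of 257 on top of 1029 already reserved, which exceeds the 1279 available by 7. That would be consistent with the reservation failure reported in the log.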



Cause


This can happen when multiple vNICs are enabled at the same time and the memory reservation for their rings fails, as shown in the vmkernel.log entries.

NOTE: This issue is observed on ESXi 7 releases prior to the ESXi 7.0.3 patch.

Resolution


- To address the issue, upgrade to the ESXi 7.0.3 patch or to an ESXi 8.0 release (the fix is already present in ESXi 8.x).

- If an upgrade is not an option, apply the workaround described in the following section.




Workaround:

The following workaround can be applied:

- Power off the VM(s) on the host, OR migrate all powered-on virtual machines from the host on which you need to make the change to other ESXi hosts in the cluster.

- From the UI, go to Host > Manage > Advanced Settings.

- Search for ShareCOSBufSize.

- Edit the value to 32 (the maximum supported size).

- Make sure the new value is shown in the Value section.

- Power on the VM(s), OR migrate the virtual machines back to the host on which the change was made.