VMXNET3 Driver Fails Due to Kernel Memory Allocation Error in RHEL Guest OS

Article ID: 406203

Products

VMware vSphere ESXi

Issue/Introduction

  • A virtual machine running a RHEL guest OS randomly freezes when VMware Tools stops running unexpectedly.

  • The virtual machine's vmware.log file contains entries similar to:

    YYYY-MM-DDThh:mm:ss.558Z Wa(03) vmx 177a928a-99-3175 ToolsTimeSyncRpcCompletionRoutine: Timeout while sending guestrpc
    YYYY-MM-DDThh:mm:ss.737Z In(05) vmx 177a928a-99-3175 GuestRpcSendTimedOut: message to toolbox timed out.

  • The dmesg output inside the guest OS contains entries similar to:

    [    6.474184] vmxnet3 0000:0b:00.0 eth0: intr type 3, mode 0, 5 vectors allocated
    [    6.475497] vmxnet3 0000:0b:00.0 eth0: NIC Link is Up 10000 Mbps
    [    6.697022] vmxnet3 0000:0b:00.0 eth0: intr type 3, mode 0, 5 vectors allocated
    [    6.698412] vmxnet3 0000:0b:00.0 eth0: NIC Link is Up 10000 Mbps
    [    6.718655] WARNING: CPU: 2 PID: 1388 at mm/page_alloc.c:4987 __alloc_pages_nodemask+0x2a5/0x330
    [    6.718764] PKRU: 55555554
    [    6.718765] Call Trace:
    [    6.718771]  ? __warn+0x94/0xe0
    [    6.718777]  ? __alloc_pages_nodemask+0x2a5/0x330
    [    6.718779]  ? __alloc_pages_nodemask+0x2a5/0x330
    [    6.718781]  ? report_bug+0xb1/0xe0
    [    6.718785]  ? do_error_trap+0x9e/0xd0
    [    6.718789]  ? do_invalid_op+0x36/0x40
    [    6.718791]  ? __alloc_pages_nodemask+0x2a5/0x330
    [    6.718792]  ? invalid_op+0x14/0x20
    [    6.718797]  ? __alloc_pages_nodemask+0x2a5/0x330
    [    6.718799]  ? __alloc_pages_nodemask+0x166/0x330
    [    6.718801]  __dma_direct_alloc_pages.constprop.25+0x8a/0x120
    [    6.718805]  dma_direct_alloc+0x5a/0x290
    [    6.718808]  vmxnet3_rq_create.part.61+0x17c/0x1f0 [vmxnet3]
    [    6.718815]  vmxnet3_create_queues+0x267/0x360 [vmxnet3]
    [    6.718819]  vmxnet3_set_ringparam+0x243/0x320 [vmxnet3]
    [    6.718824]  ethnl_set_rings+0x1f0/0x2f0
    [    6.718829]  genl_family_rcv_msg_doit.isra.16+0x113/0x150
    [    6.718834]  genl_family_rcv_msg+0xb7/0x170
    [    6.718836]  ? rings_prepare_data+0x80/0x80
    [    6.718839]  genl_rcv_msg+0x47/0xa0
    [    6.718842]  ? __audit_sockaddr+0x5d/0x80
    [    6.718847]  ? genl_family_rcv_msg+0x170/0x170
    [    6.718849]  netlink_rcv_skb+0x54/0x110
    [    6.718853]  genl_rcv+0x24/0x40
    [    6.718855]  netlink_unicast+0x19a/0x230
    [    6.718858]  netlink_sendmsg+0x204/0x3d0
    [    6.718861]  __sock_sendmsg+0x50/0x60
    [    6.718865]  __sys_sendto+0xee/0x160
    [    6.718869]  ? syscall_trace_enter+0x1ff/0x2d0
    [    6.718877]  ? audit_reset_context.part.16+0x26a/0x2d0
    [    6.718880]  __x64_sys_sendto+0x24/0x30
    [    6.718882]  do_syscall_64+0x5b/0x1a0
    [    6.718885]  entry_SYSCALL_64_after_hwframe+0x66/0xcb
    [    6.718888] RIP: 0033:0x7f6c74aa1e2b
    [    6.718890] Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 35 69 29 00 41 89 ca 8b 00 85 c0 75 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 41 57 4d 89 c7 41 56 41 89
    [    6.718892] RSP: 002b:00007ffe4ed9f4d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [    6.718894] RAX: ffffffffffffffda RBX: 0000560c6f283ed0 RCX: 00007f6c74aa1e2b
    [    6.718895] RDX: 000000000000002c RSI: 0000560c6f283fe0 RDI: 0000000000000003
    [    6.718896] RBP: 00007ffe4ed9f570 R08: 00007f6c74d3c480 R09: 000000000000000c
    [    6.718897] R10: 0000000000000000 R11: 0000000000000246 R12: 0000560c6f283f80
    [    6.718898] R13: 0000560c6f283f70 R14: 0000000000000000 R15: 00007ffe4ed9f748
    [    6.718899] ---[ end trace 3839bd194dd76792 ]---
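
The two log signatures above can be checked for mechanically. The following is a minimal sketch, not a supported tool: the function names scan_vmware_log and scan_dmesg are hypothetical, and the patterns are taken only from the excerpts in this article, so other kernel or Tools versions may word these lines differently.

```python
import re

# Patterns derived from the log excerpts in this article (assumption:
# other versions may word these messages differently).
GUESTRPC_TIMEOUT = re.compile(
    r"GuestRpcSendTimedOut|ToolsTimeSyncRpcCompletionRoutine: Timeout"
)
ALLOC_WARNING = re.compile(r"WARNING: .*__alloc_pages_nodemask")
VMXNET3_FRAME = re.compile(r"\bvmxnet3_\w+.*\[vmxnet3\]")


def scan_vmware_log(lines):
    """Return the vmware.log lines that report a guest RPC timeout."""
    return [ln.rstrip("\n") for ln in lines if GUESTRPC_TIMEOUT.search(ln)]


def scan_dmesg(lines):
    """Return True if a page-allocation warning contains a vmxnet3 stack frame."""
    in_warning = False
    for ln in lines:
        if ALLOC_WARNING.search(ln):
            in_warning = True          # start of the warning block
        elif in_warning and "end trace" in ln:
            in_warning = False         # end of the warning block
        elif in_warning and VMXNET3_FRAME.search(ln):
            return True                # vmxnet3 function inside the trace
    return False
```

To use it, pass the lines of a saved vmware.log to scan_vmware_log and the lines of saved dmesg output to scan_dmesg. A match in both suggests the VM shows the symptoms described here, but it does not confirm the cause.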

Cause

The RHEL guest kernel fails to allocate memory while the VMXNET3 driver is creating its queues. The kernel warning and call trace show the failure in __alloc_pages_nodemask, reached from vmxnet3_rq_create by way of vmxnet3_set_ringparam. The request comes from ethtool (ethnl_set_rings in the trace), which is reconfiguring the ring parameters of the VMXNET3 network adapter. Because the allocation fails, the driver cannot properly initialize the VMXNET3 queues (the trace shows the receive queue path, vmxnet3_rq_create).
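
Because the trace passes through the ring-parameter path, it can help to record the adapter's current and maximum ring sizes before contacting support. The sketch below parses the text printed by `ethtool -g <interface>`. The helper name parse_ring_params is hypothetical, and the "Pre-set maximums:" / "Current hardware settings:" layout is an assumption about the ethtool version installed in the guest. This article does not state which ring sizes trigger the failure, so the values are for inspection only.

```python
def parse_ring_params(text):
    """Parse `ethtool -g` output into {"max": {...}, "current": {...}}.

    Assumes the usual two-section layout; non-numeric values such as
    "n/a" are skipped.
    """
    sections = {
        "Pre-set maximums:": "max",
        "Current hardware settings:": "current",
    }
    result = {"max": {}, "current": {}}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line in sections:
            section = sections[line]   # switch to the matching section
            continue
        if section is None or ":" not in line:
            continue                   # header line or blank line
        name, _, value = line.partition(":")
        value = value.strip()
        if value.isdigit():
            result[section][name.strip()] = int(value)
    return result
```

The input text could be captured inside the guest with `subprocess.run(["ethtool", "-g", "eth0"], capture_output=True, text=True).stdout`, where eth0 is the interface name from the dmesg excerpt.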

Resolution

As a temporary workaround, reboot the virtual machine. A reboot may reset the guest kernel's memory state and allow the VMXNET3 driver to initialize successfully.

Because the guest operating system provides both the memory allocation mechanism and VMware Tools in RHEL environments, VMware advises contacting Red Hat Enterprise Linux (RHEL) support to investigate the issue further and implement a long-term solution.