ESXi host fails with PSOD and "PANIC bora/vmkernel/main/dlmalloc.c:4937 - Usage error in dlmalloc"

Article ID: 323555

Updated On:

Products

VMware NSX

Issue/Introduction

  • The NSX version is 4.0.x, or the NSX-T version is lower than 3.2.3.
  • The ESXi host may fail with a Purple Screen of Death (PSOD) and "PANIC bora/vmkernel/main/dlmalloc.c:4937 - Usage error in dlmalloc".
  • There are more than 64 Edge Transport Nodes running on the ESXi host.
  • Backtrace is similar to:
    PANIC bora/vmkernel/main/dlmalloc.c:4937 - Usage error in dlmalloc
    PanicvPanicInt@vmkernel#nover+0x327 stack: 0x453a4f89b9b8, 0x0, 0x42002d0fefe3, 0x431242001300, 0x453a4f89b8e0
    Panic_NoSave@vmkernel#nover+0x4d stack: 0x453a4f89ba10, 0x453a4f89b9d0, 0x208, 0x42002d6a9053, 0x1349
    DLM_free@vmkernel#nover+0x22d stack: 0x431242001310, 0x42002d143afa, 0x3105ded66af, 0x453a4f89baf0, 0x3105ded66af
    Heap_Free@vmkernel#nover+0xba stack: 0x3105ded66af, 0x3b, 0x0, 0x3105ded66af, 0x431242001220
    [email protected]#1.1.7.0.21487563+0x124 stack: 0x10, 0x0, 0x10, 0x0, 0x0
    VMKAPICharDevDevfsWrapIoctl@vmkernel#nover+0x87 stack: 0x5c, 0x42002ed819f8, 0x42002d116991, 0x0, 0x2
    CharDriverIoctl@vmkernel#nover+0x7d stack: 0x430e6320e6a0, 0x430e63219920, 0x430c934f0780, 0x430c934f0780, 0x453a4f89be23
    DevFSIoctl@vmkernel#nover+0xad3 stack: 0x43120661bd50, 0x430a95414c50, 0x2, 0x0, 0x43110000003b
    FSSVec_Ioctl@vmkernel#nover+0x20 stack: 0x9, 0x42002d4b9105, 0x100, 0x400, 0x3
    FSSObjectIoctlCommon@vmkernel#nover+0x60 stack: 0x100, 0x400, 0x3, 0x100, 0xa
    FSS_IoctlByFH@vmkernel#nover+0x9f stack: 0x0, 0x3105ded66af, 0x3b, 0x3105ded66af, 0x0
    UserFile_PassthroughIoctl@vmkernel#nover+0x3f stack: 0x420053c00000, 0x42002d4d7f67, 0x433478409280, 0x433478409280, 0x453a4f89f140
    UserVmfs_Ioctl@vmkernel#nover+0x27 stack: 0x453a4f89f140, 0x0, 0x453a4f89bf40, 0xe, 0x453a4f89f000
    LinuxFileDesc_Ioctl@vmkernel#nover+0x51 stack: 0x453a4f89bf40, 0x10, 0x1, 0x42002d4b4864, 0xffffffffffffffef
    User_LinuxSyscallHandler@vmkernel#nover+0x1a4 stack: 0x0, 0x0, 0x0, 0x42002d14e068, 0x10b
    gate_entry@vmkernel#nover+0x68 stack: 0x0, 0x10, 0x416a86e037, 0x3105ded66af, 0x3

Environment

VMware NSX
VMware NSX-T Data Center

Cause

This issue is triggered when the net-vdl2 command is run on the ESXi host, which usually happens during collection of a diagnostic log bundle.

Resolution

This issue is resolved in VMware NSX 4.1 and NSX-T Data Center 3.2.3, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
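
The affected version ranges listed in this article can be checked with a short script when auditing several environments. The following is a minimal sketch in Python, assuming the version is available as a dotted numeric string such as "3.2.2.1". The function names are hypothetical and are not part of any NSX tool or API.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.2.2.1' into a tuple of ints.

    Versions with fewer than three components are padded with zeros,
    so '3.2' is treated as 3.2.0 (an assumption made for this sketch).
    """
    parts = tuple(int(part) for part in text.strip().split("."))
    return parts + (0,) * (3 - len(parts))


def is_affected_version(text):
    """Return True if the version is in a range this article lists as affected.

    Affected: NSX 4.0.x, or NSX-T lower than 3.2.3.
    Not affected: NSX 4.1 and later, NSX-T 3.2.3 and later.
    """
    version = parse_version(text)
    if version[0] >= 4:
        # Within the 4.x line, only 4.0.x is listed as affected.
        return version[:2] == (4, 0)
    return version < (3, 2, 3)
```

For example, `is_affected_version("4.0.1.1")` and `is_affected_version("3.2.2.1")` return True, while `is_affected_version("4.1.0")` and `is_affected_version("3.2.3")` return False. A version check alone is not sufficient: the failure described here also requires more than 64 Edge Transport Nodes on the host.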


Workaround
  • Ensure that there are never more than 64 Edge Transport Nodes running on the ESXi host at any time.
  • More than 64 Edge Transport Nodes per ESXi host can cause CPU, memory, and storage contention, which leads to performance issues on the ESXi host and on the Edge VMs running on it.
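
To monitor the limit above, the Edge Transport Node placement can be tallied per host. The following is a minimal sketch in Python, assuming you already have a mapping of Edge node name to the ESXi host it runs on (for example, exported from your inventory). The mapping, the host names, and the function name are hypothetical; how the inventory is collected is outside the scope of this sketch.

```python
from collections import Counter

# Limit stated in this article: no more than 64 Edge Transport Nodes per host.
MAX_EDGE_NODES_PER_HOST = 64


def hosts_over_limit(edge_to_host, limit=MAX_EDGE_NODES_PER_HOST):
    """Return {host: edge_count} for every host running more than `limit` Edge nodes.

    edge_to_host: dict mapping an Edge node name to its ESXi host name
    (hypothetical input format for this sketch).
    """
    counts = Counter(edge_to_host.values())
    return {host: count for host, count in counts.items() if count > limit}


if __name__ == "__main__":
    # Illustrative data only: 65 Edge nodes on one host, 64 on another.
    inventory = {f"edge-a-{i}": "esx-a.example.com" for i in range(65)}
    inventory.update({f"edge-b-{i}": "esx-b.example.com" for i in range(64)})
    for host, count in hosts_over_limit(inventory).items():
        print(f"{host}: {count} Edge Transport Nodes (limit {MAX_EDGE_NODES_PER_HOST})")
```

A host with exactly 64 Edge Transport Nodes is not flagged, because the condition in this article is "more than 64".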

Additional Information

Impact/Risks:

Downtime on the ESXi host and failover of its VMs (provided that HA is enabled and allowed to restart the VMs on another host in the cluster).