malloc_heap memory exhausted alarms observed on small form factor edge nodes


Article ID: 382699


Updated On:

Products

VMware NSX

Issue/Introduction

  • On small form factor edge nodes running NSX-T 4.1 or later, datapath memory usage may become high because malloc_heap memory is exhausted after the Tx/Rx ring size is increased.
  • The following logs are observed in syslog, along with an alarm in the NSX Manager UI:

    2024-11-11T17:31:42.589Z edge-1 NSX 3525209 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="stats" tname="stats13" level="ERROR" eventId="vmwNSXEdgeMpoolUsage"] {"event_state":96,"event_external_reason":"Edge datapath memory usage is over 85%","event_src_comp_id":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx","event_sources":{"pool_name":"malloc_heap"}}

    2024-11-11T16:08:22.005Z edge-1 NSX 3479158 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="stats" level="INFO"] malloc_heap memory exhausted, usage: 96, threshold: 85
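
When scanning a large syslog for these two messages, a small parser can help pull out the pool name and the reported values. The following is a minimal sketch in Python, assuming the lines look exactly like the two samples above. The function name `parse_mpool_line` and the returned dictionary keys are hypothetical, chosen for illustration only, and are not part of any NSX tooling.

```python
import json
import re

# Matches the plain-text INFO message, e.g.
# "malloc_heap memory exhausted, usage: 96, threshold: 85"
EXHAUSTED_RE = re.compile(
    r"(?P<pool>\w+) memory exhausted, "
    r"usage: (?P<usage>\d+), threshold: (?P<threshold>\d+)"
)


def parse_mpool_line(line):
    """Return a dict describing a memory-pool message, or None if no match.

    Assumption: the ERROR event carries a JSON payload directly after the
    closing "]" of the structured-data block, as in the sample above.
    """
    if 'eventId="vmwNSXEdgeMpoolUsage"' in line:
        start = line.find("] {")
        if start == -1:
            return None
        try:
            # Skip "] " so that parsing begins at the opening brace.
            payload = json.loads(line[start + 2:])
        except json.JSONDecodeError:
            return None
        return {
            "kind": "alarm",
            "pool": payload.get("event_sources", {}).get("pool_name"),
            "event_state": payload.get("event_state"),
            "reason": payload.get("event_external_reason"),
        }

    match = EXHAUSTED_RE.search(line)
    if match:
        return {
            "kind": "exhausted",
            "pool": match.group("pool"),
            "usage": int(match.group("usage")),
            "threshold": int(match.group("threshold")),
        }
    return None
```

Calling `parse_mpool_line` on each line of an exported syslog file and keeping the non-`None` results gives a quick list of affected pools. Lines in any other format are ignored rather than guessed at.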

Environment

VMware NSX

Cause

  • DPDK was upgraded and some new features were introduced in version 4.1, which increased memory usage.
  • A small form factor edge has 1 GB of heap available for the datapath, so if the TX/RX ring size is increased above the default value of 2048, malloc_heap usage can rise enough to trigger the alarm.

Resolution

This is expected behavior.

Workaround:

  • Change the TX/RX ring size back to the default of 2048 using the following commands:

    set dataplane ring-size rx 2048
    set dataplane ring-size tx 2048
    restart service dataplane
  • Disable NICs not participating in traffic forwarding.

NOTE: Restarting the dataplane service will result in traffic disruption.