Bare-Metal Edge node reporting 'mempool exhausted' alarm for 'mbuf_pool_socket_0' but there is no dataplane impact for traffic flowing through the edge node

Article ID: 379555

Updated On:

Products

VMware NSX

Issue/Introduction

The Alarm Dashboard reports "mempool exhausted" alarms for "mbuf_pool_socket_0" on Bare Metal Edge nodes.

Example from /var/log/syslog on the Edge node:

2024-09-20T02:45:12.419Z edge-01.corp.local NSX 5307 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="stats" level="INFO"] mempool exhausted, usage: 87, threshold: 85, pool: mbuf_pool_socket_0

Checking the utilization of mbuf_pool_socket_0 shows high consumption of this mempool:

edge01> get dataplane memory stats

----

Available_entries : 133356 >> approx. 13.65% of 'Size', which means about 86.35% is in use
Available_entries_in_cache : 2830
Cache_size_per_core : 128
Name : mbuf_pool_socket_0
Per_core_cache

-------Output truncated--------

Size : 977146 >> total size

----
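
The usage figure can be reproduced from the two counters above. The following is a minimal Python sketch, not NSX code: the function name `mempool_used_percent` is hypothetical, and the threshold of 85 is taken from the sample syslog message. The percentage reported by the alarm itself may be derived slightly differently (the sample log shows 87), so treat the result as an approximation.

```python
def mempool_used_percent(size: int, available_entries: int) -> float:
    """Return the share of the mempool that is in use, as a percentage.

    Assumption: 'used' is simply Size minus Available_entries; the
    datapath's own accounting may differ.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return (size - available_entries) / size * 100.0


# Values copied from the 'get dataplane memory stats' output above.
used = mempool_used_percent(size=977146, available_entries=133356)
THRESHOLD = 85  # threshold shown in the sample syslog message
print(f"used: {used:.2f}%  above threshold: {used > THRESHOLD}")
```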

A history of the number of available entries for "mbuf_pool_socket_0" can be retrieved from syslog. This value (the same as Available_entries in the output above) remains low compared with the total size of the mempool (Size in the output above), which indicates high consumption. However, the number of available entries varies little over time. Such small increases and decreases reflect normal allocation and deallocation, so there is no indication of a memory leak.

Example from /var/log/syslog on the Edge node:

$ grep Mempool syslog* | awk '{print $1, $14}'
syslog.1:2024-09-20T02:41:39.056Z 133360
syslog.11:2024-09-20T01:31:39.056Z 133358
syslog.12:2024-09-20T01:21:39.047Z 133357
syslog.14:2024-09-20T01:11:39.050Z 133359
syslog.15:2024-09-20T01:01:39.050Z 133358
syslog.17:2024-09-20T00:51:39.056Z 133356
syslog.18:2024-09-20T00:41:39.056Z 133356
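
To make the "varies little over time" observation concrete, the sampled values can be summarised by their minimum, maximum, and spread. The following is a minimal Python sketch, assuming input lines in the `file:timestamp value` form printed by the command above; the helper name `available_entries_spread` is hypothetical.

```python
def available_entries_spread(lines):
    """Return (min, max, spread) of the last whitespace-separated field.

    Assumption: each non-empty line ends with an integer count of
    available entries, as in the grep/awk output above.
    """
    values = [int(line.split()[-1]) for line in lines if line.strip()]
    if not values:
        raise ValueError("no samples found")
    return min(values), max(values), max(values) - min(values)


# Sample lines copied from the syslog history above.
samples = """\
syslog.1:2024-09-20T02:41:39.056Z 133360
syslog.11:2024-09-20T01:31:39.056Z 133358
syslog.12:2024-09-20T01:21:39.047Z 133357
syslog.14:2024-09-20T01:11:39.050Z 133359
syslog.15:2024-09-20T01:01:39.050Z 133358
syslog.17:2024-09-20T00:51:39.056Z 133356
syslog.18:2024-09-20T00:41:39.056Z 133356
""".splitlines()

low, high, spread = available_entries_spread(samples)
print(f"min={low} max={high} spread={spread}")
```

A spread of only a few entries over two hours, as in this sample, is consistent with normal allocation and deallocation. A steadily falling value would instead point towards a leak.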

The Bare-Metal Edge node may have been configured with Rx and Tx ring sizes of 4096 for the 'dataplane' service.

edge01> get dataplane | find Rx_ring_size
Rx_ring_size : 4096

edge01> get dataplane | find Tx_ring_size
Tx_ring_size : 4096

Environment

VMware NSX

Cause

On Large Bare Metal Edge nodes, the mbufs held in the Tx and Rx queues account for the majority of all mbufs allocated from the pool. If the buffers in the Tx and Rx queues are not freed before the space in the Tx and Rx rings is needed, this may lead to memory usage alerts.

Resolution

This issue is addressed in NSX 4.2.0 and later, where the size of the mbuf pool has been increased for Bare Metal Edge nodes to avoid this alarm.
