NSX-T Edge Dataplane memory exhaustion due to IDPS or L7 (DNS profile) configured on the Gateway Firewall

Article ID: 318306

Products

VMware NSX
VMware vDefend Firewall

Issue/Introduction

  • You have configured an IDPS rule, an L7 (DNS profile) rule, or both on a Gateway Firewall.
  • The NSX-T UI displays the alarms "Edge DataPath memory usage is over 85%" and "Edge DataPath memory usage is 100%".
  • You may see errors similar to the following in the NSX Edge node log file /var/log/nsx-event.log:
    2023-02-12T17:12:13.281Z edge01 NSX 5229 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="stats" tname="stats22" level="ERROR" eventId="vmwNSXEdgeMpoolUsage"] {"event_state":100,"event_external_reason":"Edge datapath memory usage is over 85%","event_src_comp_id":"<src_id>","event_sources":{"pool_name":"pfdnsdnpl"}} 
  • You may see similar error entries in the NSX Edge node log file /var/log/syslog:
    2023-02-12T17:12:13.281Z edge01 datapath-systemd-helper 5121 - - 2023-02-12T17:12:13.281Z datapathd 5229 firewalldp [ERROR] pf_set_state_dns_transaction_id: fqdn, failed to malloc dns trans on empty list
    2023-02-12T17:14:13.281Z edge01 datapath-systemd-helper 5121 - - 2023-02-12T17:14:13.281Z datapathd 5229 firewalldp [ERROR] pf_set_state_dns_qname: fqdn failed to find transaction with id 0
  • If you run the command 'edge-appctl -t /var/run/vmware/edge/dpd.ctl mempool/show' as the root user on the Edge node, available_entries is 0 for the FQDN DNS pool and in_use_count does not decrease:
    root@edge01:/# edge-appctl -t /var/run/vmware/edge/dpd.ctl mempool/show
        "available_entries": 0,
        "available_entries_in_cache": 0,
        "cache_size_per_core": 512,
        "description": "Stateful Service FQDN DNS Pool",
        "elt_size": 104,
        "flags": 16,
        "header_size": 64,
        "in_use_count": 524288,
  • You may experience datapath issues with DNS traffic while this alarm is active.
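
The memory pool check described above can be scripted. The following Python sketch is only an illustration: it reads the field names shown in the excerpt (available_entries, in_use_count, description) from text holding one pool's fields. The layout of the full mempool/show output is not shown in this article, so the regular-expression approach and the helper names parse_pool_fields and is_exhausted are assumptions, not part of any NSX tooling.

```python
import re

# Field names come from the excerpt in this article. How the full
# mempool/show output is structured is an assumption, so the fields are
# matched line by line instead of parsing the text as a JSON document.
FIELD_RE = re.compile(
    r'"(available_entries|in_use_count|description)":\s*("([^"]*)"|\d+)'
)

SAMPLE = '''
    "available_entries": 0,
    "available_entries_in_cache": 0,
    "cache_size_per_core": 512,
    "description": "Stateful Service FQDN DNS Pool",
    "elt_size": 104,
    "flags": 16,
    "header_size": 64,
    "in_use_count": 524288,
'''


def parse_pool_fields(text):
    """Extract the three fields of interest from one pool's excerpt."""
    fields = {}
    for match in FIELD_RE.finditer(text):
        key, raw, quoted = match.group(1), match.group(2), match.group(3)
        # Quoted values stay strings; bare values are integers.
        fields[key] = quoted if quoted is not None else int(raw)
    return fields


def is_exhausted(fields):
    """True when the pool has no free entries and some entries are in use."""
    return fields.get("available_entries") == 0 and fields.get("in_use_count", 0) > 0


pool = parse_pool_fields(SAMPLE)
print(pool["description"], "exhausted:", is_exhausted(pool))
```

A single snapshot only shows that the pool is full. To confirm that in_use_count is not decreasing, capture the output twice, some time apart, and compare the two values.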

NOTE: The preceding log excerpts are only examples. Date, time and environmental variables may vary depending on your environment.
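
To find out which memory pool raised the alarm, the JSON payload at the end of a vmwNSXEdgeMpoolUsage event line can be extracted. The Python sketch below is a hedged example built only from the nsx-event.log excerpt above. It assumes that the payload always follows the closing "]" of the structured-data block, and the function name parse_mempool_event is hypothetical.

```python
import json

# Sample line copied from the nsx-event.log excerpt in this article.
SAMPLE_LINE = (
    '2023-02-12T17:12:13.281Z edge01 NSX 5229 FABRIC [nsx@6876 comp="nsx-edge" '
    'subcomp="datapathd" s2comp="stats" tname="stats22" level="ERROR" '
    'eventId="vmwNSXEdgeMpoolUsage"] {"event_state":100,'
    '"event_external_reason":"Edge datapath memory usage is over 85%",'
    '"event_src_comp_id":"<src_id>","event_sources":{"pool_name":"pfdnsdnpl"}} '
)


def parse_mempool_event(line):
    """Return the JSON payload of a mempool usage event line, or None."""
    if 'eventId="vmwNSXEdgeMpoolUsage"' not in line:
        return None
    # Assumption: the payload starts right after the "] " that closes the
    # structured-data block, as in the excerpt.
    _, sep, payload = line.partition("] {")
    if not sep:
        return None
    try:
        return json.loads("{" + payload.strip())
    except json.JSONDecodeError:
        return None


event = parse_mempool_event(SAMPLE_LINE)
if event is not None:
    print(event["event_sources"]["pool_name"], event["event_state"])
```

In the excerpt, the pool name is pfdnsdnpl, which ties the alarm to the FQDN DNS pool symptoms listed above.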

Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x
VMware vDefend Firewall

Cause

The issue is caused by a memory leak that can occur under any of the following conditions:
  • When both IDPS and L7 are enabled, packets are sent to both, which can result in an out-of-memory condition.
  • When IDPS is disabled and L7 is enabled, packets are sent to the dnsparser, which can result in an out-of-memory condition.
  • When IDPS is enabled and L7 is disabled, packets are sent to both, which can result in an out-of-memory condition.

Resolution

This issue is resolved in VMware NSX-T version 3.2.3, available at Broadcom Downloads.
If you have difficulty finding or downloading the software, review the Download Broadcom products and software KB.
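
When checking several Edge nodes, the version comparison can be scripted. The Python sketch below is a minimal illustration that compares the first three numeric components of a version string against 3.2.3. The helper name has_fix is hypothetical, and treating every version later than 3.2.3 as containing the fix is an assumption, because this article only names 3.2.3.

```python
# Fixed release named in this article.
FIXED_IN = (3, 2, 3)


def has_fix(version):
    """Compare major.minor.patch of a dotted version string with 3.2.3.

    Assumption: versions later than 3.2.3 also carry the fix.
    """
    parts = [int(p) for p in version.split(".")[:3]]
    # Pad short strings such as "3.2" to three components.
    parts += [0] * (3 - len(parts))
    return tuple(parts) >= FIXED_IN


for candidate in ("3.2.2", "3.2.3", "3.1.3.7"):
    print(candidate, has_fix(candidate))
```

Any trailing build components after the third field are ignored by this comparison.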

Workaround:
  • Disable the DNS layer 7 rule in the Gateway Firewall.
  • As the admin user on the NSX-T Edge node, restart the dataplane service (datapathd process) with the following command. This clears the memory pool:
    edge01> restart service dataplane
Note: While the dataplane service restarts, the Edge node does not process any traffic. To avoid data plane impact, either move services to another Edge node before restarting the service or perform the restart during a maintenance window.