Edge datapath memory usage alerts on NSX Edge node when L7 Access Profile is configured on T1 Gateway Firewall

Article ID: 319045

Products

VMware NSX

Issue/Introduction

  • The NSX UI displays the alarms "Edge datapath memory usage is over 85%" and "Edge datapath memory usage is 100%" for an Edge node.
  • This issue occurs with the following configuration:
    • URL Database for the Edge cluster is Enabled under Security -> General Settings -> URL Database
    • A new policy (rule section) is created under Security -> Gateway Firewall -> Gateway Specific Rules -> Select Tier-1 GW
      • An 'Allow' rule is added in this new policy with the default L7 Access Profile
      • The new policy is above the default policy
  • Running the command edge-appctl -t /var/run/vmware/edge/dpd.ctl mempool/show | grep -B 5 pffqdnsyncpl as the root user on the affected Edge node shows that available_entries is 0 and that in_use_count for pffqdnsyncpl is not decreasing:

    "available_entries": 0,
    "available_entries_in_cache": 0,
    "cache_size_per_core": 512,
    "description": "Stateful Service FQDN SYNC Pool",
    "in_use_count": 524288,
    "name": "pffqdnsyncpl",
    "size": 524288,
    "socket_id": 0},

  • You see messages similar to the following in /var/log/syslog on the Edge node:

    20##-##-##T##:##:##.##xZ <Edge hostname> b174b10aa4d5 4646 - -  2023-07-11T17:14:00Z datapathd 5891 stats tname="stats23" [ERROR] {"event_state":100,"event_external_reason":"Edge datapath memory usage is over 85%","event_src_comp_id":"<UUID>","event_sources":{"pool_name":"pffqdnsyncpl"}}  eventId="vmwNSXEdgeMpoolUsage"
    20##-##-##T##:##:##.##xZ <Edge hostname> datapath-systemd-helper 5782 - -  2023-07-11T17:14:00Z datapathd 5891 stats tname="stats23" [ERROR] {"event_state":100,"event_external_reason":"Edge datapath memory usage is over 85%","event_src_comp_id":"<UUID>","event_sources":{"pool_name":"pffqdnsyncpl"}}  eventId="vmwNSXEdgeMpoolUsage"
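The two symptoms above can be checked with a small script. The following is a minimal sketch, not a VMware-provided tool: the function names pool_usage_percent and pool_name_from_syslog are hypothetical. It assumes the mempool entry has already been loaded into a Python dict with the field names shown in the output above, and that the syslog line carries a single JSON payload between its first "{" and last "}", as in the sample messages.

```python
import json


def pool_usage_percent(entry):
    """Return pool usage as a percentage of its total size.

    `entry` is a dict holding the fields shown in the mempool/show
    output above ("in_use_count" and "size").
    """
    return 100.0 * entry["in_use_count"] / entry["size"]


def pool_name_from_syslog(line):
    """Return the mempool name from a vmwNSXEdgeMpoolUsage syslog line.

    Assumes the JSON payload spans the first "{" to the last "}" on
    the line. Returns None if no parsable payload or pool name is found.
    """
    start = line.find("{")
    end = line.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        payload = json.loads(line[start:end + 1])
    except json.JSONDecodeError:
        return None
    return payload.get("event_sources", {}).get("pool_name")


if __name__ == "__main__":
    # Values copied from the mempool/show output in this article.
    entry = {"available_entries": 0, "in_use_count": 524288, "size": 524288}
    print(pool_usage_percent(entry))

    # Shortened form of the sample syslog message in this article.
    sample = (
        'datapathd 5891 stats tname="stats23" [ERROR] '
        '{"event_state":100,"event_external_reason":'
        '"Edge datapath memory usage is over 85%",'
        '"event_src_comp_id":"<UUID>",'
        '"event_sources":{"pool_name":"pffqdnsyncpl"}}  '
        'eventId="vmwNSXEdgeMpoolUsage"'
    )
    print(pool_name_from_syslog(sample))
```

A pool name of pffqdnsyncpl together with usage that stays at or near 100% matches the symptoms described in this article.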

Cause

When the Active Edge node receives a DNS packet, it processes the DNS response record and syncs the FQDN IP key and domain to the Standby Edge node. The memory that pffqdnsyncpl occupies for this sync is not cleared, so the mempool is eventually exhausted.
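The effect of entries that are never returned to a fixed-size pool can be illustrated with a toy model. This is only a sketch of the general failure mode under stated assumptions. It does not reflect the actual datapathd implementation, and the names FixedPool and simulate are hypothetical.

```python
class FixedPool:
    """Toy fixed-size pool: an entry must be released before it can be reused."""

    def __init__(self, size):
        self.size = size
        self.in_use = 0

    def alloc(self):
        """Take one entry; return False if the pool is exhausted."""
        if self.in_use >= self.size:
            return False
        self.in_use += 1
        return True

    def release(self):
        """Return one entry to the pool."""
        if self.in_use > 0:
            self.in_use -= 1


def simulate(pool_size, dns_responses, release_after_sync):
    """Allocate one entry per DNS response and optionally release it.

    Returns (entries still in use, number of failed allocations).
    """
    pool = FixedPool(pool_size)
    failed = 0
    for _ in range(dns_responses):
        if not pool.alloc():
            failed += 1
            continue
        if release_after_sync:
            pool.release()
    return pool.in_use, failed


if __name__ == "__main__":
    # Entries are never released: the pool fills and later allocations fail.
    print(simulate(pool_size=8, dns_responses=20, release_after_sync=False))
    # Entries are released after each sync: the pool never fills.
    print(simulate(pool_size=8, dns_responses=20, release_after_sync=True))
```

In the first case the in-use count rises to the pool size and stays there, which corresponds to the in_use_count equal to size seen in the mempool output.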

Resolution

This issue is resolved in VMware NSX 4.1.1 and VMware NSX 4.2.0.

Workaround:

  1. If possible, disable the layer 7 Access Profile rule(s) in the T1 Gateway Firewall.

    OR

  2. As the admin user on the affected Edge node, restart the datapathd process with the following command, which clears the memory pool:

    edge> restart service dataplane

    Note: When restarting the dataplane service, no traffic will be processed by the Edge node for a short time. Either move services to another Edge node before restarting the service, or plan this activity for a maintenance window if you wish to avoid all dataplane impact. 

Additional Information