DFW Memory usage is very high

Article ID: 395492

Products

VMware NSX, VMware vDefend Firewall, VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

  • Users observe an alarm in the NSX Manager UI indicating "DFW Memory usage is very high" for one or more ESXi hosts.
  • The following entry is logged in /var/log/syslog on the NSX Manager:
2025-xx-xxTxx:xx:xx.xxxZ <nsx-manager>.<your domain> NSX 75699 MONITORING [nsx@6876 alarmId="#########-####-####-####-############" alarmState="OPEN" comp="nsx-manager" entId="321f0000-0000-0000-0000-000000000000" errorCode="MP701099" eventFeatureName="distributed_firewall" eventSev="CRITICAL" eventState="On" eventType="dfw_memory_usage_very_high" level="FATAL" nodeId="#########-####-####-####-############" subcomp="monitoring"] The DFW Memory usage vsip-state on Transport node #########-####-####-####-############ has reached 9x% which is at or above the very high threshold value of 75%.
  • In the ESXi log bundle, commands/vsipioctl_info.sh.txt shows significant allocation failures under "/bin/vsipioctl getmeminfo", reported by the "numFail" counter. Note that "inUse" in the example below is close to 2 million states, but this may not always be the case if the offending VMs have since migrated off the ESXi host.
     /bin/vsipioctl getmeminfo
     Heap: vsip-module, max 2560 MB
     <snip>
     
     Heap: vsip-state, max 512 MB
         zone 2: pfstatepl maxObj = 2000000, objSize = 624, alloc = 238361714, free = 236362756, inUse = 1998958, numFail = 15681698, totalMem = 1247349792 <<<<<<<<<<<<<<<<<<<<<<<
       <Snip>
  • In the same file, the output of "/bin/vsipioctl getfilterstat -f <nic-XXXXX-ethX-vmware-sfw.2>" for one or more VMs shows a large number of packet drops with the drop reason "memory". For example:
       /bin/vsipioctl getfilterstat -f nic-xxxxx-eth0-vmware-sfw.2
       PACKETS IN OUT
       ------- -- ---
       <snip>
       
       BYTES IN OUT
       ----- -- ---
         <snip>

       DROP REASON
       -----------
         memory: 185####       <<<<< Packet drops with the drop reason as memory.

    <snip>
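
The memory drop counters shown above can be listed for every filter in a saved vsipioctl_info.sh.txt file at once. The following is a minimal sketch, assuming each filter section starts with a "/bin/vsipioctl getfilterstat -f <filter>" line and contains a "memory: <count>" line as in the excerpt. The function name memory_drops is hypothetical and is not part of any VMware tool.

```shell
# Hypothetical helper: print "<filter><TAB><memory drops>" for every filter
# whose "memory" drop counter is non-zero. Reads the saved file on stdin.
memory_drops() {
  awk '
    # Remember which filter the statistics that follow belong to.
    /vsipioctl getfilterstat -f / { filter = $NF }
    # Within a filter section, report a non-zero "memory:" counter.
    filter != "" && $1 == "memory:" && ($2 + 0) > 0 { print filter "\t" $2 }
  '
}
```

Example usage against an extracted log bundle: memory_drops < commands/vsipioctl_info.sh.txt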

Cause

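The pfstatepl pool tracks the state of active firewall connections (session entries) in the Distributed Firewall (DFW) on the ESXi host. An ESXi host supports a maximum of 2 million concurrent connections, which corresponds to the pool's maxObj value of 2000000. Once the pool is full, new state allocations fail (numFail increases) and the affected packets are dropped with the drop reason "memory".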
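
To see how close a host is to this limit, the pfstatepl counters can be read from saved "vsipioctl getmeminfo" output. The following is a rough sketch, assuming the zone line uses the same "key = value" layout as the example in this article. The function name pfstate_usage is hypothetical.

```shell
# Hypothetical helper: report pfstatepl pool utilization from saved
# "vsipioctl getmeminfo" output read on stdin.
pfstate_usage() {
  awk '
    /pfstatepl/ {
      # Turn "key = value," pairs into single key=value tokens.
      gsub(/ = /, "=")
      gsub(/,/, "")
      for (i = 1; i <= NF; i++) {
        n = split($i, kv, "=")
        if (n == 2) val[kv[1]] = kv[2]
      }
      # Utilization is the share of maxObj currently in use.
      if ((val["maxObj"] + 0) > 0)
        printf "inUse=%d maxObj=%d used=%.1f%% numFail=%d\n",
               val["inUse"], val["maxObj"],
               100 * val["inUse"] / val["maxObj"], val["numFail"]
    }
  '
}
```

With the example line from this article (inUse = 1998958, maxObj = 2000000), the pool is about 99.9% full. A non-zero numFail alongside a lower inUse value suggests the pool was exhausted earlier, for instance before the offending VMs migrated away.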

Resolution

The root cause is specific to each environment. Port-scanning VMs or compromised VMs, for example, may each hold hundreds of thousands of connections. Determine why those VMs are generating a high number of connections, then choose a mitigation strategy such as blocking the traffic, enabling flood protection, or excluding the VMs from DFW.

Refer to the Resolution section of https://knowledge.broadcom.com/external/article/319109 for other possible workarounds.

Additional Information

    How to identify a VM that had a large number of connections:
        1. Run the command "currDate=$(date +%Y-%m-%d_%H-%M-%S); /usr/lib/vmware/vm-support/bin/vsipioctl_info.sh >> /var/run/log/$currDate.vsipioctl_info.txt;" a few times on the affected ESXi host to collect several samples of the vsipioctl_info.txt file. (Note: ignore any errors that are printed.)

Example:
[root@ESXi:~] currDate=$(date +%Y-%m-%d_%H-%M-%S); /usr/lib/vmware/vm-support/bin/vsipioctl_info.sh >> /var/run/log/$currDate.vsipioctl_info.txt;
                ERROR: could not read port number PortNum      <<< Ignore the error/errors
                
            
[root@ESXi:~] ls -lth /var/run/log/*vsipioctl_info.txt
-rw-r--r--    1 root     root      216.0K May  2 15:07 /var/run/log/2025-05-02_15-06-57.vsipioctl_info.txt
-rw-r--r--    1 root     root      210.6K May  2 15:06 /var/run/log/2025-05-02_15-05-56.vsipioctl_info.txt

        2. Identify the network adapters that had a large number of concurrent connections, using a command similar to the one below.
In this example, the first two network adapters stand out, each with a "High Water Mark Conn" value above 500 thousand.

Example:
[root@ESXi:~] less /var/run/log/2025-05-02_15-06-57.vsipioctl_info.txt | grep -E "Filter Name|High Water Mark Conn" | grep High -B 1 | awk '{if (NR % 2 == 1) {sub(/^.*: /, "", $0); filter = $0} else {sub(/^.*: /, "", $0); highwater = $0; print filter "\t" highwater}}' | sort -nrk 2 | head -n 5

              nic-#####1-eth0-vmware-sfw.2   569382          <<<<<
              nic-######-eth0-vmware-sfw.2   563140          <<<<
              nic-######-eth0-vmware-sfw.2   8116
              nic-######-eth0-vmware-sfw.2   7143
              nic-######-eth0-vmware-sfw.2   6332

                
        3. Identify the VM name associated with the network adapter by searching the summarize-dvfilter output for its filter name (nic-#######-eth0-vmware-sfw.2).
Example:
[root@ESXi:~] summarize-dvfilter | grep nic-#####1-eth0-vmware-sfw.2 -B 2
                 port 67108884 Test-VM-A                  <<<< VM name will be displayed here
                  vNic slot 2
                   name: nic-#####1-eth0-vmware-sfw.2
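
When several adapters need to be checked, step 3 can be done for all filters in one pass instead of one grep per adapter. The following is a minimal sketch, assuming the summarize-dvfilter output has the layout shown in the example above: a "port <id> <name>" line, followed by "name: <filter>" lines for the filters on that port. The function name filter_to_vm is hypothetical.

```shell
# Hypothetical helper: print "<filter><TAB><VM name>" for every filter found
# in summarize-dvfilter output read on stdin.
filter_to_vm() {
  awk '
    # A "port <id> <name>" line introduces the vNIC that owns the filters below it.
    $1 == "port" {
      vm = $0
      sub(/^[ \t]*port[ \t]+[0-9]+[ \t]+/, "", vm)   # drop "port <id>"
      sub(/[ \t]+$/, "", vm)                          # drop trailing spaces
    }
    # A "name: <filter>" line names one filter attached to that port.
    $1 == "name:" { print $2 "\t" vm }
  '
}
```

Example usage on the ESXi host: summarize-dvfilter | filter_to_vm | grep vmware-sfw.2
The resulting list can then be compared with the filter names returned in step 2.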