NSX Manager may intermittently raise an "Edge Memory Usage Very High" alarm with details similar to "The memory usage on Edge node <UUID> has reached 93% which is at or above the very high threshold value of 90%".
This alarm may be accompanied by a second alarm, "Management Channel on <NSX-Manager-Node> to Transport Node <NSX-Edge-Node> (NSX-Edge-IP) is down for 5 minutes", and the affected NSX Edge may have failed over its active role to the standby Edge node when deployed in an active/standby HA configuration.
NSX IDPS has rules that inspect SMB traffic, either explicitly or implicitly.
During the alarm period, there is a high throughput of SMB traffic.
VMware NSX 4.2.x
VMware VCF 9.0
A high volume of SMB traffic under inspection causes IDPS to consume a large amount of memory.
To resolve this issue, open a Broadcom Support Request and upload the following required data/logs:
Start Heap Profiling:
This command initiates the heap profiling, using /var/log/dp_heap as the base name for the output files.
# edge-appctl -t /var/run/vmware/edge/dpd.ctl heap_profile/start /var/log/dp_heap
Take Initial Snapshot (Mark Point 1):
This command captures the current heap state at the beginning of our observation period.
# edge-appctl -t /var/run/vmware/edge/dpd.ctl heap_profile/dump test0
Allow Collection to Run (Wait Period):
Allow the heap profiling to run in the background for several hours, ideally overnight. During this time, no further commands related to heap profiling should be executed, and the dpd process should remain running.
Important: If the dpd process restarts or an Out-Of-Memory (OOM) kill occurs during this collection period, you will need to restart the entire collection process from Step 2.
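Before taking the final snapshot, the OOM condition described above can be checked from the kernel log. The following is a minimal sketch, assuming kernel messages are readable with dmesg on the Edge node. The name oom_kill_seen is a hypothetical helper, and the matched phrases are the usual Linux kernel OOM-killer messages rather than NSX-specific output. It does not detect a dpd restart that happened for other reasons.

```shell
#!/bin/sh
# Sketch only: oom_kill_seen is a hypothetical helper name, not an NSX command.
# Reads log text on stdin and succeeds (exit 0) if it contains a Linux
# OOM-killer message such as "Out of memory: Killed process ...".
oom_kill_seen() {
    grep -Eiq 'out of memory|oom-killer'
}

# Example usage on the Edge node (assumes dmesg is available to the root user):
#   if dmesg | oom_kill_seen; then
#       echo "OOM kill found in kernel log; restart the collection"
#   fi
```

A match only indicates that some process was OOM-killed since boot, so the matching lines should still be reviewed to confirm the time and the affected process.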
Take Final Snapshot (Mark Point 2):
After the desired collection period, take a second snapshot of the heap to capture its state at the end of the observation.
# edge-appctl -t /var/run/vmware/edge/dpd.ctl heap_profile/dump test1
Stop Heap Profiling:
This command gracefully stops the heap profiling process.
# edge-appctl -t /var/run/vmware/edge/dpd.ctl heap_profile/stop
Verify Profiling is Stopped:
Confirm that the profiling has successfully stopped by checking its state. The output should show "state": "stopped".
# edge-appctl -t /var/run/vmware/edge/dpd.ctl heap_profile/state
Expected output:
{ "state": "stopped"}
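The commands in the steps above differ only in the heap_profile sub-command and its argument, so they can be wrapped in a small helper to reduce typing errors. This is a sketch under stated assumptions: heap_profile and profiling_stopped are hypothetical function names, and EDGE_APPCTL is an assumed override variable that allows a dry run (for example with echo) without touching the datapath. Only the edge-appctl invocations shown in this article are used.

```shell
#!/bin/sh
# Sketch only: helper names and the EDGE_APPCTL override are assumptions.
DPD_CTL=/var/run/vmware/edge/dpd.ctl

# Run one heap_profile sub-command (start, dump, stop, state),
# passing the optional second argument only when it is given.
heap_profile() {
    "${EDGE_APPCTL:-edge-appctl}" -t "$DPD_CTL" "heap_profile/$1" ${2:+"$2"}
}

# Succeed (exit 0) only if the text on stdin reports "state": "stopped".
profiling_stopped() {
    grep -Eq '"state"[[:space:]]*:[[:space:]]*"stopped"'
}

# Example sequence, mirroring the steps above:
#   heap_profile start /var/log/dp_heap
#   heap_profile dump test0
#   ... wait several hours ...
#   heap_profile dump test1
#   heap_profile stop
#   heap_profile state | profiling_stopped && echo "profiling stopped"
```

Setting EDGE_APPCTL=echo before calling heap_profile prints the arguments that would be passed instead of running the command, which is useful for reviewing the sequence first.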
Locating the Data Files:
After completing these steps, you should find two heap profile files in the /var/log/ directory:
/var/log/dp_heap.0001.heap
/var/log/dp_heap.0002.heap
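Before uploading, it may help to confirm that both files listed above are present. A minimal sketch, assuming the file names shown above; heap_files_present is a hypothetical helper name, and its optional directory argument exists only so the check can be tried against another location.

```shell
#!/bin/sh
# Sketch only: heap_files_present is a hypothetical helper name.
# Checks that both expected heap profile files exist in the given
# directory (default: /var/log) and reports the first one missing.
heap_files_present() {
    dir=${1:-/var/log}
    for f in dp_heap.0001.heap dp_heap.0002.heap; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    echo "both heap profile files found in $dir"
}

# Example usage on the Edge node:
#   heap_files_present && ls -l /var/log/dp_heap.*.heap
```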