NSX Edge Node CPU usage is high due to too many Firewall rules

Article ID: 374139

Products

VMware NSX

Issue/Introduction

The NSX Edge node shows more than 90% CPU utilization, and the /var/log/syslog file on the Edge node contains messages like the following:

2024-07-26T#########442Z ####-####-####-####-####-NSX 4869 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="edge_datapath_cpu_very_high" eventSev="critical" eventState="On"] The datapath CPU usage on Edge node ####-######-######-#### has reached 96.47% which is at or above the very high threshold for at least two minutes.

2024-07-26T#########163Z ####-####-####-####-#### NSX 4869 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="edge_datapath_cpu_very_high" eventSev="critical" eventState="Off"] The CPU usage on Edge node ####-######-######-#### has reached below the very high threshold.

In addition, the syslog file contains dp-ipc blocked messages:

2024-07-26T#########665Z #### NSX 9697 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="dp-si-purge5" level="WARN"] blocked 4000 ms waiting for dp-ipc43 to quiesce
2024-07-26T#########526Z #### NSX 9697 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 32000 ms waiting for dp-ipc43 to quiesce
2024-07-26T#########562Z #### NSX 9697 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="dp-si-purge5" level="WARN"] blocked 32000 ms waiting for dp-ipc43 to quiesce
2024-07-26T#########526Z #### NSX 9697 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 64000 ms waiting for dp-ipc43 to quiesce
2024-07-26T#########526Z #### NSX 9697 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 128000 ms waiting for dp-ipc43 to quiesce
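
The blocked messages above can be summarized to show how long each dp-ipc thread was waited on. The following is a minimal sketch, not a VMware-provided tool: the function name longest_dp_ipc_block is hypothetical, and the pattern is based only on the "blocked N ms waiting for dp-ipcNN to quiesce" wording in the sample messages.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read syslog text on stdin and print, for each
# dp-ipc thread, the longest "blocked N ms" wait that was logged.
longest_dp_ipc_block() {
  grep -Eo 'blocked [0-9]+ ms waiting for dp-ipc[0-9]+ to quiesce' |
    awk '{ if ($2 + 0 > max[$6]) max[$6] = $2 + 0 }
         END { for (t in max) print t, max[t] " ms" }' |
    sort
}

# Example invocation on an Edge node (not run automatically):
#   cat /var/log/syslog* | longest_dp_ipc_block
```

For the sample messages above, this would report dp-ipc43 with a longest wait of 128000 ms.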

Environment

VMware NSX

VMware NSX-T Data Center 3.x and later

VMware NSX 4.x

 

Cause

More than 5,000 L3 firewall rules cause high CPU usage (up to 100%) on NSX Edge nodes.

To check the number of rules received by the Edge node:

  • Log in to the Edge node
  • Run the command cat /var/log/syslog* | grep -i got | grep -i rules

In the sample output below, there are 8258 L3 rules (more than 5,000):

2024-07-26T#########457Z ######-######-#### NSX 9697 FIREWALL [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="firewallcp" tname="dp-ipc43" level="INFO"] Got 8258 l3 rules and 0 l2 rules (ignored), cmd count 1989 for vif ######-######-####
2024-07-26T#########185Z ######-######-#### NSX 9697 FIREWALL [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="firewallcp" tname="dp-ipc43" level="INFO"] Got 8258 l3 rules and 0 l2 rules (ignored), cmd count 1989 for vif ######-######-####
2024-07-26T#########105Z ######-######-#### NSX 9697 FIREWALL [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="firewallcp" tname="dp-ipc43" level="INFO"] Got 8258 l3 rules and 0 l2 rules (ignored), cmd count 1989 for vif ######-######-####
2024-07-26T#########967Z ######-######-#### NSX 9697 FIREWALL [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="firewallcp" tname="dp-ipc43" level="INFO"] Got 8258 l3 rules and 0 l2 rules (ignored), cmd count 1989 for vif ######-######-####
2024-07-26T#########288Z ######-######-#### NSX 9697 FIREWALL [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="firewallcp" tname="dp-ipc43" level="INFO"] Got 8258 l3 rules and 0 l2 rules (ignored), cmd count 1989 for vif ######-######-####
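
The check above can be scripted so that the rule count is compared with the 5,000-rule figure directly. The following is a minimal sketch, not a VMware-provided tool: the function names max_l3_rules and check_l3_rules are hypothetical, the pattern is based only on the "Got N l3 rules" wording in the sample output, and the default limit of 5000 is the figure quoted in this article.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read syslog text on stdin and print the largest
# L3 rule count found in "Got N l3 rules" messages (0 if none match).
max_l3_rules() {
  grep -Eio 'got [0-9]+ l3 rules' |
    awk '{ if ($2 + 0 > max) max = $2 + 0 } END { print max + 0 }'
}

# Hypothetical helper: compare that count with a limit (default 5000,
# the figure quoted in this article) and print a one-line verdict.
check_l3_rules() {
  limit="${1:-5000}"
  count="$(max_l3_rules)"
  if [ "$count" -gt "$limit" ]; then
    echo "ABOVE LIMIT: $count L3 rules (limit $limit)"
  else
    echo "OK: $count L3 rules (limit $limit)"
  fi
}

# Example invocation on an Edge node (not run automatically):
#   cat /var/log/syslog* | check_l3_rules
```

The largest count is used rather than the latest one so that a single pass over rotated syslog files still flags an Edge node that exceeded the limit at any point in the retained logs.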

Resolution

Reduce the number of L3 firewall rules to 5,000 or fewer.

For the NSX configuration maximums, refer to VMware Configuration Maximums.