High usage of allocated resources leads to unknown status for NSX-T Edge
VMware NSX-T 3.2.x
Scaled-up logical routers carrying high traffic rates and a large number of firewall connections may cause the edge to show an "unknown" status in the NSX UI.
The edge syslog file may contain log entries similar to the following:
SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked xxxx ms waiting for dp-ipc31 to quiesce
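To see how often this warning occurs and how long the thread was blocked, the matching entries can be pulled out of a log file. The following is a minimal sketch, not a VMware-supplied tool: the function name blocked_ms is hypothetical, and it assumes the entries follow the format shown above with a numeric value in place of xxxx.

```shell
# Hypothetical helper: list the thread name and blocked duration (ms) from
# datapathd "waiting for dp-ipcNN to quiesce" warnings in a log file.
blocked_ms() {
    # $1: path to the log file to scan (pass it explicitly; the location may vary)
    grep -E 'blocked [0-9]+ ms waiting for dp-ipc[0-9]+ to quiesce' "$1" |
        sed -E 's/.*blocked ([0-9]+) ms waiting for (dp-ipc[0-9]+) to quiesce.*/\2 \1/'
}
```

For example, blocked_ms /var/log/syslog | sort -k2,2n | tail -n 5 would list the five longest blocked intervals. The path /var/log/syslog is an assumption here and should be replaced with the actual location of the edge syslog file.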
A high number of firewall connections can prevent the dp-ipc thread from running efficiently, which in turn impairs the functioning of the edge.
Use the following command to determine the number of firewall connections:
#edge-appctl -t /var/run/vmware/edge/dpd.ctl fw/show connections > connections.txt
#less connections.txt
For NSX-T 3.2.x, around 125k firewall connections would be an example of a high number.
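A rough comparison of the saved output against that figure could look like the sketch below. The function name check_connections is hypothetical, and it assumes that connections.txt holds about one connection per non-empty line. That output layout is not confirmed here, and any header or summary lines would inflate the count, so treat the result as an estimate only.

```shell
# Hypothetical helper: rough size check of a saved "fw/show connections" dump.
# Assumes about one connection per non-empty line (unverified assumption).
check_connections() {
    # $1: saved output file; $2: threshold (defaults to 125000, i.e. about 125k)
    file=$1
    threshold=${2:-125000}
    # grep -c prints 0 but exits non-zero on an empty file, hence "|| true"
    count=$(grep -c . "$file" || true)
    if [ "$count" -ge "$threshold" ]; then
        echo "HIGH: $count lines (threshold $threshold)"
    else
        echo "OK: $count lines (threshold $threshold)"
    fi
}
```

Running check_connections connections.txt reports whether the line count has reached the default threshold. A different threshold can be passed as the second argument.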
This high-traffic condition impacts periodic tasks such as state purging. These tasks may take longer to complete, which prevents the dp-ipc thread from getting a chance to finish the configuration tasks.
Increase the number of edges in the cluster. Adding more edges spreads the traffic across multiple edges, resulting in a reduced traffic load on each edge.