Title: Alarm for Edge Datapath CPU usage high
Event ID: edge_health.edge_datapath_cpu_high
Alarm Description
Purpose: Indicates Edge Datapath CPU usage is high
Impact: Rx drops will be observed when usage reaches 100%
Environment
VMware NSX-T Data Center
Edge Form factors:
Bare Metal Edge
VM Edge
Cause
Reason for high CPU usage:
Current CPU usage on the Edge node can be obtained by invoking the 'get dataplane cpu stats' Edge CLI command, which shows packets per second per CPU core and the CPU utilization. High CPU usage implies that one or all CPUs have reached maximum capacity.
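As a rough illustration of interpreting those figures, the Python sketch below flags cores that are at or above a utilization threshold. It assumes the per-core utilization values have been transcribed by hand from the CLI output into a dictionary. The function names and the 90% default threshold are hypothetical and are not NSX-T values.

```python
from typing import Dict, List


def find_busy_cores(usage_by_core: Dict[int, float],
                    threshold_pct: float = 90.0) -> List[int]:
    """Return the IDs of cores whose utilization is at or above threshold_pct.

    usage_by_core maps a core ID to its utilization in percent. The mapping is
    assumed to be filled in manually from the CLI output (hypothetical input).
    """
    return sorted(core for core, pct in usage_by_core.items()
                  if pct >= threshold_pct)


def describe_saturation(usage_by_core: Dict[int, float],
                        threshold_pct: float = 90.0) -> str:
    """Say whether no, some, or all cores are at or above the threshold."""
    busy = find_busy_cores(usage_by_core, threshold_pct)
    if not busy:
        return "no cores at or above threshold"
    if len(busy) == len(usage_by_core):
        return "all cores at or above threshold"
    return "cores at or above threshold: " + ", ".join(str(c) for c in busy)
```

For example, describe_saturation({0: 12.0, 1: 95.5, 2: 100.0}) reports cores 1 and 2, which corresponds to the "one CPU or a few CPUs saturated" case rather than the "all CPUs saturated" case.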
CPU usage increases when there is a large number of fragmented packets. Checking the MTU size along the path and adjusting the packet size can help reduce fragmentation.
The number of fragmented packets on the Logical router interface can be obtained using the 'get logical-router interface <Logical router interface UUID> stats' Edge CLI command. The Logical router interface UUID is obtained using the 'get interface' Edge CLI command under the Logical router VRF.
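To see why the path MTU matters, the following Python sketch applies standard IPv4 fragmentation arithmetic to estimate how many packets one oversized packet becomes on the wire. It assumes a 20-byte IPv4 header with no options. The function name is hypothetical, and the calculation is independent of any NSX-T tooling.

```python
import math

IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options


def ipv4_fragment_count(packet_bytes: int, mtu: int,
                        header_bytes: int = IPV4_HEADER_BYTES) -> int:
    """Return how many IPv4 packets are sent for one original packet.

    packet_bytes is the total length of the original packet, header included.
    """
    if mtu <= header_bytes or packet_bytes < header_bytes:
        raise ValueError("sizes must be at least as large as the IP header")
    if packet_bytes <= mtu:
        return 1  # fits within the MTU, no fragmentation needed
    payload = packet_bytes - header_bytes
    # Every fragment except the last carries a payload that is a multiple of 8 bytes.
    per_fragment = ((mtu - header_bytes) // 8) * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    return math.ceil(payload / per_fragment)
```

With a 1500-byte MTU, a 1600-byte packet is sent as 2 fragments and a 9000-byte packet as 7, so a small MTU mismatch along the path can double the number of packets the datapath has to process.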
Resolution
Steps to Resolve For 3.0.0 and higher
Recommended Action:
Consider increasing the Edge appliance form factor size and rebalancing services on this Edge node to other Edge nodes in the same cluster or other Edge clusters.
Higher CPU usage is expected with higher packet rates. If the packet rate on the Edge node is low while CPU usage is high, check whether flow-cache is disabled by invoking the 'get dataplane flow-cache config' Edge CLI command. If it is disabled, consider re-enabling it using the command 'set dataplane flow-cache enabled' followed by 'restart service dataplane'.
Note: Restarting the dataplane service will cause a momentary disruption in traffic.
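The flow-cache check above can be summarized as a small decision helper. The Python sketch below is illustrative only: the function name and the threshold parameters are hypothetical, and the caller must choose what counts as a "low" packet rate and "high" CPU usage for their environment. It does not execute anything. It only returns the Edge CLI commands named in this article, to be run manually.

```python
from typing import List, Optional


def flow_cache_next_steps(packets_per_second: float,
                          cpu_pct: float,
                          flow_cache_enabled: Optional[bool],
                          *,
                          low_pps_threshold: float,
                          high_cpu_threshold: float) -> List[str]:
    """Return the Edge CLI commands to run next, per the guidance above.

    flow_cache_enabled is None when the flow-cache state is not yet known.
    Both thresholds are caller-supplied assumptions, not NSX-T defaults.
    """
    low_rate = packets_per_second < low_pps_threshold
    high_cpu = cpu_pct >= high_cpu_threshold
    if not (low_rate and high_cpu):
        # High CPU with a high packet rate is expected; flow-cache is not the suspect.
        return []
    if flow_cache_enabled is None:
        return ["get dataplane flow-cache config"]
    if flow_cache_enabled:
        return []
    # Re-enabling requires a dataplane restart, which momentarily disrupts traffic.
    return ["set dataplane flow-cache enabled", "restart service dataplane"]
```

Keeping the thresholds as required keyword arguments avoids implying that any particular packet rate or CPU percentage is an official cutoff.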