NPM buffer usage - NfiQueuedPktsHigh alarm

Article ID: 167911

Products

XOS

Issue/Introduction

This article describes the NfiQueuedPktsHigh (NPM buffer usage) alarm, its possible causes, and areas to investigate. When the condition occurs, the NPM buffer usage alarm is raised and NfiQueuedPktsHigh events appear in the log /var/log/messages.

CBS# show alarms active major
...
Major:
 
 ID      Date             Source   Description
 --      ----             ------   -----------
2060     Dec 15 20:52:11  np1      NPM buffer usage
 

Dec 30 11:22:31 Xbeam cbshmonitord[3533]: [N] Violation (s=2, alarm) occurred 3 times: module:2, item:2111 (H_ID_OUT_BUF), time:"Thu Dec 30 11:22:29 2010"
Dec 30 11:22:31 Xbeam cbsalarmmond[3621]: [I] Received fault reply msg w/ 1 entries, seqnum=0
Dec 30 11:22:31 Xbeam cbsalarmmond[3621]: [I] NP2 Out of packet buffers - Reason: NfiQueuedPktsHigh (Max: 90541,Used: 79226)
Dec 30 11:22:31 Xbeam cbshmonitord[3533]: [N] Violation (s=1, no alarm) occurred: module:13, item:2905 (H_ID_MAJOR_LED_ON), time:[1293726151]"Thu Dec 30 11:22:31 2010" 
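
When reviewing /var/log/messages, it can help to extract the buffer figures from the cbsalarmmond line. The following is a minimal sketch, not an XOS tool: the pattern is modeled only on the sample line above, and the function and field names are hypothetical.

```python
import re

# Pattern modeled only on the sample cbsalarmmond line in this article;
# other XOS releases may format the message differently (assumption).
NFI_ALARM_RE = re.compile(
    r"(?P<npm>NP\d+) Out of packet buffers - Reason: (?P<reason>\w+) "
    r"\(Max: (?P<max>\d+),\s*Used: (?P<used>\d+)\)"
)


def parse_nfi_alarm(line):
    """Return the NPM name, reason, and buffer counts, or None if no match."""
    match = NFI_ALARM_RE.search(line)
    if match is None:
        return None
    max_pkts = int(match.group("max"))
    used = int(match.group("used"))
    return {
        "npm": match.group("npm"),
        "reason": match.group("reason"),
        "max": max_pkts,
        "used": used,
        # Fraction of the configured buffers currently in use.
        "usage": used / max_pkts if max_pkts else 0.0,
    }


sample = (
    "Dec 30 11:22:31 Xbeam cbsalarmmond[3621]: [I] NP2 Out of packet buffers"
    " - Reason: NfiQueuedPktsHigh (Max: 90541,Used: 79226)"
)
info = parse_nfi_alarm(sample)
print(info["npm"], info["reason"], info["max"], info["used"])
```

For the sample line, the usage fraction comes out just above 0.875, which matches the alarm threshold described under Cause.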

Cause

The alarm indicates a queuing condition during global new flow setup processing or during re-classification of existing flows. It is generated when the number of packets buffered for New Flow Initiation (NFI) reaches the alarm threshold.

When an NPM receives a packet for a new flow, the packet goes through the "slow path" for flow setup. The New Flow Classification process begins, and until the flow is set up completely, packets belonging to that flow are placed in the NFI queue buffer of the NPM that received them.

Buffer usage can increase due to unusual traffic arriving on the interface, as described below. When there is no returning traffic, the buffers are not flushed and, because the usual source/destination traffic pattern might not fall into the same hash, the buffer usage alarm does not clear.

The alarm threshold is 87.5% of the "Configured number of packets". For example, when the "Configured number of packets" is 90551, the threshold is 79232.125, so the alarm is triggered when "Current number of buffers in use" passes 79232.
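
The threshold arithmetic can be checked with a short sketch. The 87.5% figure comes from this article; the function names are hypothetical, and the strict "greater than" comparison is an assumption based on the wording "passes".

```python
ALARM_FRACTION = 0.875  # 87.5% of "Configured number of packets", per this article


def alarm_threshold(configured_packets):
    """Return the buffer count above which the alarm is expected to trigger."""
    return configured_packets * ALARM_FRACTION


def alarm_raised(configured_packets, buffers_in_use):
    """Assumption: the alarm triggers once usage strictly exceeds the threshold."""
    return buffers_in_use > alarm_threshold(configured_packets)


print(alarm_threshold(90551))        # 79232.125
print(alarm_raised(90551, 79232))    # False
print(alarm_raised(90551, 79233))    # True
```

The same check applied to the log sample in this article (Max: 90541, Used: 79226) also reports a value above the threshold, which is consistent with the alarm having been raised.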
 
The following 7-tuple items are taken into account for an active flow table entry:
  • source-address
  • destination-address
  • source-port
  • destination-port
  • protocol
  • domain ID
  • ingress (rx) circuit – where the destination MAC validation also occurs
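
To illustrate why the ingress circuit matters, the 7-tuple can be modeled as a lookup key. This is a minimal sketch, not the NPM implementation: the field names, the dictionary-based flow table, and the example addresses and circuit names are all hypothetical.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven items listed above, used together as one lookup key."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int
    domain_id: int
    rx_circuit: str


def is_new_flow(flow_table, key):
    """Return True and record the key if no matching entry exists yet."""
    if key in flow_table:
        return False
    flow_table[key] = "active"
    return True


table = {}
first = FlowKey("10.0.0.1", "10.0.1.1", 40000, 80, 6, 1, "circuit-a")
# Same addresses, ports, protocol, and domain, but a different ingress circuit.
second = first._replace(rx_circuit="circuit-b")

print(is_new_flow(table, first))   # True: no entry yet
print(is_new_flow(table, first))   # False: matches the existing entry
print(is_new_flow(table, second))  # True: different ingress circuit
```

In this model, the same conversation arriving on a different ingress circuit does not match the existing entry and is treated as a new flow, which is one reason the investigation questions under Resolution ask about hairpin routing and traffic that enters and leaves through the same port.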
 
Note: The alarm is fired to call attention to the queuing condition, but in most cases, there is no impact to the network. It is important to understand what is happening on the wire to learn if there is a legitimate reason for the alarm to occur.

Resolution

Possible causes and questions to investigate:
  1. Presence of hairpin routing.
  2. The same traffic entering and leaving through the same port.
  3. Load balancer in front of Crossbeam chassis.
  4. External Tap (IDS) configuration.
  5. In a DBHA configuration, are any devices using the source MAC address or a MAC address learned from an ARP reply (source MAC address vs. VRRP MAC address) when sending traffic?
  6. In the traffic profile, how many new connections per second?
  7. What is the TCP/UDP ratio and how much traffic is observed, especially UDP traffic?
Depending on the cause and the responses to the questions listed above, the solution may vary.

Newer versions of XOS provide an improved mechanism for managing buffers and customers may consider an upgrade. The following XOS versions include the improved buffer management scheme:
 
8.5.6
9.0.4
9.5.6
9.6.2

Workaround

The alarm normally clears when buffer usage drops below the alarm threshold (87.5% of "Configured number of packets") or when the NPM that raised the alarm is reloaded.