NPM8200 internal switches oversubscribed

Article ID: 168041

Products

XOS

Issue/Introduction

The NPM8200 discards packets because its internal switches are overloaded. If a customer detects silent packet loss when traffic passes through the chassis and sees interface input errors, the internal NPM switch, known as the GMAC, may be overloaded.

Cause

Use the following procedure to detect whether NPM8200 internal switches are overloaded.

- First check cables, SFPs, ports, and switch
- Check if NPM drops traffic (see solution #806)
- Check if firewall drops packets (fw ctl zdebug drop)
- Check if interface errors continue growing:

CBS# clear interface gigabitethernet 1/1
CBS# show interface gigabitethernet 1/1
Gigabitethernet 1/1 is up
Hardware address is 00:03:d2:11:b2:26
MTU 1500 bytes, BW 1 Gigabit, full-duplex, auto-negotiation is enabled
Last clearing of "show interface" counters never
1867526051936 packets input, 402280544082509 bytes
Received 300591297 broadcasts, 0 runts, 0 giants, 0 throttles
1246691127 input errors, 751 CRC, 0 frame, 0 overrun, 0 ignored
1858736280060 packets output, 400555380997905 bytes, 0 underruns
0 output errors, 0 collisions
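
To decide whether the input error counter is still growing, two snapshots of the `show interface` output taken some time apart can be compared. The following is a minimal Python sketch, not a Crossbeam tool: the function names `input_errors` and `errors_growing` are hypothetical, and it assumes the output has already been captured as text in the format shown above.

```python
import re

# Matches the "N input errors" counter in captured `show interface` output.
_INPUT_ERRORS = re.compile(r"(\d+)\s+input errors")


def input_errors(show_interface_output: str) -> int:
    """Return the input error counter from one captured snapshot."""
    match = _INPUT_ERRORS.search(show_interface_output)
    if match is None:
        raise ValueError("no 'input errors' counter found in output")
    return int(match.group(1))


def errors_growing(first_snapshot: str, second_snapshot: str) -> bool:
    """Return True if the counter increased between the two snapshots."""
    return input_errors(second_snapshot) > input_errors(first_snapshot)
```

If `errors_growing` returns True for two snapshots of the same interface, the errors are ongoing rather than historical.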

Resolution

Each NPM8200 has two internal switches (GMACs), which are responsible for the following physical ports:

GMAC-A: 1/1, 1/2, 1/3 and 1/4
GMAC-B: 1/5, 1/6, 1/7 and 1/8

Each GMAC can handle 1 Gbps in total, yet each of its four physical ports also has 1 Gbps of capacity, so the ports behind one GMAC can together receive more traffic than the GMAC can forward. The easiest way to detect this issue is with the swatch utility. In the output below, the first ports (1/1 and 1/3) show a very large number of RxErrors. If the sum of the "Rx Data Rate" column for the GMAC-A interfaces (1/1 to 1/4) exceeds 1 Gbps, GMAC-A is overloaded and the internal switch drops packets silently.

To avoid this, the customer should redistribute the traffic, using MLT, to the higher ports (1/5, 1/6, 1/7, 1/8) or to other NPMs. The same applies to GMAC-B: if it is overloaded, move traffic to the lower ports or to other NPMs.

# cd /crossbeam/bin
# ./swatch npmdevstats

NPM interface statistics

Interface   Stat          TxPkts    TxDrops   TxErrors          RxPkts    RxDrops   RxErrors
----------- ---- --------------- ---------- ---------- --------------- ---------- ----------
GigaEth1/1  up     1912787755113          0          0   1930684093933          0 1518337178
GigaEth1/2  down     91447624483          0          0     87914416626          0  295970849
GigaEth1/3  up      171344638002          0          0    175202263631          0   61500687
GigaEth1/4  down               0          0          0               0          0          0
GigaEth1/5  up         452635621          0          0       623733354          0          0
GigaEth1/6  up       46410559510          0          0     46265931742          0          0
GigaEth1/7  up        6469006938          0          0      4577194295          0          0
GigaEth1/8  up       39994450908          0          0     39331594210          0          0

(press '+' twice to go to "NPM Data Rate Statistics" view)

NPM Data Rate Statistics

                       Tx Data       Tx Data      Rx Data      Rx Data
                          Rate     Rate Peak         Rate    Rate Peak
Interface   Stat        (Mbps)        (Mbps)       (Mbps)       (Mbps)
----------- ------ ----------- ------------- ------------ ------------
GigaEth1/1  up         938.761      1071.725     1073.066     1272.428
GigaEth1/2  down         0.000         0.000        0.000        0.000
GigaEth1/3  up         358.851       377.980      290.932      391.334
GigaEth1/4  down         0.000         0.000        0.000        0.000
GigaEth1/5  up           0.395         0.848        0.028        2.866
GigaEth1/6  up          15.282       124.564       59.108      104.106
GigaEth1/7  up         954.439      1016.336      835.481      930.816
GigaEth1/8  up          33.667        53.919       42.710       63.446


In the above example, 1/1 and 1/3 together receive more than 1 Gbps (1073.066 + 290.932 = 1363.998 Mbps in the "Rx Data Rate" column). This causes the errors in the NPM and the silent packet drops. This is an NPM8200 hardware limitation that is not present in the NPM86x0 family.
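
The per-GMAC sum can also be computed from the interface rows of the "NPM Data Rate Statistics" view. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the rows are assumed to have been copied as text with the six whitespace-separated fields shown above, and 1 Gbps is taken as 1000 Mbps.

```python
# Port-to-GMAC mapping as described in this article.
GMAC_PORTS = {
    "GMAC-A": ("1/1", "1/2", "1/3", "1/4"),
    "GMAC-B": ("1/5", "1/6", "1/7", "1/8"),
}
GMAC_CAPACITY_MBPS = 1000.0  # assumption: "1 Gbps" taken as 1000 Mbps


def parse_rx_rates(data_rate_rows: str) -> dict:
    """Map each port (e.g. "1/1") to its current Rx data rate in Mbps."""
    rates = {}
    for line in data_rate_rows.splitlines():
        fields = line.split()
        # Expected row: name, status, Tx rate, Tx peak, Rx rate, Rx peak
        if len(fields) == 6 and fields[0].startswith("GigaEth"):
            port = fields[0][len("GigaEth"):]
            rates[port] = float(fields[4])
    return rates


def gmac_rx_load(rates: dict) -> dict:
    """Sum the Rx data rate of the ports behind each GMAC."""
    return {
        gmac: sum(rates.get(port, 0.0) for port in ports)
        for gmac, ports in GMAC_PORTS.items()
    }


def overloaded_gmacs(load: dict, capacity: float = GMAC_CAPACITY_MBPS) -> list:
    """Return the names of GMACs whose summed Rx rate exceeds capacity."""
    return [gmac for gmac, mbps in load.items() if mbps > capacity]
```

With the sample rows above, GMAC-A sums to about 1364 Mbps and GMAC-B to about 937 Mbps, so only GMAC-A is reported as overloaded.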

Crossbeam recommends splitting the traffic between both GMACs or across more than one NPM8200, or replacing the NPM8200 with an NPM86x0.

If NPM8200 internal commands are needed to corroborate this, please contact Customer Support for further troubleshooting.

Workaround

N/A