Massive Issues with event processing since update to version 10.4.2.2

Article ID: 220906

Products

CA Spectrum

Issue/Introduction


Since the update to Spectrum 10.4.2.2, we have had massive problems with event processing in Spectrum. The alert views are no longer correct, so alerting in Spectrum as a whole is unreliable. False alarms are merely annoying; our main concern is lost alarms.
 
Some examples are:
 
1. Link condition is suppressed although oper status = up. We see this on more than 1300 ports.
 1. Event 0x10d66 occurs many thousands of times, but the interfaces are always up.
 2. Event 0x10d67 (started responding to polls) is missing on these ports. This should not be the case according to https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=806627 .
 3. Even when event 0x10d10 (status good) occurs, the interface condition very often remains suppressed.
2. Application models behave similarly: many "stopped responding to polls" (0x10d09) events are followed by "started responding to polls" (0x10d0b) events, although the SNMP agent is reachable.
3. We also see "device has been rebooted" alarms without any evidence of a reboot. The uptime of the device is many weeks.
4. Since the Spectrum update we also see "Link Aggregation Condition is critical" alarms without any port being down.

As a result, a large number of ports are shown as suppressed.

 

Cause

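The output of "netstat -suna" showed many UDP packet receive errors. This often means that UDP traffic is arriving faster than the socket receive buffer can absorb it, so UDP packets are dropped. Because SNMP polling runs over UDP, dropped responses are consistent with the false "stopped responding to polls" events described above.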
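To check whether drops are still occurring, the relevant counter can be pulled out of the netstat statistics and compared between two samples. The following is a minimal sketch, not part of the original case: the function name `udp_rcvbuf_errors` is hypothetical, and it assumes the usual Linux net-tools layout in which the "Udp:" section contains a line ending in "receive buffer errors".

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -su` output on stdin and print the
# UDP "receive buffer errors" counter (0 if the line is not present).
udp_rcvbuf_errors() {
    awk '
        /^Udp:/        { in_udp = 1; next }   # start of the Udp: section
        /^[A-Za-z]+:/  { in_udp = 0 }         # any other section header ends it
        in_udp && /receive buffer errors/ { print $1; found = 1 }
        END { if (!found) print 0 }
    '
}

# Example use on the SpectroSERVER host (take two samples a minute apart;
# a rising value suggests packets are still being dropped):
#   netstat -su | udp_rcvbuf_errors
#   sleep 60
#   netstat -su | udp_rcvbuf_errors
```

The counter is cumulative since boot, so only the difference between two samples indicates current behavior.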

Environment

Release : 10.4.2.2.

Component : Spectrum Core / SpectroSERVER

Red Hat Linux

Resolution

 (1) Increased the DCM timeout value of all devices to 6 seconds.
 (2) Raised the kernel socket buffer sizes to 8 MB (8388608 bytes) and set the network device backlog to 2000 using the following sysctl commands:
  
       sysctl -w net.core.rmem_default=8388608
       sysctl -w net.core.wmem_default=8388608
       sysctl -w net.core.rmem_max=8388608
       sysctl -w net.core.wmem_max=8388608
       sysctl -w net.core.netdev_max_backlog=2000
       
 These changes can be made permanent by adding the above configuration entries to /etc/sysctl.conf; values set with "sysctl -w" alone are lost on reboot. In future Spectrum releases, the upgrade scripts will apply these settings automatically.
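One way to persist the settings is sketched below. This is an illustration rather than a procedure from the original case: the function name `persist_udp_buffers` is hypothetical, and it assumes the target file uses the standard "key=value" sysctl.conf syntax. It appends each entry only if the file does not already set that key.

```shell
#!/bin/sh
# Hypothetical helper: append the settings from this article to a sysctl
# configuration file (default /etc/sysctl.conf), skipping keys already set.
persist_udp_buffers() {
    conf="${1:-/etc/sysctl.conf}"
    for entry in \
        net.core.rmem_default=8388608 \
        net.core.wmem_default=8388608 \
        net.core.rmem_max=8388608 \
        net.core.wmem_max=8388608 \
        net.core.netdev_max_backlog=2000
    do
        key="${entry%%=*}"
        # Look for a line that starts with the key followed by "=".
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf" 2>/dev/null; then
            printf '%s\n' "$entry" >> "$conf"
        fi
    done
}

# Example use as root, then reload the file and confirm one value:
#   persist_udp_buffers /etc/sysctl.conf
#   sysctl -p
#   sysctl net.core.rmem_max
```

Because existing keys are left untouched, a file that already sets one of these keys to a different value should be reviewed by hand.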