We are seeing the "A TRAP STORM HAS BEEN DETECTED" alarm generated frequently on some devices.
How does Spectrum determine a trap storm has been detected?
How do you configure the Spectrum trap storm settings?
How do you allow for more traps than the default configuration before trap storm functionality is triggered?
All supported Network Observability DX NetOps Spectrum releases
The CA Spectrum Trap Management Subview section of the VNM Attributes in the Information Tab documentation topic explains how Spectrum Trap Storm detection works.
It does not, however, explain the underlying logic the code uses to decide whether to trigger the trap storm functionality.
Trap storm detection can be enabled at the SpectroSERVER level or for a specific modeled device.
At either level, the behavior is controlled by the following attributes.
Attribute Name: traps_per_sec_storm_threshold
Attribute ID: 0x122db
Attribute Definition: Defines the rate at which traps are received per second from a managed or unmanaged device. When this rate is sustained for the amount of time that is specified by the TrapStormLength attribute, the SpectroSERVER stops the processing of traps from that unmanaged or managed device.
Default Value: 20 traps per second
When trap storm detection is enabled with the default configuration (20 traps per second, over the default 5 second TrapStormLength window), Spectrum triggers the functionality when it receives 100 or more traps from a device within a 5 second period.
When the traps received from a device reach the configured thresholds, the SpectroSERVER identifies this rate as a trap storm and stops handling traps from that device; traps from other devices are not blocked. Trap storm detection is tracked per IP address of each unmanaged or managed device (trap source) that sends traps to the SpectroSERVER. As a result, you can configure each device to send traps to the SpectroSERVER at an appropriate rate.
An important point to keep in mind is the word "rate" in the details above. The underlying formula Spectrum uses to determine whether there is a trap storm is as follows:
in_storm = ( sum/TrapStormLength >= trap_storm_size ) ? TRUE : FALSE;
Here "sum" is the number of traps received within the rolling time window, and trap_storm_size corresponds to the traps_per_sec_storm_threshold value. Using this formula with the default values for traps_per_sec_storm_threshold (20) and TrapStormLength (5 seconds), if a device received 100 traps (sum) in 3 seconds, the calculation would be as follows:
100/5 >= 20
In the above scenario, even though the 100 traps arrived within only a 3 second period, the formula averages them over the full 5 second TrapStormLength window. Because 100/5 = 20 meets the threshold of 20 traps per second, Spectrum detects a trap storm, asserts the alarm, and stops processing traps from that device until the rate falls below the configured thresholds.
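The formula above can be sketched in Python. This is a hypothetical illustration, not Spectrum's actual code; the attribute names are mapped to plain function parameters:

```python
def is_trap_storm(trap_sum: int, trap_storm_length: int, trap_storm_size: int) -> bool:
    """Mirror of the documented check: average the traps received in the
    rolling window over the full TrapStormLength, then compare against
    traps_per_sec_storm_threshold (trap_storm_size)."""
    return trap_sum / trap_storm_length >= trap_storm_size

# 100 traps received (even if they all arrived within 3 seconds),
# with the default TrapStormLength = 5 and threshold = 20:
print(is_trap_storm(100, 5, 20))  # 100/5 = 20   >= 20 -> True
print(is_trap_storm(99, 5, 20))   # 99/5  = 19.8 <  20 -> False
```

Note that the division is over TrapStormLength, not over the time the traps actually took to arrive, which is why a fast burst still counts as a storm.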
When 100 or more traps arrive in the specified time frame, it can be any distribution of traps per second across the TrapStormLength window. Using the default 5 seconds, the window breaks down like this:

Second1   Second2   Second3   Second4   Second5
      1        96         1         1         1     (100 traps - triggers the functionality)
     54        14        17        10         5     (100 traps - triggers the functionality)
     96         1         1         1         0     (99 traps - does NOT trigger the functionality)

The last series does not trigger the functionality because only 99 traps (arriving over the first 4 seconds) fell within the 5 second rolling window.
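The series above can be checked with a small rolling-window simulation. This is a hypothetical sketch, not Spectrum code; it slides the TrapStormLength window across per-second trap counts and applies the same sum/TrapStormLength formula:

```python
def storm_detected(per_second_counts, window=5, threshold=20):
    """Return True if any `window`-second span of the per-second counts
    averages at or above `threshold` traps per second
    (i.e. sum/window >= threshold)."""
    for start in range(max(1, len(per_second_counts) - window + 1)):
        window_sum = sum(per_second_counts[start:start + window])
        if window_sum / window >= threshold:
            return True
    return False

print(storm_detected([1, 96, 1, 1, 1]))     # 100 traps -> True
print(storm_detected([54, 14, 17, 10, 5]))  # 100 traps -> True
print(storm_detected([96, 1, 1, 1, 0]))     # 99 traps  -> False
```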
If the feature is enabled and the environment contains devices that legitimately send traps at rates above the default configuration, adjust these attributes so that trap storm detection does not limit the ability to receive traps.
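As a rough sizing aid when adjusting the attributes, the following sketch (a hypothetical helper, not part of Spectrum) computes the smallest traps_per_sec_storm_threshold that would keep a device's observed peak burst below the trigger point for a given TrapStormLength window:

```python
import math

def min_threshold(peak_traps_in_window: int, trap_storm_length: int = 5) -> int:
    """Smallest integer threshold T such that
    peak_traps_in_window / trap_storm_length < T, i.e. the burst no
    longer satisfies sum/TrapStormLength >= T."""
    return math.floor(peak_traps_in_window / trap_storm_length) + 1

# A device legitimately bursting 150 traps within a 5 second window
# would need traps_per_sec_storm_threshold raised to at least 31:
print(min_threshold(150))  # -> 31
```

Raising TrapStormLength instead of (or as well as) the per-second threshold has the same effect of lowering the computed average, so either attribute can be tuned depending on whether the legitimate traffic is bursty or sustained.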