Smarts IP: How to prevent domain crashes caused by trap storms; How to configure the trap processor to deal with a spike in trap rate which may result in Out of Memory error

Article ID: 304140

Products

VMware Smart Assurance

Issue/Introduction

Symptoms:


How to prevent Smarts IP domain crashes caused by trap storms (spikes in the trap rate).

An SNMP trap storm occurred on the network and the Smarts IP domain crashed. The following is logged:

SNMP_TrapsHandler [ANY:9000] Receiver
CI-F-ETHREAD-Thread SNMP_TrapsHandler [ANY:9000] Receiver(#23) threw exception
 CI-ECPPEXCEPTION-C++ standard exception:  Out of Memory
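
To check whether a log contains this signature, a small helper along the following lines may be useful. It is only a sketch: the function name count_trap_oom is hypothetical, and the log file path must be supplied by the caller because the log location is not stated in this article.

```shell
#!/bin/sh
# Count how often the trap-handler thread exception is followed directly
# by an "Out of Memory" line in the given log file.
# count_trap_oom is a hypothetical helper name, not a Smarts utility.
count_trap_oom() {
    log_file="$1"
    awk '
        # Remember that the previous line was the thread exception entry.
        /CI-F-ETHREAD-Thread SNMP_TrapsHandler/ { hit = 1; next }
        # Count only when the very next line reports Out of Memory.
        hit && /Out of Memory/ { n++ }
        { hit = 0 }
        END { print n + 0 }
    ' "$log_file"
}
```

Calling count_trap_oom with the path of a domain log prints the number of matching occurrences; a result of 0 means the signature was not found in that file.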

Environment

VMware Smart Assurance - SMARTS

Cause

The TrapReceiverCallback process in the IP domain is flooded with traps, so its queue grows until it consumes the available memory. The IP domain then crashes with an Out of Memory error.

The flood reaches the TrapReceiverCallback process in the IP domain through whichever component forwards traps to it: the Smarts Trap Exploder, the Trap Adapter, or a third-party trap forwarder.

Please note: The settings in the trapd.conf file of the IP Domain Manager installation do not apply to the IP Domain Manager, so the trap flood must be controlled outside of the IP Domain Manager.
 

Resolution

Please note:  This resolution applies to the SAM Trap Exploder / Trap Adapter domain that is forwarding traps to the IP domain.

The immediate solution/workaround to avoid a crash is to set up trap queue limits so that the queues do not grow beyond a configurable size. 

If a trap spike occurs while queue limits are set, traps are dropped once a queue limit is reached, which makes room for further traps instead of letting the queue grow.

The queue limits can be set either in trapd.conf (restart required) or with dmctl commands (no restart required, but the settings do not persist across a domain restart).

1.) trapd.conf method:

<BASEDIR>/smarts/bin/sm_edit conf/trapd/trapd.conf

Uncomment the lines below and set a value for each:

#QUEUE_LIMIT_MEGS: 0
#QUEUE_LIMIT_SECONDS: 0

New values:

QUEUE_LIMIT_MEGS: 450

QUEUE_LIMIT_SECONDS: 60

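Restart the domain whose trapd.conf was edited (the Trap Exploder / Trap Adapter domain that forwards traps to the IP domain) so that the new values take effect.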
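
As a quick check after editing, the active (uncommented) queue-limit lines can be listed from the file. This is a minimal sketch: show_queue_limits is a hypothetical helper name, and the path to trapd.conf is passed in by the caller because it depends on the installation's <BASEDIR>.

```shell
#!/bin/sh
# Print the uncommented QUEUE_LIMIT_* lines of a trapd.conf-style file.
# Lines that still start with '#' are ignored, so empty output means
# neither limit is active in that file.
# show_queue_limits is a hypothetical helper name, not a Smarts utility.
show_queue_limits() {
    conf_file="$1"
    grep -E '^[[:space:]]*QUEUE_LIMIT_(MEGS|SECONDS):' "$conf_file"
}
```

With the values from this article in place, the helper would print the QUEUE_LIMIT_MEGS and QUEUE_LIMIT_SECONDS lines; it returns a non-zero exit status when neither line is active.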


2.) DMCTL method:

Get Trap Manager instance name:
dmctl -s <Domain> getI SNMP_TrapManager

Add queue limits to the instance:
dmctl -s <Domain> put SNMP_TrapManager::<Trap_Manager_Instance>::MegabytesInQueue 450
dmctl -s <Domain> put SNMP_TrapManager::<Trap_Manager_Instance>::SecondsInQueue 60
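
Because the dmctl settings are lost on a domain restart, it can help to keep the two put commands in a small script. The sketch below only prints the commands from the steps above for a given domain and instance; it does not run dmctl. The function name print_queue_limit_cmds and the example names TRAP-EXPLODER and Example-TrapManager are hypothetical placeholders.

```shell
#!/bin/sh
# Print the two dmctl commands that set the queue limits on one
# SNMP_TrapManager instance. Nothing is executed here; review the output
# and run it manually (or pipe it to sh) once the names are confirmed.
# print_queue_limit_cmds is a hypothetical helper name.
print_queue_limit_cmds() {
    domain="$1"; instance="$2"; megs="$3"; seconds="$4"
    printf 'dmctl -s %s put SNMP_TrapManager::%s::MegabytesInQueue %s\n' \
        "$domain" "$instance" "$megs"
    printf 'dmctl -s %s put SNMP_TrapManager::%s::SecondsInQueue %s\n' \
        "$domain" "$instance" "$seconds"
}

# Example with placeholder names (both names are assumptions):
print_queue_limit_cmds TRAP-EXPLODER Example-TrapManager 450 60
```

The instance name should be the one returned by the getI command above. If it contains spaces or shell metacharacters, it needs to be quoted before the printed commands are executed.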

 


Additional Information

Please note: This resolution applies only to the SAM Trap Exploder / Trap Adapter domain.

The TrapReceiverCallback process in the IP domain does not use the trapd.conf file that is present in the IP domain installation. An IP domain that is itself subjected to a trap flood can therefore still crash.