SpectroSERVER CPU at constant 100%

Article ID: 182960

Products

CA Spectrum, CA eHealth

Issue/Introduction

The SpectroSERVER process constantly uses 100% CPU (100% of one processor). As a result, navigating the
   landscape is slow and the application is sluggish overall.

Cause


The SpectroSERVER was being flooded with SNMPv3 traps from one specific device. The trap volume was high
  enough that the SpectroSERVER was continuously busy processing traps in an attempt to keep up.

Environment

Release : 10.3.x, 10.4.x, 20.x

Component : SpectroSERVER

Resolution

Reconfigure the identified device to send only the traps that are needed, which stops the
  continual flood. In this case, once the device was reconfigured, the SpectroSERVER CPU
  utilization returned to normal.

Additional Information



Symptoms:


If devices are also being polled via SNMPv3, you may see continual messages like the following in the VNM.OUT file:

VNM.OUT
--------------------------------
Oct 01 09:45:13 WARNING at CsHPSERequestSender.cc(569): SNMP send took 1024ms! Device IP: 10.xxx.xxx.xxx Request ID: 21227 Operation Type: 0 Community: #v3/P:SHA^authpassword#wq:AES^privpass/snmpUser
 SendRetCode: 0 usReturnCode: 0 Timeout: 60000 Device Port: 161 m_strCommunity: WDCS
 m_nVersion: 3 m_strContextName:  m_nSecLevel: 3 m_nAuthProtocol: 2
 m_strAuthPwd: authpassword m_nPrivProtocol: 2 m_strPrivPwd: privpassword
Oct 01 09:45:15 WARNING at CsHPSERequestSender.cc(569): SNMP send took 1566ms! Device IP: 10.xxx.xxx.xxx Request ID: 21228 Operation Type: 0 Community: #v3/P:SHA^authpassword#wq:AES^privpass/snmpUser
 SendRetCode: 0 usReturnCode: 0 Timeout: 60000 Device Port: 161 m_strCommunity: WDCS
 m_nVersion: 3 m_strContextName:  m_nSecLevel: 3 m_nAuthProtocol: 2
 m_strAuthPwd: authpassword m_nPrivProtocol: 2 m_strPrivPwd: privpassword
Oct 01 09:45:16 WARNING at CsHPSERequestSender.cc(569): SNMP send took 1018ms! Device IP: 10.xxx.xxx.xxx Request ID: 20995 Operation Type: 0 Community: #v3/P:SHA^authpassword#wq:AES^privpass/snmpUser
 SendRetCode: 0 usReturnCode: 0 Timeout: 60000 Device Port: 161 m_strCommunity: wdcs
 m_nVersion: 3 m_strContextName:  m_nSecLevel: 3 m_nAuthProtocol: 2
 m_strAuthPwd: authpassword m_nPrivProtocol: 1 m_strPrivPwd: privpassword
Oct 01 09:45:16 WARNING at SnmpItcInterface.cc(156): SNMP failsafe - Retrying request( IP=172.xxx.xxx.xxx, community=theOffice, id=20863) after 360000ms.
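
To gauge how widespread the slow sends are, the warnings can be tallied per device. The following Python is a minimal sketch, not a supported tool: it assumes the warning text matches the "SNMP send took ...ms! Device IP: ..." format shown above, and the function name and sample addresses are hypothetical.

```python
import re
from collections import defaultdict

# Matches the "SNMP send took <N>ms! Device IP: <addr>" warnings shown above.
# Assumption: the message layout is the same in your VNM.OUT.
SLOW_SEND = re.compile(r"SNMP send took (\d+)ms! Device IP: (\S+)")


def summarize_slow_sends(lines):
    """Return {device_ip: (warning_count, worst_send_ms)}."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        match = SLOW_SEND.search(line)
        if match:
            ms, ip = int(match.group(1)), match.group(2)
            stats[ip][0] += 1
            stats[ip][1] = max(stats[ip][1], ms)
    return {ip: (count, worst) for ip, (count, worst) in stats.items()}


# Hypothetical sample lines (documentation addresses, not real devices).
sample = [
    "Oct 01 09:45:13 WARNING at CsHPSERequestSender.cc(569): SNMP send took 1024ms! Device IP: 192.0.2.10 Request ID: 21227",
    "Oct 01 09:45:15 WARNING at CsHPSERequestSender.cc(569): SNMP send took 1566ms! Device IP: 192.0.2.10 Request ID: 21228",
    "Oct 01 09:45:16 WARNING at CsHPSERequestSender.cc(569): SNMP send took 1018ms! Device IP: 192.0.2.20 Request ID: 20995",
    "Oct 01 09:45:16 WARNING at SnmpItcInterface.cc(156): SNMP failsafe - Retrying request",
]

summary = summarize_slow_sends(sample)
for ip, (count, worst) in sorted(summary.items()):
    print(f"{ip}: {count} slow sends, worst {worst}ms")
```

To run it against a real log, pass an open file handle for VNM.OUT instead of the sample list. Slow sends spread across many devices suggest the SpectroSERVER itself is the bottleneck rather than any single polled device.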
 
 
 
 
 
How this was identified:

Initially, we used the Sysinternals ProcDump utility to generate a series of dump (.dmp) files to analyze. Notably,
  the dump files showed the SpectroSERVER's moot thread sitting idle, so it was not the source of the CPU load:

ntdll!NtWaitForSingleObject+0xa
KERNELBASE!WaitForSingleObjectEx
libmoot!CsMoot::suspend
libmoot!CsMootBase
libmoot!spin
libmoot!call_func
kernel32!CreateFiberEx

Using the Sysinternals Process Explorer utility, we were able to find the problem thread:



Right-click the process > Properties > Threads. (TID 3996 was constantly using more than 48% CPU.)



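We captured the stack for that thread and could see that it was constantly processing traps in the TrapReceiver. The crypto library frames near the top of the stack are consistent with SNMPv3 authentication and decryption work: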

ntoskrnl.exe!_misaligned_access+0x13fd
ntoskrnl.exe!KeSynchronizeExecution+0x4552
libcaopenssl_crypto.dll!caopenssl_cryptolib_get_interface+0x75b19
libcaopenssl_crypto.dll!caopenssl_cryptolib_get_interface+0x765dd
libcapki.dll!capki_lib_get_interface+0x7d278
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x3b97
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x1c714
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x1c90f
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x28c97
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x27570
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x2ceda
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x2bbac
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x21807
libhpse.dll!TrapxTrapReceiver::initialize_socket+0x1dc41
libhpse.dll!CsHPSETrapReceiver::DoTask+0xb64
libhpse.dll!CsHPSETrapReceiver::DoTask+0xf5
libhpse.dll!CsHPSETask::threadRun+0x3d
KERNEL32.DLL!BaseThreadInitThunk+0x22
ntdll.dll!RtlUserThreadStart+0x34

A Wireshark trace (capture filter 'port 162') showed that one specific device was flooding the SpectroSERVER with
  non-stop SNMPv3 traps.
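
When the capture is too large to inspect by eye, counting packets per source address makes the flooding device stand out. The sketch below is illustrative only and assumes the source addresses have already been exported from the capture, one per packet (for example with `tshark -r traps.pcap -T fields -e ip.src`, where the file name is hypothetical). The function name and sample addresses are also hypothetical.

```python
from collections import Counter


def top_trap_senders(source_ips, limit=5):
    """Count packets per source address and return the busiest senders."""
    counts = Counter(ip.strip() for ip in source_ips if ip.strip())
    return counts.most_common(limit)


# Hypothetical export: one source address per captured trap packet.
sample_sources = ["192.0.2.10"] * 950 + ["192.0.2.20"] * 30 + ["192.0.2.30"] * 20

for ip, count in top_trap_senders(sample_sources):
    share = 100 * count / len(sample_sources)
    print(f"{ip}: {count} packets ({share:.1f}%)")
```

A single address that accounts for the overwhelming majority of packets on port 162 is the likely source of the flood.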


Once this device was reconfigured to stop sending traps, the SpectroSERVER's CPU utilization returned to normal.
