SpectroSERVER process taking 100% CPU due to Moogsoft, and Tomcat running out of memory

Article ID: 138174

Products

CA Spectrum DX NetOps

Issue/Introduction

After enabling the Moogsoft integration with Spectrum, major performance problems occur. The SpectroSERVER process may consume 100% CPU, and users may be unable to log in to OneClick.  If stack dumps are gathered for performance analysis, the SpectroSERVER stack output shows alarm filtering, as seen below.  OneClick Tomcat thread dumps show POST activity in getAlarmsByXml:

 

at com.ca.spectrum.restful.servlet.AlarmServlet._POST_getAlarmsByXml(AlarmServlet.java:566)

 

Linux SS stack output:

#1 0x00007f17d35af1b9 in CsGenAttrsIter::get_next_attr() () from /opt/SPECTRUM/lib/libVPapi.so.1

#2 0x00007f17d774a76a in CsGlobalAlarmClient::copy_desired_attrs(CsGlobalAlarmAttrs const&, CsAttrReadReqVPList const&, CsGlobalAlarmAttrs*) () from /opt/SPECTRUM/lib/../SS/libgas.so.1

#3 0x00007f17d7747030 in CsGlobalAlarmClientHandler::get_alarms(CsCAttribute::CsCValue::_VISanon_seq_0_CsCAttribute__CsCValue const*, CsAttrReadReqVPList*, CsSecurityIf const*) () from /opt/SPECTRUM/lib/../SS/libgas.so.1

#4 0x00007f17d7742b8a in CsAlarmDomainSrvc::getAlarmListWithAttrsNoFiltering(CsCAttribute::CsCValue::_VISanon_seq_0_CsCAttribute__CsCValue const&, CsAttrReadReqVPList const*, CsSecurityIf const&)

Windows SS stack output:

00007ffa`2f6c49ff : 0000007f`00000008 0000007f`85b7fa90 00007ffa`1cc26d88 0000007f`00011f4e : libgas!CsGlobalAlarmFilterParser::comparison_operation+0x29e [d:\spectrum\10.02.01\cm\windows\10.02.01.00.98\gas.a\managers\src\csgalarmfp.cc @ 507]

00007ffa`2f6c4c00 : 0000007f`85b7fa90 00000080`3453bf50 00007ffa`1cc26d88 00000080`3453bed8 : libssorbutil!CsAttrFilterParser::operation+0x9f [d:\spectrum\10.02.01\cm\windows\10.02.01.00.98\ssorb.a\util\src\csattrfltp.cc @ 962]

00007ffa`1cbfa08d : 0000007f`85b7fa90 00007ffa`1cc26d88 0000007f`a5a76101 0000007f`85b7fa90 : libssorbutil!CsAttrFilterParser::parse+0x4c [d:\spectrum\10.02.01\cm\windows\10.02.01.00.98\ssorb.a\util\src\csattrfltp.cc @ 676]

00007ffa`1cbf6ef4 : 0000007f`85b7fa90 0000007f`85b7fa00 00000000`00000000 00007ffa`00000001 : libgas!CsGlobalAlarmFilterParser::test+0x39 [d:\spectrum\10.02.01\cm\windows\10.02.01.00.98\gas.a\managers\src\csgalarmfp.cc @ 333]

00007ffa`1cbf1448 : 00000080`1526af50 00000080`010c2a7e 00000080`3453bf50 0000007f`85b7fbb0 : libgas!CsGlobalAlarmClientHandler::get_alarms+0x1d0 [d:\spectrum\10.02.01\cm\windows\10.02.01.00.98\gas.a\managers\src\csgaclienth.cc @ 562]

00007ffa`1cbec538 : 00000080`1526fc40 0000007f`85b7fbf0 00000000`00000000 00000080`022cfc48 : libgas!CsAlarmDomainSrvc::getAlarmListByAttrFilter+0xa8 [d:\spectrum\10.02.01\cm\windows\10.02.01.00.98\gas.a\corba\src\csalrmdsrvc.cc @ 541]

 

If you review the LocalHostAccessLog files for OneClick and search for "REST" you will see <user> running many REST alarm queries:

 

Line 586: - <user> [03/Oct/2019:00:20:50 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 3110 67718
Line 644: - <user> [03/Oct/2019:00:22:55 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 13464 65167
Line 712: - <user> [03/Oct/2019:00:25:27 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 114 91603
Line 771: - <user> [03/Oct/2019:00:27:55 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 114 88756
Line 835: - <user> [03/Oct/2019:00:30:35 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 114 99280
Line 888: - <user> [03/Oct/2019:00:32:55 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 114 80193
Line 941: - <user> [03/Oct/2019:00:35:25 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 28369 89942
Line 1007: - <user> [03/Oct/2019:00:38:03 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 20194 98286
Line 1046: - <user> [03/Oct/2019:00:39:48 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 3105 44722
Line 1100: - <user> [03/Oct/2019:00:41:52 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 6084 63906


Note the very long response times in the final field of each entry.
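To pick out the slow REST calls from a large access log, a short script can help. The following is a minimal sketch, assuming the entries have the same layout as the excerpt above and that the final field is the response time (its unit depends on how the access log pattern is configured). The function name `slow_rest_requests` and the `threshold` parameter are illustrative and not part of Spectrum.

```python
import re

# Matches the tail of an access-log entry laid out like the excerpt above:
# [timestamp] METHOD path protocol status bytes elapsed
# Any prefix (such as "Line 586: - <user>") is ignored because search() is used.
ENTRY = re.compile(
    r'\[(?P<ts>[^\]]+)\]\s+"?(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+[^\s"]+"?\s+'
    r'(?P<status>\d{3})\s+(?P<size>\S+)\s+(?P<elapsed>\d+)\s*$'
)


def slow_rest_requests(lines, threshold):
    """Return (timestamp, method, path, elapsed) for REST calls at or above threshold.

    threshold is in the same unit as the last log field (an assumption: the
    unit is whatever the access log pattern writes).
    """
    hits = []
    for line in lines:
        m = ENTRY.search(line)
        # Skip lines that do not parse and requests outside the REST interface.
        if m is None or "/restful/" not in m["path"]:
            continue
        elapsed = int(m["elapsed"])
        if elapsed >= threshold:
            hits.append((m["ts"], m["method"], m["path"], elapsed))
    return hits


if __name__ == "__main__":
    # Two entries copied from the excerpt above.
    sample = [
        "Line 586: - <user> [03/Oct/2019:00:20:50 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 3110 67718",
        "Line 835: - <user> [03/Oct/2019:00:30:35 -0500] POST /spectrum/restful/alarms HTTP/1.1 200 114 99280",
    ]
    for ts, method, path, elapsed in slow_rest_requests(sample, threshold=90000):
        print(ts, method, path, elapsed)
```

In practice, the lines would come from the LocalHostAccessLog file rather than a hard-coded list, and the threshold would be chosen to match what is considered slow in the environment.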

Environment

Release : 23.x, 24.x

Component : Spectrum Core / SpectroSERVER

Cause

Moogsoft is querying for too much data too quickly.

Resolution

The Moogsoft integration floods the servers with POST alarm queries, which causes processing issues. To resolve this, do one of the following:


1.  Disable Moogsoft from accessing Spectrum Tomcat

2.  Narrow down or limit the amount of data Moogsoft queries. By default, Moogsoft queries Spectrum for all alarm attributes.  This is a very expensive call because some alarm attributes are calculated.  Limit the REST query that Moogsoft uses to only the attributes Moogsoft needs.

3.  Reduce the frequency at which Moogsoft queries Spectrum for data

4.  Use a POST Alarm subscription to update Moogsoft for alarm create/update/destroy only:

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/spectrum/24-3/programming/web-services-api-reference/how-to-use-the-ca-spectrum-web-services-api/restful-resources-nouns/subscription.html#concept.dita_dc917a7753561dae07893f6a680d13130277b9ea_POSTSubscription
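
For option 2, the idea is that the POST body sent to /spectrum/restful/alarms names only the attributes that are needed. The following is a rough sketch of building such a body. The element names (`rs:alarm-request`, `rs:requested-attribute`), the `throttlesize` attribute, and the namespace URI are recalled from the Spectrum Web Services API documentation and should be verified against the reference for your release. The function name and the attribute IDs in the usage example are placeholders only.

```python
import xml.etree.ElementTree as ET

# Assumption: request namespace as recalled from the Spectrum REST
# documentation. Verify it against the API reference before use.
RS_NS = "http://www.ca.com/spectrum/restful/schema/request"


def build_alarm_request(attribute_ids, throttle_size=100):
    """Build an alarm request body that asks only for the listed attribute IDs.

    attribute_ids: hex attribute ID strings chosen by the integrator.
    throttle_size: value for the throttlesize attribute (assumed name).
    """
    ET.register_namespace("rs", RS_NS)
    root = ET.Element(
        f"{{{RS_NS}}}alarm-request", {"throttlesize": str(throttle_size)}
    )
    # One requested-attribute element per attribute, instead of asking for
    # every alarm attribute.
    for attr_id in attribute_ids:
        ET.SubElement(root, f"{{{RS_NS}}}requested-attribute", {"id": attr_id})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder IDs for illustration; substitute the attributes Moogsoft needs.
    print(build_alarm_request(["0x11f53", "0x11f56"], throttle_size=50))
```

The shorter the attribute list, the less work each query asks of the SpectroSERVER, which is the point of option 2. How the Moogsoft side is configured to send a custom body is outside the scope of this sketch.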