Spectrum is showing periodic performance issues. Some of the symptoms:
- "CONTACT LOST TO SECONDARY SPECTROSERVER" alarms that set and clear almost at the same time
- Model activation taking longer than normal
- AlarmNotifier disconnecting abnormally, as seen in the $SPECROOT/SS/VNM.OUT file. Example below:
Jun 24 02:37:50 ERROR TRACE at CsConnect.cc(524): Abnormally disconnected client - application "ServiceNOW_Notifier" sub app. "P:A SPECTRUM Alarm Notification Manager (SANM) application:CA" version "10.4.2.002"
- Mutex timeout messages, as seen in the $SPECROOT/SS/VNM.OUT file. Example below:
Jun 23 22:52:09 WARNING at CsHPSERequestSender.cc(1707): timeout mutex lock took 1195ms!
- A dmp file is created in the $SPECROOT/SS/support directory
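To check how often the two VNM.OUT symptoms occur, the log can be scanned for them. The following is a minimal sketch, not a supported tool: the regular expressions are modelled only on the two sample lines shown above, and the function name `scan_vnm_out` is hypothetical. Other Spectrum versions may word these messages differently.

```python
import re

# Patterns modelled on the two sample VNM.OUT lines shown above (assumption:
# the wording is the same in the log being scanned).
MUTEX_RE = re.compile(r"timeout mutex lock took (\d+)ms")
DISCONNECT_RE = re.compile(r'Abnormally disconnected client - application "([^"]+)"')


def scan_vnm_out(lines):
    """Return (mutex_waits_ms, disconnected_apps) found in the given lines."""
    mutex_waits = []
    disconnected = []
    for line in lines:
        mutex = MUTEX_RE.search(line)
        if mutex:
            mutex_waits.append(int(mutex.group(1)))
            continue
        client = DISCONNECT_RE.search(line)
        if client:
            disconnected.append(client.group(1))
    return mutex_waits, disconnected


# The two example lines from this article, used as sample input.
sample = [
    'Jun 24 02:37:50 ERROR TRACE at CsConnect.cc(524): Abnormally disconnected client - application "ServiceNOW_Notifier" sub app. "P:A SPECTRUM Alarm Notification Manager (SANM) application:CA" version "10.4.2.002"',
    'Jun 23 22:52:09 WARNING at CsHPSERequestSender.cc(1707): timeout mutex lock took 1195ms!',
]
waits, apps = scan_vnm_out(sample)
print(waits, apps)
```

To scan a real log, pass an open file handle for $SPECROOT/SS/VNM.OUT instead of the `sample` list.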
Release : Any
Component : Spectrum Core / SpectroSERVER
For this particular instance, an analysis of the dmp file showed the following in the stack trace:
#24 0x00007f4a900d58f8 in GlobalCollectionIH::run_search(CsModelHandle const&, GlobalCollectionIH::SearchInitiator_e) () from /opt/SPECTRUM/lib/../SS/libmdlsvint.so.1
This indicates that a Global Collection (GC) search is causing the SpectroSERVER performance issue.
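If the stack trace has been saved as plain text, the GC-related frames can be picked out with a short script. This is only a sketch under that assumption: the helper name `gc_frames` is hypothetical, and the second sample frame is invented purely to show a non-matching line.

```python
import re


def gc_frames(trace_lines, marker="GlobalCollectionIH::"):
    """Return the frame numbers of stack frames that mention the marker."""
    hits = []
    for line in trace_lines:
        frame = re.match(r"#(\d+)\s", line.strip())
        if frame and marker in line:
            hits.append(int(frame.group(1)))
    return hits


trace = [
    # Frame taken from the stack trace quoted in this article.
    "#24 0x00007f4a900d58f8 in GlobalCollectionIH::run_search(CsModelHandle const&, GlobalCollectionIH::SearchInitiator_e) () from /opt/SPECTRUM/lib/../SS/libmdlsvint.so.1",
    # Hypothetical frame, included only as a non-matching example.
    "#25 0x0000000000000000 in some_other_function () from /path/to/some_library.so",
]
frames = gc_frames(trace)
print(frames)
```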
Some known reasons why a GC can cause a SpectroSERVER performance issue:
- A large number of GCs
- Real-Time update GCs
- Multiple GCs running their searches at approximately the same time
- GCs with large, complex search criteria
- GCs using external attributes
Some recommendations:
- If possible, cut down on the number of GCs configured. Only configure GCs that are required
- Do not use Real-Time update GCs unless absolutely necessary
- Do not use external attributes, where Spectrum has to perform an SNMP get out to the models, unless absolutely necessary
- Use as few AND and OR operators as possible in the search criteria
- Review the GC search criteria to help make it more efficient
- Configure the update only as often as needed
- Use the "Schedule Update" option instead of the "Run search to update Global Collection membership every xx hour(s)" option.
NOTE: It is not well known that when you create a GC and set it to run the search every xx hours (24 hours, for example), the search runs when you click the OK button. The clock starts at that moment, and the search then repeats every 24 hours from the time you clicked OK.
So if you create or update multiple GCs around the same time, their updates will also run at approximately the same time.
To avoid this, schedule the GCs to run at specific, different times.
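The timing effect described in the note can be illustrated with a little arithmetic. The sketch below is not Spectrum code: it models the interval behavior as "next run = time OK was clicked + interval", and the creation times, start time and 30-minute gap are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def next_interval_run(ok_clicked_at, interval_hours):
    """Interval-based update: the clock starts when OK is clicked."""
    return ok_clicked_at + timedelta(hours=interval_hours)


def staggered_times(start, count, gap_minutes):
    """Explicit schedule: spread the GC searches gap_minutes apart."""
    return [start + timedelta(minutes=gap_minutes * i) for i in range(count)]


def min_gap_minutes(times):
    """Smallest gap, in minutes, between any two consecutive run times."""
    ordered = sorted(times)
    return min((b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:]))


# Three hypothetical GCs created within five minutes, each set to 24 hours.
created = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 2),
    datetime(2024, 1, 1, 9, 5),
]
interval_runs = [next_interval_run(t, 24) for t in created]

# The same three GCs scheduled explicitly, 30 minutes apart from 01:00.
scheduled_runs = staggered_times(datetime(2024, 1, 2, 1, 0), 3, 30)

print(min_gap_minutes(interval_runs))   # searches only minutes apart
print(min_gap_minutes(scheduled_runs))  # searches spread out
```

With the interval option the three searches land within five minutes of each other every day, while the explicit schedule keeps them 30 minutes apart.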