Primary SpectroSERVER hanging with high CPU

Article ID: 8197

Updated On:

Products

CA Spectrum DX NetOps

Issue/Introduction

The Primary SpectroSERVER hangs and shows high CPU usage. Tomcat disconnects and does not fail over to the Secondary SpectroSERVER.

Environment

Release: Any
Component: SpectroSERVER

Cause

If the Primary SpectroSERVER is busy processing trap-related events, CPU usage can rise to the point where the SpectroSERVER stops responding to client requests. OneClick then disconnects and is unable to fail over to the Secondary SpectroSERVER.

Resolution

NOTE: Starting with DX NetOps Spectrum 21.2.4, the default MySQL root password is "MySqlR00t". For DX NetOps Spectrum versions prior to 21.2.4, the default root password is "root". In the MySQL command below, replace <PASSWD> with the root password for your DX NetOps Spectrum version.

Check the Archive Manager DDM database for models that are logging a high number of events by doing the following:

1. Log in to the SpectroSERVER system as the user that owns the Spectrum installation.

2. If on Windows, start a bash shell by running "bash -login".

3. Change to the $SPECROOT/mysql/bin directory and enter the following command to list the 50 event type and model combinations that logged the most events. Set the start and end dates to a 24-hour period in your environment during which the issue occurred:

./mysql --defaults-file=../my-spectrum.cnf -uroot -p<PASSWD> ddmdb -e "select hex( type ), hex( e.model_h ), m.model_name, count( * ) as cnt from event e, model m where e.model_h = m.model_h and utime > UNIX_TIMESTAMP('2017-09-07 00:00:00') and utime < UNIX_TIMESTAMP('2017-09-08 00:00:00') group by type, e.model_h order by cnt desc limit 50"

The output looks similar to the following. The first column is the event ID (in hexadecimal), the second is the model handle, the third is the model name, and the fourth is the number of events logged in the specified time period:

+-------------+------------------+----------------+------+
| hex( type ) | hex( e.model_h ) | model_name     | cnt  |
+-------------+------------------+----------------+------+
| 10F91       | 200000E          | SSPerformance  | 1440 |
| 4820002     | 200000E          | SSPerformance  | 1440 |
| 1022F       | 200000E          | SSPerformance  | 1440 |
| 1001D       | 2000278          | <Device Name1> |   74 |
| 1001D       | 200006E          | <Device Name2> |   29 |
| 10219       | 200006B          | <Device Name3> |   18 |
| 1021A       | 200006B          | <Device Name4> |   18 |
| 1001D       | 2000660          | <Device Name5> |   12 |
+-------------+------------------+----------------+------+

4. Launch the Event Configuration editor, filter for the event ID, and display the "Trap Event" column in the Navigation panel to see whether the top event IDs are trap events.



If the high number of events is caused by traps from a few devices, determine why those devices are sending this volume of traps and address the issue at the device(s).
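
To judge whether a few devices account for most of the volume, it can help to rerun the step 3 query for different 24-hour windows and to total the counts per model handle. The following is a minimal bash sketch of both, not part of Spectrum: the function names build_event_query and sum_events_by_model are hypothetical, the SQL is copied from step 3 with only the dates and limit made into parameters, and the values are inserted into the query without escaping. It assumes that mysql switches to tab-separated output when its output is piped, so the totals helper accepts either that form or the bordered table shown in step 3. Model names containing a "|" character are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are illustrative and are not shipped with Spectrum.

# build_event_query START END [LIMIT]
# Prints the step 3 SQL for the window START..END ('YYYY-MM-DD HH:MM:SS').
# LIMIT defaults to 50, as in step 3. Arguments are not escaped.
build_event_query() {
    local start="$1" end="$2" limit="${3:-50}"
    printf '%s' "select hex( type ), hex( e.model_h ), m.model_name, count( * ) as cnt from event e, model m where e.model_h = m.model_h and utime > UNIX_TIMESTAMP('${start}') and utime < UNIX_TIMESTAMP('${end}') group by type, e.model_h order by cnt desc limit ${limit}"
}

# sum_events_by_model
# Reads the query output on stdin (bordered table or tab-separated rows) and
# prints "<total><TAB><model handle><TAB><model name>", highest total first.
sum_events_by_model() {
    awk '
        /^\+/ || /^[ \t]*$/ { next }               # skip border and blank lines
        {
            if ($0 ~ /^\|/) {                      # bordered table row
                split($0, f, "|")
                handle = f[3]; name = f[4]; cnt = f[5]
            } else {                               # tab-separated row
                split($0, f, "\t")
                handle = f[2]; name = f[3]; cnt = f[4]
            }
            gsub(/^[ \t]+|[ \t]+$/, "", handle)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", cnt)
            if (cnt !~ /^[0-9]+$/) next            # skip the header row
            total[handle] += cnt
            if (!(handle in label)) label[handle] = name   # keep first name seen
        }
        END { for (h in total) printf "%d\t%s\t%s\n", total[h], h, label[h] }
    ' | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Example (run from $SPECROOT/mysql/bin; replace <PASSWD> as described in the NOTE):
#   ./mysql --defaults-file=../my-spectrum.cnf -uroot -p<PASSWD> ddmdb \
#       -e "$(build_event_query '2017-09-07 00:00:00' '2017-09-08 00:00:00')" | sum_events_by_model
```

Applied to the sample output in step 3, the SSPerformance model (handle 200000E) would total 4320 events, well ahead of any single device. Totals only cover the rows returned by the query, so a larger LIMIT may be needed when many models are involved.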

Attachments

1558693069443000008197_sktwi1f5rjvs16h6r.png