Why did the SpectroSERVER go into a graceful shutdown?

Article ID: 245622

Updated On:

Products

Spectrum Network Observability

Issue/Introduction

The last lines of the VNM.OUT file show the following:

Jul 06 00:54:54 ERROR TRACE at CsIHOverCapacity.cc(432): SpectroSERVER available memory is below the threshold 1e+08going for a graceful shutdown!!!
Jul 06 00:54:54 ERROR TRACE at CsIHOverCapacity.cc(434):  Available memory :9.05052e+07
 Available Swap Memory :2.26099e+06

Jul 06 00:54:54 : SpectroSERVER has received shut down signal - scheduling shut down
Jul 06 00:54:54 : /opt/SPECTRUM/SS/SpectroSERVER is shutting down...
Jul 06 00:54:57 : Closing all client connections...
Jul 06 00:54:57 : Bypassing CORBA shutdown...
Jul 06 00:54:57 : Stopping /opt/SPECTRUM/SS/SpectroSERVER activity...

-----  NOTE  --------------------------------------------------------------
CA Technologies, A Broadcom Company recommends that the SpectroSERVER be
allowed to complete the shutdown process.
Database corruption may result if the SpectroSERVER is prematurely stopped.
-------------------------------------------------------------  NOTE  ------

Jul 06 00:54:57 :       waiting for model activates to complete...
Jul 06 00:54:57 :       waiting for model destroys to complete...
Jul 06 00:54:57 : Closing /opt/SPECTRUM/SS/SpectroSERVER database...
Jul 06 00:55:03 : /opt/SPECTRUM/SS/SpectroSERVER has successfully shut down.
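
When many VNM.OUT files need to be checked, the memory figures in the trace lines above can be extracted with a short script. The following is a minimal Python sketch, not a Broadcom-supplied tool. The function name parse_memory_trace is hypothetical, the patterns are based only on the three trace lines shown in this article, and the unit of the values is not stated in the log.

```python
import re

# Sample lines copied from the VNM.OUT excerpt in this article.
SAMPLE = """\
Jul 06 00:54:54 ERROR TRACE at CsIHOverCapacity.cc(432): SpectroSERVER available memory is below the threshold 1e+08going for a graceful shutdown!!!
Jul 06 00:54:54 ERROR TRACE at CsIHOverCapacity.cc(434):  Available memory :9.05052e+07
 Available Swap Memory :2.26099e+06
"""

# Matches plain or scientific-notation numbers such as 1e+08 or 9.05052e+07.
NUM = r"([0-9.]+(?:e[+-]?[0-9]+)?)"


def parse_memory_trace(text):
    """Return threshold, available memory and available swap as floats (None if absent)."""
    patterns = {
        "threshold": r"below the threshold\s*" + NUM,
        "available": r"Available memory\s*:\s*" + NUM,
        "swap": r"Available Swap Memory\s*:\s*" + NUM,
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        result[key] = float(match.group(1)) if match else None
    return result


info = parse_memory_trace(SAMPLE)
print(info)
print("below threshold:", info["available"] < info["threshold"])
```

The patterns are case-sensitive on purpose, so the lowercase "available memory" in the first trace line is not mistaken for the "Available memory" value on the second line.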

Environment

Release : Any version

Component : Spectrum Core / SpectroSERVER

Cause

A large event storm (about 44 million events in roughly 8 hours) caused the Archive Manager (ArchMgr) process to grow in memory. When the ArchMgr had almost exhausted the available memory, the SpectroSERVER shut down gracefully to avoid a crash and SSdb corruption.

The following steps show how to query the ddmdb database to identify the event storm:

1. Go to the $SPECROOT/mysql/bin directory of the SpectroSERVER machine:

cd $SPECROOT/mysql/bin

2. Launch the MySQL command prompt:

./mysql --defaults-file=../my-spectrum.cnf -u<USER> -p<PASSWORD> -A ddmdb

3. Redirect the MySQL command prompt output to a TEXT file:

mysql> \T queries1.txt

4. Query for highest event counts by type:

The example below queries for the highest event counts by type within a given time range:

mysql> SELECT hex(type), hex(node_id), count(*) as c from ddmdb.event where utime > UNIX_TIMESTAMP('2022-07-05 12:00:00') and utime < UNIX_TIMESTAMP('2022-07-05 20:00:00') group by type order by c desc;
+-----------+----------------------+----------+
| hex(type) | hex(node_id)         | c        |
+-----------+----------------------+----------+
| 10D31     | 00E282AC10C38EC28D01 | 11300815 |
| 10D53     | 00E282AC10C38EC28D01 | 11300771 |
| 10D91     | 00E282AC10C38EC28D01 | 11300771 |
| 10D52     | 00E282AC10C38EC28D01 | 11300771 |
| FFF004CF  | 00E282AC10C38EC28D01 |    38162 |
| FFF00963  | 00E282AC10C38EC28D01 |    35858 | 

5. Query for devices (model_handle) generating the most events in question:

The example below queries for the devices generating the most 0x10d31 events within a given time range:

mysql> SELECT hex(model_h), count(*) as c from ddmdb.event where utime > UNIX_TIMESTAMP('2022-07-05 12:00:00') and utime < UNIX_TIMESTAMP('2022-07-05 20:00:00') and type=0x10d31 group by hex(model_h) order by c desc;
+--------------+----------+
| hex(model_h) | c        |
+--------------+----------+
| 22EDF4       | 11300771 |
+--------------+----------+
16 rows in set (11.07 sec) 

The example below queries for the devices generating the most 0x10d53 events within a given time range:

mysql> SELECT hex(model_h), count(*) as c from ddmdb.event where utime > UNIX_TIMESTAMP('2022-07-05 12:00:00') and utime < UNIX_TIMESTAMP('2022-07-05 20:00:00') and type=0x10d53 group by hex(model_h) order by c desc;
+--------------+----------+
| hex(model_h) | c        |
+--------------+----------+
| 22EDF3       | 11300771 |
+--------------+----------+
1 row in set (11.06 sec)

The example below queries for the devices generating the most 0x10d91 events within a given time range:

mysql> SELECT hex(model_h), count(*) as c from ddmdb.event where utime > UNIX_TIMESTAMP('2022-07-05 12:00:00') and utime < UNIX_TIMESTAMP('2022-07-05 20:00:00') and type=0x10d91 group by hex(model_h) order by c desc;
+--------------+----------+
| hex(model_h) | c        |
+--------------+----------+
| 22EDF4       | 11300771 |
+--------------+----------+
1 row in set (10.97 sec) 

The example below queries for the devices generating the most 0x10d52 events within a given time range:

mysql> SELECT hex(model_h), count(*) as c from ddmdb.event where utime > UNIX_TIMESTAMP('2022-07-05 12:00:00') and utime < UNIX_TIMESTAMP('2022-07-05 20:00:00') and type=0x10d52 group by hex(model_h) order by c desc;
+--------------+----------+
| hex(model_h) | c        |
+--------------+----------+
| 22EDF3       | 11300771 |
+--------------+----------+
1 row in set (10.99 sec)
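
The four queries in step 5 differ only in the event type, so their text can be generated from a template instead of being typed by hand. The following is a minimal Python sketch under that assumption. The helper name build_model_query is hypothetical, and the script only prints query text to paste at the mysql prompt; it does not connect to the database.

```python
# Template mirrors the step 5 query from this article; only the event type varies.
QUERY_TEMPLATE = (
    "SELECT hex(model_h), count(*) as c from ddmdb.event "
    "where utime > UNIX_TIMESTAMP('{start}') "
    "and utime < UNIX_TIMESTAMP('{end}') "
    "and type=0x{event_type:x} group by hex(model_h) order by c desc;"
)


def build_model_query(event_type, start, end):
    """Return the per-model count query for one event type (given as an int)."""
    return QUERY_TEMPLATE.format(event_type=event_type, start=start, end=end)


# Event types taken from the top rows of the step 4 output.
EVENT_TYPES = [0x10D31, 0x10D53, 0x10D91, 0x10D52]

for event_type in EVENT_TYPES:
    print(build_model_query(event_type, "2022-07-05 12:00:00", "2022-07-05 20:00:00"))
```

The start and end values are inserted into the text without any escaping, so this sketch is suitable only for values you type yourself, not for untrusted input.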

6. Query for event count per day:

mysql> select DATE(FROM_UNIXTIME(utime)), count(*) from event group by DATE(FROM_UNIXTIME(utime));
+----------------------------+----------+
| DATE(FROM_UNIXTIME(utime)) | count(*) |
+----------------------------+----------+
| 2022-04-09                 |   564035 |
| 2022-04-10                 |   542096 |
| 2022-04-11                 |   557417 |
| 2022-04-12                 |   579198 |
| 2022-04-13                 |   590249 |
| 2022-04-14                 |   675363 |
| 2022-04-15                 |   617140 |
| 2022-04-16                 |   537948 |
| 2022-04-17                 |   542971 |
| 2022-04-18                 |   569617 |
| 2022-04-19                 |   585894 |
| 2022-04-20                 |   569477 |
| 2022-04-21                 |   583279 |
| 2022-04-22                 | 36101827 |
| 2022-04-23                 |  5523337 |
| 2022-04-24                 |   784222 |
| 2022-04-25                 |   596210 |
| 2022-04-26                 |   628542 |
| 2022-04-27                 |   697405 |
| 2022-04-28                 |  1015740 |
| 2022-04-29                 |   643581 |
| 2022-04-30                 |   635267 |
| 2022-05-01                 |   604833 |
| 2022-05-02                 |   617395 |
| 2022-05-03                 |   669662 |
| 2022-05-04                 |   658461 |
| 2022-05-05                 |   688003 |
| 2022-05-06                 |   666808 |
| 2022-05-07                 |   668522 |
| 2022-05-08                 |   623713 |
| 2022-05-09                 |   701014 |
| 2022-05-10                 |   696240 |
| 2022-05-11                 |   703791 |
| 2022-05-12                 |   693845 |
| 2022-05-13                 |   691178 |
| 2022-05-14                 |   630184 |
| 2022-05-15                 |   647208 |
| 2022-05-16                 |   888816 |
| 2022-05-17                 |  1081597 |
| 2022-05-18                 |   786319 |
| 2022-05-19                 |   830767 |
| 2022-05-20                 |   706203 |
| 2022-05-21                 |   683670 |
| 2022-05-22                 |   676177 |
| 2022-05-23                 |   712950 |
| 2022-05-24                 |   739191 |
| 2022-05-25                 |   727594 |
| 2022-05-26                 |   769473 |
| 2022-05-27                 |   724161 |
| 2022-05-28                 |   705586 |
| 2022-05-29                 |   809888 |
| 2022-05-30                 |   885478 |
| 2022-05-31                 |   867201 |
| 2022-06-01                 |   750377 |
| 2022-06-02                 |   727702 |
| 2022-06-03                 |   772453 |
| 2022-06-04                 |   706244 |
| 2022-06-05                 |   698816 |
| 2022-06-06                 |   745999 |
| 2022-06-07                 |   793581 |
| 2022-06-08                 |   776789 |
| 2022-06-09                 |   768763 |
| 2022-06-10                 |   753530 |
| 2022-06-11                 |   674413 |
| 2022-06-12                 |   696850 |
| 2022-06-13                 |   789657 |
| 2022-06-14                 |   848898 |
| 2022-06-15                 | 18897020 |
| 2022-06-16                 |   750310 |
| 2022-06-17                 |   723191 |
| 2022-06-18                 |   678373 |
| 2022-06-19                 |   675691 |
| 2022-06-20                 |   718845 |
| 2022-06-21                 |   757247 |
| 2022-06-22                 |   745634 |
| 2022-06-23                 |   779298 |
| 2022-06-24                 |   694516 |
| 2022-06-25                 |   659432 |
| 2022-06-26                 |   629970 |
| 2022-06-27                 |   680704 |
| 2022-06-28                 |   722197 |
| 2022-06-29                 |   722580 |
| 2022-06-30                 |   701408 |
| 2022-07-01                 |   651424 |
| 2022-07-02                 |   617037 |
| 2022-07-03                 |   601301 |
| 2022-07-04                 |   630493 |
| 2022-07-05                 | 45843416 |
| 2022-07-06                 |   570723 |
| 2022-07-07                 |   309737 |
+----------------------------+----------+
90 rows in set (5 min 4.02 sec)

7. End the redirection and close the TEXT file:

mysql> \t

8. Upload the $SPECROOT/mysql/bin/queries1.txt file.
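
Once the per-day counts from step 6 are saved in queries1.txt, the storm days can be flagged with a short script rather than by eye. The following is a minimal Python sketch. The helper name find_spike_days, the factor of 10, and the use of the median as a baseline are illustrative assumptions, not part of the Spectrum tooling. Only a few rows from the step 6 output are included as sample data.

```python
from statistics import median

# A few (date, count) rows copied from the step 6 query output in this article.
DAILY_COUNTS = [
    ("2022-04-21", 583279),
    ("2022-04-22", 36101827),
    ("2022-04-23", 5523337),
    ("2022-04-24", 784222),
    ("2022-06-14", 848898),
    ("2022-06-15", 18897020),
    ("2022-06-16", 750310),
    ("2022-07-04", 630493),
    ("2022-07-05", 45843416),
    ("2022-07-06", 570723),
]


def find_spike_days(rows, factor=10):
    """Return the rows whose count exceeds factor times the median daily count."""
    baseline = median(count for _, count in rows)
    return [(day, count) for day, count in rows if count > factor * baseline]


for day, count in find_spike_days(DAILY_COUNTS):
    print(day, count)
```

The median is used instead of the mean because a few very large days would otherwise pull the baseline up and hide smaller storms. A lower factor also flags days such as 2022-04-23, when the count was elevated but well below the main storms.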

Resolution

The event storm occurred on two models (a WA_Link with its nested WA_Segment) connected to a router. However, these models were connected to only a single port.

A Wide Area Link model represents a wide area connection between two router interfaces and includes:

  • A WA_Link model that appears in the topology view.
  • A WA_Segment model that exists within the WA_Link model and connects the two router interfaces together.

The WA_Link models can only represent point-to-point connections, such as T1 and T3 lines.

Because these models were attached to a single port rather than to two router interfaces, the WA_Link and WA_Segment models were deleted.

Additional Information

Consider creating a watch on the SSPerformance model to monitor either the Archive Manager memory usage or the operating system's available memory.
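
As a complement to a Spectrum watch, available memory can also be checked at the operating system level. The following Python sketch is illustrative only and is not how the SSPerformance watch works. It assumes a Linux host whose /proc/meminfo exposes a MemAvailable field in kB, and it reuses the 1e+08 threshold from the VNM.OUT message on the assumption that the value is in bytes. The function names and the sample text are made up for the example.

```python
import re

# Threshold taken from the VNM.OUT message; the unit (bytes) is an assumption.
THRESHOLD_BYTES = 1e8

# Made-up /proc/meminfo-style sample, used so the sketch runs on any platform.
SAMPLE_MEMINFO = (
    "MemTotal:       16384000 kB\n"
    "MemFree:          120000 kB\n"
    "MemAvailable:      88384 kB\n"
)


def mem_available_bytes(meminfo_text):
    """Extract MemAvailable from /proc/meminfo-style text and convert kB to bytes."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found")
    return int(match.group(1)) * 1024


def is_low_memory(meminfo_text, threshold=THRESHOLD_BYTES):
    """Return True when available memory is below the threshold."""
    return mem_available_bytes(meminfo_text) < threshold


# On a real Linux host, pass open("/proc/meminfo").read() instead of the sample.
print("available bytes:", mem_available_bytes(SAMPLE_MEMINFO))
print("low memory:", is_low_memory(SAMPLE_MEMINFO))
```

For early warning, a check like this would normally use a threshold well above the shutdown value so that there is time to react before the SpectroSERVER shuts down.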