Spectrum Processes Not Stopping in Order when machine is rebooted (Archmgr may get corrupted)

Article ID: 186708

Updated On:

Products

CA Spectrum

Issue/Introduction

The SpectroSERVER host had a scheduled (planned) reboot, and the ArchMgr reported errors during shutdown.

The logs indicate that the ArchMgr errors are related to mysqld shutting down too quickly: ArchMgr was still trying to write to MySQL when mysqld shut down.

When the system is rebooted, the processes do not shut down in an ordered fashion. They all appear to shut down at the same time. This causes problems for the ArchMgr because MySQL is shut down before the ArchMgr finishes.

 

ARCHMGR.OUT
————————————
Mar 13 00:04:27 : ArchMgr has received shut down signal - scheduling shut down
Mar 13 00:04:27 : Closing all client connections...
Mar 13 00:04:27 : Bypassing CORBA shutdown...
Mar 13 00:04:27 : ArchMgr has received shut down signal - scheduling shut down

Mar 13 00:04:27 : Closing database
Mar 13 00:04:27 ERROR TRACE at ModelArchDBImp.cc(4970): doSqlQuery/mysql_query: Failure executing query:
UPDATE statistic SET end_utime = 1584029067 WHERE end_utime IS NULL - Server shutdown in progress

 

MYSQL.OUT
————————————
2020-03-12T16:04:27.277988Z 0 [Note] Giving 2 client threads a chance to die gracefully
2020-03-12T16:04:27.278022Z 0 [Note] Shutting down slave threads
2020-03-12T16:04:29.278161Z 0 [Note] Forcefully disconnecting 0 remaining clients
2020-03-12T16:04:29.278204Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2020-03-12T16:04:29.278366Z 0 [Note] Binlog end
2020-03-12T16:04:29.279324Z 0 [Note] Shutting down plugin 'ngram'


VNM.OUT
——————————
Mar 13 00:04:27 : SpectroSERVER has received shut down signal - scheduling shut down
Mar 13 00:04:27 : /opt/spectrum/SS/SpectroSERVER is shutting down...
Mar 13 00:04:27 : SpectroSERVER has received shut down signal - scheduling shut down

Mar 13 00:04:27 : Closing all client connections...
Mar 13 00:04:27 : Bypassing CORBA shutdown...
Mar 13 00:04:27 : Stopping /opt/spectrum/SS/SpectroSERVER activity...
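
The ARCHMGR.OUT and MYSQL.OUT excerpts above carry timestamps in different forms, which makes the overlap harder to see. The following is a minimal Python sketch, assuming that the end_utime value in the failed UPDATE is a Unix epoch time in seconds. It converts that value to UTC and compares it with the first MySQL shutdown line. The variable names are illustrative only.

```python
from datetime import datetime, timezone

# end_utime value copied from the failed UPDATE in ARCHMGR.OUT
# (assumed to be a Unix epoch time in seconds).
end_utime = 1584029067

# Timestamp of the first shutdown line in MYSQL.OUT; the trailing Z means UTC.
mysql_line_ts = "2020-03-12T16:04:27.277988Z"

archmgr_write = datetime.fromtimestamp(end_utime, tz=timezone.utc)
mysql_shutdown = datetime.strptime(
    mysql_line_ts, "%Y-%m-%dT%H:%M:%S.%fZ"
).replace(tzinfo=timezone.utc)

# Difference between the MySQL shutdown line and the ArchMgr write attempt.
gap_seconds = (mysql_shutdown - archmgr_write).total_seconds()

print("ArchMgr write (UTC): ", archmgr_write.isoformat())
print("MySQL shutdown (UTC):", mysql_shutdown.isoformat())
print("Gap in seconds:      ", gap_seconds)
```

Under that assumption, both events fall within the same second, which is consistent with MySQL already shutting down while ArchMgr was still closing its database.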

Environment

Release : 10.4.1

Component : Spectrum Core / SpectroSERVER

Cause


It appears that the OS sends shutdown signals to all processes when going down for a reboot. As a result, the processes
  all start shutting down at the same time, outside of processd's control.

Resolution


In testing, this appears to happen when the `reboot` command is used, whereas with `shutdown -r` the rcx.d scripts (K89processd)
  triggered correctly and processd was able to stop the processes in the correct order.

Options:
  - If using the `reboot` command
      Call stopSS.pl prior to the reboot,
        then call `reboot`.

  - Use `shutdown -r`
      In testing, `shutdown -r` correctly triggered the rcx.d scripts (K89processd) to stop
        all processes. Processd then stops the processes in the correct order (ArchMgr, SS, MySQL, etc.)


Note: If possible, it is recommended to stop the SpectroSERVER and ArchMgr prior to a reboot.
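
The first option above (stop Spectrum, then reboot) could be wrapped in a small script so that the reboot is only issued after the stop script has returned successfully. The following Python sketch is illustrative only: the stopSS.pl path is a hypothetical placeholder that must be adjusted to the local installation, and the helper names build_plan and run_plan are not part of Spectrum.

```python
import subprocess
from typing import Callable, List

# Hypothetical location of stopSS.pl; adjust to the local installation.
STOP_SCRIPT = "/opt/spectrum/bin/stopSS.pl"


def build_plan(stop_script: str = STOP_SCRIPT) -> List[List[str]]:
    """Return the commands in the order they should run."""
    return [[stop_script], ["reboot"]]


def run_plan(plan: List[List[str]],
             runner: Callable = subprocess.run) -> List[List[str]]:
    """Run each command in turn and stop at the first non-zero exit code.

    Returns the list of commands that were actually attempted, so a failed
    stop script never leads to the reboot being issued.
    """
    attempted = []
    for cmd in plan:
        result = runner(cmd)
        attempted.append(list(cmd))
        if result.returncode != 0:
            break
    return attempted


# Example call (requires sufficient privileges, and really reboots the host):
# run_plan(build_plan())
```

The runner parameter exists so the sequence can be exercised with a stand-in function before it is used against a live system.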

   

Additional Information

Example processd_log.bak when using `reboot`, with debug enabled prior to the reboot (no stop requests are logged):

Mar 17 11:32:56 DEBUG START




Example processd_log.bak when using `shutdown -r`:

Mar 17 11:54:24 DEBUG START
Mar 17 11:56:01 Requesting stop of all tickets with a priority of 30
Mar 17 11:56:01 Stopping - SPECTRUM Archive Manager,priority 30
Mar 17 11:56:01 Waiting for priority 30 tickets to stop.
Mar 17 11:56:03 TICKET IS DEAD, pid = 7566
Mar 17 11:56:03 DATE OF THIS REQUEST

~
~

Mar 17 11:56:03 All tickets with a priority of 30 have been stopped.
Mar 17 11:56:03 Requesting stop of all tickets with a priority of 20
Mar 17 11:56:03 Stopping - SpectroSERVER Daemon,priority 20
Mar 17 11:56:03 Stopping - NCM Service,priority 20
Mar 17 11:56:03 Waiting for priority 20 tickets to stop.
Mar 17 11:56:04 TICKET IS DEAD, pid = 7277
Mar 17 11:56:04 DATE OF THIS REQUEST
~
~

Mar 17 11:56:06 All tickets with a priority of 15 have been stopped.
Mar 17 11:56:06 Requesting stop of all tickets with a priority of 10
Mar 17 11:56:06 Stopping - SPECTRUM MYSQL Database Server,priority 10
Mar 17 11:56:06 Stopping - Visibroker Naming Service,priority 10
Mar 17 11:56:06 Stopping - TELNET Relay Daemon,priority 10
Mar 17 11:56:07 TICKET IS DEAD, pid = 7275
Mar 17 11:56:07 DATE OF THIS REQUEST
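
To check whether a given processd_log.bak shows an ordered stop, the "Stopping - ..." lines can be extracted and their priorities compared. The following Python sketch assumes the line format shown in the excerpt above and assumes that an ordered stop means priorities never increase from one stop line to the next, as in that excerpt. The function names are illustrative.

```python
import re

# Matches lines such as: "Stopping - SPECTRUM Archive Manager,priority 30"
STOP_LINE = re.compile(r"Stopping - (?P<name>.+?),\s*priority (?P<priority>\d+)")


def stop_sequence(log_text):
    """Return (process name, priority) pairs in the order they were logged."""
    sequence = []
    for line in log_text.splitlines():
        match = STOP_LINE.search(line)
        if match:
            sequence.append((match.group("name"), int(match.group("priority"))))
    return sequence


def is_ordered(sequence):
    """True if at least one stop was logged and priorities never increase."""
    priorities = [priority for _, priority in sequence]
    return bool(priorities) and all(
        a >= b for a, b in zip(priorities, priorities[1:])
    )


# Stop lines copied from the `shutdown -r` excerpt above.
shutdown_log = """\
Mar 17 11:54:24 DEBUG START
Mar 17 11:56:01 Stopping - SPECTRUM Archive Manager,priority 30
Mar 17 11:56:03 Stopping - SpectroSERVER Daemon,priority 20
Mar 17 11:56:03 Stopping - NCM Service,priority 20
Mar 17 11:56:06 Stopping - SPECTRUM MYSQL Database Server,priority 10
Mar 17 11:56:06 Stopping - Visibroker Naming Service,priority 10
Mar 17 11:56:06 Stopping - TELNET Relay Daemon,priority 10
"""

# The `reboot` excerpt above contains only the DEBUG START line.
reboot_log = "Mar 17 11:32:56 DEBUG START\n"

for name, priority in stop_sequence(shutdown_log):
    print(priority, name)
print("shutdown -r ordered:", is_ordered(stop_sequence(shutdown_log)))
print("reboot ordered:     ", is_ordered(stop_sequence(reboot_log)))
```

With these inputs, the `shutdown -r` log yields an ordered sequence, while the `reboot` log yields no stop lines at all and is reported as not ordered.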