How Archive Manager works in a Fault-Tolerant environment

Article ID: 236249

Products

CA Spectrum DX NetOps

Issue/Introduction

How does the Archive Manager work in a Fault-Tolerant environment?

 

Resolution

Primary and Secondary SpectroSERVERs are Operational

You can run the Archive Manager on the secondary SpectroSERVER host in a fault-tolerant SpectroSERVER environment. This secondary Archive Manager provides visibility into events in OneClick only when the primary Archive Manager is down.

The primary or secondary SpectroSERVER stores events locally in the following two scenarios:

  1. When the primary Archive Manager is down and the primary SpectroSERVER is running. In this case, the primary SpectroSERVER stores events locally (in the $SPECROOT/SS/SSEvents.db file) as they are created until the primary Archive Manager is up.
  2. When the primary SpectroSERVER process itself is down. In this case, the secondary SpectroSERVER stores events locally (in the $SPECROOT/SS/SSEvents.db file) as they are created until the primary Archive Manager is up.

When you start the secondary Archive Manager, it acts as a client to the primary SpectroSERVER to receive and log events as they are created. This behavior does not affect the normal connection between the primary SpectroSERVER and the primary Archive Manager. As soon as the primary Archive Manager goes down, OneClick fails over to the secondary Archive Manager to provide event data.

 

 

Primary SpectroSERVER process has failed (but the primary Archive Manager is up and running)

The primary SpectroSERVER stops. The secondary SpectroSERVER then forwards events and statistical information to the primary Archive Manager, which is still running on the server that hosts the primary SpectroSERVER. When the primary SpectroSERVER restarts, no event or statistical data has been lost.

 

Primary SpectroSERVER host has failed (both primary SpectroSERVER and primary Archive Manager are down)

The computer where the primary SpectroSERVER and the primary Archive Manager are running stops operating completely. The secondary SpectroSERVER then caches event and statistical data in its database until the primary SpectroSERVER computer comes back online. If a secondary Archive Manager is running, historical and real-time information is available in OneClick, but the information is still cached for transfer to the primary Archive Manager.
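
The three scenarios described so far can be summarized as a small decision function. The sketch below is only a model of the behavior described in this article, not Spectrum code: the `EventFlow` class, the `event_flow` function, and their field names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventFlow:
    """Illustrative summary of where events go (hypothetical model)."""
    active_server: str              # SpectroSERVER currently generating events
    cached_locally: bool            # True when events accumulate in SSEvents.db
    oneclick_source: Optional[str]  # Archive Manager that OneClick reads from


def event_flow(primary_ss_up: bool, primary_am_up: bool,
               secondary_am_running: bool) -> EventFlow:
    # The secondary SpectroSERVER takes over when the primary process is down.
    active = "primary SpectroSERVER" if primary_ss_up else "secondary SpectroSERVER"

    if primary_am_up:
        # Events are delivered straight to the primary Archive Manager,
        # so nothing needs to be cached for later transfer.
        return EventFlow(active, False, "primary Archive Manager")

    # Primary Archive Manager is down: the active SpectroSERVER caches events
    # until it returns. OneClick can still show events if a secondary
    # Archive Manager is running.
    fallback = "secondary Archive Manager" if secondary_am_running else None
    return EventFlow(active, True, fallback)


if __name__ == "__main__":
    scenarios = {
        "everything up": (True, True, True),
        "primary SpectroSERVER process failed": (False, True, True),
        "primary host failed": (False, False, True),
    }
    for name, state in scenarios.items():
        print(name, "->", event_flow(*state))
```

In this model the only input that decides whether events are cached is the state of the primary Archive Manager, which matches the two local-storage scenarios listed earlier.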

Restart both the primary Archive Manager and the primary SpectroSERVER if their server goes down, or if the primary SpectroSERVER stops operating.

Note: It is no longer necessary to start the Archive Manager before the SpectroSERVER. The cached events from the secondary SpectroSERVER can be transferred at any time, even after the primary SpectroSERVER has started logging new events.

Follow these steps:

  1. Start the SPECTRUM Control Panel on the primary SpectroSERVER host.
  2. To start the SpectroSERVER, click Start SpectroSERVER on the SPECTRUM Control Panel.
  3. When the primary Archive Manager is again operational, the secondary SpectroSERVER connects and transfers its cached event data to the primary Archive Manager.

When the primary SpectroSERVER process itself goes down, the secondary SpectroSERVER stores events locally but also forwards events to the secondary Archive Manager. When the primary Archive Manager comes up, the secondary SpectroSERVER transfers all the locally stored events from the SSEvents.db file to it. The number of locally stored events is shown on the VNM model in the SpectroSERVER Control - Event Log Information window.

The secondary Archive Manager also stores events in the MySQL database on the secondary host. These events are not transferred to the primary SpectroSERVER (SS).

If the number of Locally Stored Events reaches the "Max Log Size" value, you will start to see Events Purged. Purged events are removed from the SSEvents.db file and will not be synced to the primary MySQL database.
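
The relationship between Locally Stored Events, "Max Log Size", and Events Purged can be pictured as a bounded buffer. The following is a minimal sketch, not the actual SSEvents.db implementation: the `LocalEventStore` class is hypothetical, and the assumption that the oldest events are purged first is made only for illustration, since this article does not say which events are removed.

```python
from collections import deque


class LocalEventStore:
    """Toy model of a size-limited local event cache (hypothetical)."""

    def __init__(self, max_log_size: int) -> None:
        self.max_log_size = max_log_size
        self.events = deque()
        self.purged = 0  # events dropped because the cache was full

    def store(self, event) -> None:
        # Assumption: when the cache is full, the oldest event is purged
        # to make room for the new one.
        if len(self.events) >= self.max_log_size:
            self.events.popleft()
            self.purged += 1
        self.events.append(event)

    def drain(self) -> list:
        # Models the transfer to the primary Archive Manager: only events
        # still in the cache can be synced; purged events cannot.
        remaining = list(self.events)
        self.events.clear()
        return remaining


if __name__ == "__main__":
    store = LocalEventStore(max_log_size=3)
    for event_id in range(5):
        store.store(event_id)
    print("purged:", store.purged)
    print("transferred:", store.drain())
```

The point of the model is that once the cache is full, every new event costs one cached event, so a long outage with a small "Max Log Size" leaves a gap in the primary database.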

If those events are needed, for example if the Events Purged count is in the thousands or millions, you can use mysqldump on the secondary host and import the dump into the MySQL database on the primary host before starting the SpectroSERVER and Archive Manager on the primary.
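
As a rough illustration of the dump-and-import approach, the sketch below only builds the two command lines and does not run them. The database name, user name, and file path are placeholders that this article does not specify, and the helper functions `dump_command` and `import_command` are hypothetical. Note that mysqldump by default writes DROP TABLE and CREATE TABLE statements, so a plain import replaces tables of the same name. The sketch passes `--no-create-info` and `--insert-ignore` so that rows are appended instead. Whether appending is appropriate for the Archive Manager schema is an assumption, so test on a copy of the database first.

```python
import shlex
from typing import List


def dump_command(database: str, user: str, dump_file: str) -> List[str]:
    """Build a mysqldump command line to run on the secondary host."""
    return [
        "mysqldump",
        f"--user={user}",
        "--password",          # no value given, so mysqldump prompts for it
        "--no-create-info",    # write rows only, no DROP/CREATE TABLE
        "--insert-ignore",     # emit INSERT IGNORE so duplicate rows are skipped
        f"--result-file={dump_file}",
        database,
    ]


def import_command(database: str, user: str) -> List[str]:
    """Build a mysql command line to run on the primary host.

    The dump file is expected on standard input.
    """
    return ["mysql", f"--user={user}", "--password", database]


if __name__ == "__main__":
    # Placeholder values: substitute the real database name, user, and path.
    db, user, path = "ARCHIVE_DB_NAME", "DB_USER", "/tmp/secondary_events.sql"
    print(shlex.join(dump_command(db, user, path)))
    print(shlex.join(import_command(db, user)), "<", shlex.quote(path))
```

To execute the commands from Python, pass each list to `subprocess.run(..., check=True)`, opening the dump file and supplying it as `stdin` for the import step.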

 

Additional Information

Please review the following KB article about sizing the temporary local Event Storage on the SSdb database:

https://knowledge.broadcom.com/external/article?articleId=20967