Fault tolerant SpectroSERVERs alarming

Article ID: 191817

Products

CA Spectrum, CA eHealth

Issue/Introduction

What are the options for optimal alarm clears in fault tolerant SpectroSERVERs?

Environment

Release : 10.4.1

Component : Spectrum Core / SpectroSERVER

Resolution

In the .vnmrc file of the secondary SpectroSERVER, we added:
is_secondary = TRUE

In the .vnmrc file of both SpectroSERVERs, we set:
wait_active=yes

Additional Information


Add the following line to the .vnmrc file on the secondary SpectroSERVER to limit the potential for false events or alarms:
is_secondary = TRUE
This setting lets the secondary SpectroSERVER drop events unless DX NetOps Spectrum determines that the secondary SpectroSERVER has taken over as the primary SpectroSERVER.

wait_active
Determines whether the server accepts connections as soon as all models are loaded or waits until all models are active. If set to Yes, a Control Panel message displays a running percentage of models that were activated during SpectroSERVER startup.
This command has the following format:
wait_active=no

When you restart the primary SpectroSERVER, connections are accepted when all models are loaded, but before all models are activated. The models can take some time to activate. Because the secondary SpectroSERVER stops polling when the primary SpectroSERVER is restarted, a gap in your network management coverage can result.
To avoid this situation, edit the .vnmrc file on the primary SpectroSERVER so that the wait_active resource is set to yes. This parameter causes the server to wait until all of the models are activated before accepting any connections. The message area in the DX NetOps Spectrum Control Panel also dynamically displays the percentage of models that are activated. The SpectroSERVER can appear to take longer to come up. However, when all the models are activated, the SpectroSERVER is ready to manage the network.
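
The edit described above can be scripted. The following is a minimal sketch, not a supported tool: it assumes the .vnmrc file consists of simple key=value lines, and the helper name set_vnmrc_option and the sample host name are hypothetical. It works on text only, so reading and writing the actual file (and backing it up first) is left to the caller.

```python
import re


def set_vnmrc_option(text: str, key: str, value: str) -> str:
    """Return .vnmrc text with `key` set to `value`.

    Assumes simple key=value lines (an assumption about the file format).
    An existing entry for `key` is replaced; otherwise one is appended.
    """
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*=")
    new_line = f"{key}={value}"
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical sample content, for illustration only.
sample = "vnm_hostname=primary01\nwait_active=no\n"
updated = set_vnmrc_option(sample, "wait_active", "yes")
print(updated, end="")
```

The SpectroSERVER reads .vnmrc at startup, so a change like this is made before the restart it is meant to affect.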


Some more notes
The wait_active parameter is set to yes on the primary SpectroSERVER only to avoid missed alarms; the trade-off is that startup takes longer while the models activate.

The reason is that, after a primary SpectroSERVER restart with wait_active set to no, OneClick users are switched back to the primary before model activation has completed. Normally, the secondary runs as a warm standby: when the secondary SpectroSERVER starts, it goes through model activation but does not start polling the models in the database until it loses contact with the primary SpectroSERVER. Setting secondary_polling to yes on the secondary puts it in hot standby: after model activation, it actively polls the models in the database just like the primary. The wait_active and secondary_polling parameters have nothing to do with the alarm sync process.
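
The warm versus hot standby distinction can be checked from the secondary's settings. The sketch below is illustrative only: it assumes simple key=value lines in .vnmrc, treats a missing secondary_polling entry as no (an assumption based on warm standby being described as the normal case), and uses hypothetical helper names parse_vnmrc and standby_mode.

```python
def parse_vnmrc(text: str) -> dict:
    """Parse key=value lines into a dict; lines without '=' are ignored."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def standby_mode(settings: dict) -> str:
    """Classify the secondary as hot or warm standby from secondary_polling.

    A missing entry is treated as 'no' (assumption, see lead-in).
    """
    polling = settings.get("secondary_polling", "no").lower()
    return "hot standby" if polling == "yes" else "warm standby"


# Hypothetical secondary .vnmrc content, for illustration only.
secondary = "is_secondary = TRUE\nsecondary_polling=yes\n"
print(standby_mode(parse_vnmrc(secondary)))
```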