Fault tolerant SpectroSERVERs alarming

Article ID: 191817

Products

CA Spectrum DX NetOps

Issue/Introduction

What are the options for optimal alarm clears in fault tolerant SpectroSERVERs?

Events are seen with Precedence 20. Why are the Events raised by the secondary SS? The secondary never took over so Events should all be raised against the primary SS only.

I see Events with the yellow Minor severity color, but no Severity value. The model the Event is raised against shows no Minor severity alarms.

Environment

All supported DX NetOps Spectrum releases

Cause

This is normally observed after a temporary communication issue between OneClick and the primary SpectroSERVER.

Resolution

To resolve this, ensure that the $SPECROOT/SS/.vnmrc file on the secondary SpectroSERVER contains these entries:

  • is_secondary=TRUE
  • wait_active=yes

If either entry is missing or set incorrectly, edit the .vnmrc file to correct the configuration. Save the file and restart the SS so that it reads the changes.
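The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a supported tool: it assumes the .vnmrc file consists of simple key=value lines, treats lines starting with '#' as comments, and compares values case-insensitively. All three are assumptions of this sketch, and the function names `parse_vnmrc` and `check_secondary` are hypothetical.

```python
# Expected entries, taken from the Resolution section of this article.
EXPECTED = {"is_secondary": "TRUE", "wait_active": "yes"}


def parse_vnmrc(text):
    """Parse key=value lines into a dict (assumed file format)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, assumed '#' comments, and lines without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_secondary(text, expected=EXPECTED):
    """Return a list of problems; an empty list means both entries match."""
    settings = parse_vnmrc(text)
    problems = []
    for key, want in expected.items():
        have = settings.get(key)
        if have is None:
            problems.append(f"{key} is missing")
        elif have.lower() != want.lower():  # case-insensitive: an assumption
            problems.append(f"{key} is '{have}', expected '{want}'")
    return problems


sample = "is_secondary=TRUE\nwait_active=no\n"
print(check_secondary(sample))
```

To run the check against a real file, read its contents first, for example with `open(os.path.expandvars("$SPECROOT/SS/.vnmrc")).read()`, and pass the resulting string to `check_secondary`.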

Additional Information

is_secondary

  • This setting lets the secondary SpectroSERVER drop events unless DX NetOps Spectrum determines that the secondary SpectroSERVER has taken over as the primary SpectroSERVER.
  • Should only be set to yes in the .vnmrc file on a secondary SpectroSERVER.

wait_active

  • Determines whether the server accepts connections as soon as all models are loaded or waits until all models are active.
  • If set to Yes, a Control Panel message displays a running percentage of models that were activated during SpectroSERVER startup.
  • The wait_active parameter is set to yes on the primary SS only to avoid missed alarms, but activation may take longer.

Additional details about these settings

Add the following line to the .vnmrc file on the secondary SpectroSERVER to limit the potential for false events or alarms:

is_secondary = TRUE

When we restart the primary SpectroSERVER, connections are accepted when all models are loaded, but before all models are activated. The models can take some time to activate. Because the secondary SpectroSERVER stops polling when the primary SpectroSERVER is restarted, a gap in your network management coverage can result.

To avoid this situation, edit the .vnmrc file on the primary SpectroSERVER so that the wait_active resource is set to 'yes'. This parameter causes the server to wait until all of the models are activated before accepting any connections. The message area in the DX NetOps Spectrum Control Panel also dynamically displays the percentage of models that are activated. The SpectroSERVER can appear to take longer to come up. However, when all the models are activated, the SpectroSERVER is ready to manage the network.
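As an illustration of the edit described above, the following Python sketch sets a resource in .vnmrc text, replacing an existing entry or appending a new one. It assumes the file uses simple key=value lines; the helper name `set_resource` is hypothetical and not part of DX NetOps Spectrum. The sketch works on a string only, so writing the result back to the file and restarting the SpectroSERVER remain manual steps.

```python
def set_resource(text, key, value):
    """Return .vnmrc text with key set to value.

    Replaces every existing 'key=...' line, or appends one if none exists.
    """
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


before = "wait_active=no\n"
print(set_resource(before, "wait_active", "yes"), end="")
```

Back up the original .vnmrc file before applying any scripted change to it.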

The reason for accepting this potential delay is that, if wait_active is set to no, OneClick users are switched back to the primary SS after a restart before model activation has completed.

Normally, the secondary SS runs in a Warm Standby state. When the secondary SS is started, it goes through model activation but does not start polling the models until it loses contact with the primary SS. Setting secondary_polling to yes on the secondary SS puts it in a Hot Standby state. In this state, after model activation, it actively polls the models, the same as the primary SS.

The wait_active and secondary_polling parameters have nothing to do with the alarm sync process.
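The Warm Standby and Hot Standby behavior described above can be summarized as a small decision function. This is only a model of the text in this article, not Spectrum code. The function names are hypothetical, and treating any value other than "yes" as Warm Standby is an assumption of this sketch.

```python
def standby_mode(secondary_polling):
    """Map the secondary_polling value to the standby state named in the text."""
    value = secondary_polling.strip().lower()
    return "Hot Standby" if value == "yes" else "Warm Standby"


def secondary_polls_models(secondary_polling, primary_reachable):
    """Return True if the secondary SS would be polling models.

    Hot Standby: polls after model activation, same as the primary.
    Warm Standby: polls only once contact with the primary is lost.
    """
    if standby_mode(secondary_polling) == "Hot Standby":
        return True
    return not primary_reachable


print(standby_mode("no"), secondary_polls_models("no", primary_reachable=True))
print(standby_mode("yes"), secondary_polls_models("yes", primary_reachable=True))
```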