NetOps Spectrum Event Storage Best Practices

Products

Spectrum Network Observability

Issue/Introduction

This guide suggests best practices for managing the overall event stream in NetOps Spectrum as it relates to Archive Manager, Report Manager, and the SpectroSERVER.

Environment

Release: Any

Resolution

The event stream can vary from infrastructure to infrastructure. The following are suggestions to allow for best performance and overall health of the NetOps Spectrum databases.

Archive Manager

By default, the Archive Manager is set to store 45 days' worth of data. Unless you are contractually bound to keep more, this value should not be raised. There is no size limit on the Archive Manager.

Each event is stored in the DDMDB (Archive Manager database) as a row or record. Each row added to the underlying database increases the size of the database.

The larger this database becomes, the harder it is to retrieve data from it quickly, and if it becomes corrupt, the harder it is to administer and repair.

Reducing the amount of data stored in the DDMDB minimizes its overall size, which improves the performance of the OneClick events tab and lets Spectrum Report Manager gather events more quickly.

The Archive Manager runtime configuration file, .configrc, is located in the $SPECROOT/SS/DDM directory. MAX_EVENT_DAYS defaults to 45. This should remain at 45 or be lowered, for example to 30. If you have Report Manager, 45 is a good level to remain at. If you do not use Report Manager, lowering it is suggested.

The overall size of the DDMDB can be a problem if it is too large. If the database becomes corrupt, it will need a repair. How quickly the database can be repaired depends on its size and on the resources available on the system. The larger the database, the longer the repair time. Another problem with a database that is too large is that the events tab in OneClick can become slow. Some enhancements have been made to work around this, but with a large database it will still take some time to populate the result set.

It is also suggested to run the DDMDB maintenance and optimize scripts monthly. These are located in the $SPECROOT/SS/DDM/scripts directory. Simply execute these via the command line.

How can you determine if your database is too large? There are a couple of ways. Please keep in mind that the thresholds below are average values.

Navigate to the $SPECROOT/mysql/data/ddmdb directory and review the sizes of the event.MYD and event.MYI files. On average, if the combined size is over 10GB, then you may want to consider investigating the events (see below for more info).
You can calculate the average number of events per second by reviewing the "EventsGenerated" attribute on the VNM model. Write down the value. Check back in an hour and note the value again. Take the difference of the two values and divide by 3600 to get the number of events generated per second. On average if the events per second is over 10, then you may want to consider investigating the events (see below for more info).
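The two checks above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the sample readings are made up for the example, and treating 1GB as 1024^3 bytes is an assumption. The 10 events per second and 10GB thresholds are the averages quoted above.

```python
EVENTS_PER_SECOND_THRESHOLD = 10   # average threshold quoted in this article
COMBINED_SIZE_GB_THRESHOLD = 10    # average threshold quoted in this article


def events_per_second(first_reading, second_reading, elapsed_seconds=3600):
    """Average event rate between two readings of EventsGenerated."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if second_reading < first_reading:
        # Assumption: a smaller second reading means the counter was reset
        # between readings, so the sample is not usable.
        raise ValueError("second reading is lower than the first")
    return (second_reading - first_reading) / elapsed_seconds


def combined_size_gb(myd_bytes, myi_bytes):
    """Combined size of event.MYD and event.MYI in GB (1GB = 1024**3 bytes)."""
    return (myd_bytes + myi_bytes) / 1024 ** 3


# Hypothetical readings taken one hour apart.
rate = events_per_second(1_250_000, 1_304_000)
print(rate, rate > EVENTS_PER_SECOND_THRESHOLD)

# Hypothetical file sizes: 8GB data file, 3GB index file.
size = combined_size_gb(8 * 1024 ** 3, 3 * 1024 ** 3)
print(size, size > COMBINED_SIZE_GB_THRESHOLD)
```

With these sample numbers, both values exceed the thresholds, which would suggest investigating the events.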

Locally Stored Events

When the Archive Manager is down, or not available due to corruption of the DDMDB, the SpectroSERVER stores events locally. The default limit is 20,000 events. With a large event stream, this amount can be used up in a short period of time. Once more than 20,000 events have accumulated, the oldest events are purged and permanently lost.

The value for locally stored events can be adjusted, and Technical Support suggests setting it to about 2 million. Note that this increases the local storage used by the SSevents.db file in the SS directory, so make sure that adequate disk space is available. On average, this allows for a couple of days of Archive Manager downtime if you have a large event stream. During that time, work to resolve the reason that events are being stored in the SSdb and not the DDMDB.
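As a rough check of these figures, the time the local buffer lasts is the record limit divided by the event rate. The sketch below assumes a steady rate of 10 events per second, which is an illustrative figure and not a measured one; the function name is made up for the example.

```python
def buffer_hours(max_event_records, events_per_second):
    """Hours until the local event buffer is full at a steady event rate."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return max_event_records / events_per_second / 3600


# Assumed steady rate of 10 events per second.
print(round(buffer_hours(20_000, 10) * 60))          # minutes at the default limit
print(round(buffer_hours(2_000_000, 10) / 24, 1))    # days at the suggested limit
```

At that rate the default of 20,000 events lasts about 33 minutes, while 2 million events lasts about 2.3 days, which is consistent with the "couple of days" estimate.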

This setting is changed in the .vnmrc file located in the $SPECROOT/SS directory. Look for the value MAX_EVENT_RECORDS=20000, and set this to be MAX_EVENT_RECORDS=2000000

The SpectroSERVER will need to be restarted to have this value take effect, so it should be scheduled. It should also be adjusted on the secondary SpectroSERVERs.
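If the change has to be applied on several SpectroSERVERs, the edit can be scripted. The following is a minimal sketch that works on the file contents as text, assuming each setting sits on its own line in KEY=value form. The function name and the sample contents are hypothetical, and reading the file, backing it up, and writing it back are left out.

```python
def set_config_value(text, key, value):
    """Return text with KEY=value updated, or appended if KEY is absent."""
    prefix = key + "="
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith(prefix):
            lines[i] = prefix + str(value)
            replaced = True
    if not replaced:
        lines.append(prefix + str(value))
    return "\n".join(lines) + "\n"


# Hypothetical file contents; a real .vnmrc holds many other settings.
sample = "some_other_setting=1\nMAX_EVENT_RECORDS=20000\n"
print(set_config_value(sample, "MAX_EVENT_RECORDS", 2000000))
```

Keeping a copy of the original file before writing the result back is a sensible precaution.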

Reducing event stream

The only way to reduce the overall event stream is to limit what is being stored in the DDMDB. You can run some simple MySQL queries to get an idea of the biggest offenders.

To log on to mysql:

Log into the SpectroSERVER system as the user that owns the Spectrum installation
If on Windows, start a bash shell by running "bash -login"
Navigate to $SPECROOT/mysql/bin directory and enter the following command to log into mysql

./mysql --defaults-file=../my-spectrum.cnf -uroot -p<passwd> ddmdb -A

The following should be pasted at the mysql> prompt. Each query starts with SELECT and ends with a semicolon (;):

# To get a count of the number of events that occurred after a set date:
SELECT count(*) FROM event WHERE utime >= UNIX_TIMESTAMP("2008-01-01");

# To get the Top 10 events most commonly generated:
SELECT hex(type), COUNT(*) as cnt
FROM event GROUP BY type
ORDER BY cnt DESC LIMIT 10;

# To get the Top 10 models with the most events:
SELECT hex(e.model_h), m.model_name, COUNT(*) as cnt
FROM event e, model m WHERE e.model_h=m.model_h
GROUP BY e.model_h
ORDER BY cnt DESC LIMIT 10;

# To get the Top 10 high-volume days for events:
SELECT date(from_unixtime(utime)) as x, count(*) as cnt
FROM event GROUP BY x
ORDER BY cnt DESC LIMIT 10;

# To get the event volume for the last 10 days:
SELECT date(from_unixtime(utime)) as x, count(*) as cnt
FROM event GROUP BY x
ORDER BY x DESC LIMIT 10;
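The first query uses a fixed date. To count only the events inside a retention window, such as the 45 days discussed earlier, the cutoff can be computed as a Unix timestamp and placed in the same WHERE clause. The helper below is a sketch: the function name is made up, and it assumes the utime column holds epoch seconds, as the UNIX_TIMESTAMP comparison above implies.

```python
from datetime import datetime, timedelta, timezone


def cutoff_epoch(days, now=None):
    """Unix timestamp for the moment `days` days before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return int((now - timedelta(days=days)).timestamp())


# Build the count query for the last 45 days.
cutoff = cutoff_epoch(45)
print(f"SELECT count(*) FROM event WHERE utime >= {cutoff};")
```

The printed statement can be pasted at the mysql> prompt in the same way as the queries above.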

Once the largest offender events are identified, a decision can be made about whether or not they are actually needed. For example, AUTHENTICATION FAILURE events are purely informational and do not need to be seen in most environments. The alarms will continue to be generated, but you can prevent the actual event from being stored in the DDMDB.

To do that:

Log in to your OneClick Client
Click Tools > Utilities > Event Configuration
Once loaded, filter on the event code you would like to prevent from being stored.
In the details area of this event, go to the Event Options tab.
Deselect "Store Event in Historical Database"
Click the Save button. Click OK on any pop-ups.

This will update the event configuration in NetOps Spectrum and from that point forward that event will no longer be stored in the DDMDB. This means the event will not be searchable via the events tab.

Report Manager

NetOps Spectrum Report Manager is what should be used for historical event reporting. It also provides a GUI that presents the data in a more professional manner.

It has a better ability to store large amounts of data. Its architecture is designed to gather events from the DDMDB on a polled basis. This happens quickly without any customer interaction. It will gather roughly 10,000 events each poll and process those into its own database (reporting).

Storing large amounts of event data in the DDMDB is counterproductive if you own Report Manager.

Technical Support suggests utilizing Report Manager as your historical event database. Leaving 45 days in your Archive Manager database means if something happens to the Report Manager database, you can always reprocess the DDMDB data.

To reprocess the DDMDB data:

Navigate to the $SPECROOT/bin directory on the NetOps Spectrum Report Manager (SRM) OneClick Host.
Type the following to remove all data from the current reporting database and populate the 45 days:

./RpmgrInitializeLandscape.bat root <PASSWD> -initHist 45 -all

This will remove all bucket tables and truncate the tables. It then sets event_sync_time back 45 days from now(), which is the time of execution.

Additional Information

Please reference the following documentation sections for more information:

Database Management

Reporting Database Management

If you have any questions regarding best practices for events, alarms, or anything else within NetOps Spectrum, reach out to the Broadcom DX NetOps User Community board. If you do not get what you need from the community, you can contact Broadcom Technical Support.