Dynamic monitoring and thresholds - all baseline (BN_QOS_DATA_*) tables are empty

Article ID: 258042

Updated On:

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

While completing the baseline collection configuration, we found that all of the baseline tables (BN_QOS_DATA_*) were empty, including the specific tables that should have contained baseline data.

According to the customer, baselining and threshold alarming worked as expected before the DB migration, including when the standard deviation option was in use.

- The customer migrated from an earlier version of MS SQL Server to v2017.
- Server names and hub names in the UIM environment were changed as well.
- The DB migration occurred ~3 years ago (Dec-Jan. 2021).
- Neither the customer nor we know exactly how the migration was performed.
- According to the customer, the baselining issue started after the DB migration.

qos_processor errors:

Jan 09 11:41:48:597 [attach_clientsession, qos_processor] method = public void com.nimsoft.slm.qos.processor.AbstractReceiver.onBulkQueueMessage(com.nimsoft.nimbus.NimSession,com.nimsoft.nimbus.PDS) throws com.nimsoft.nimbus.NimException

and 

[Qos Baseline Processor #1, qos_processor] Load single QoS query: select d.table_id, d.checksum, d.qos_def_id, d.table_id, d.ci_metric_id, d.qos, d.source, d.target, d.origin, d.host, d.robot, d.probe, d.nim_origin, d.modifier, d.samplerate from s_qos_data d where d.checksum = '7C223C9D59BE5D0A9526219BC7A4F09FAEBBB247'
[Qos Baseline Processor #1, qos_processor] No Qos Loaded
[Qos Baseline Processor #1, qos_processor] Loaded null
[Qos Baseline Processor #1, qos_processor] Could not find qos: S_QOS_DATA {  qos_name: QOS_NETWORK_AGGREGATED_TRAFFIC, source: <xxxx.example.om>, target: <xxx.xxx.xxxx.example.com>, nim_origin: null, origin: <hubname>, modifier: nimsoft, host: ##.###.##.##, robot: <robot_hostname>, probe: baseline_engine, table_id: null, qos_def_id: null }
[Qos Baseline Processor #1, qos_processor] Not updating db: S_QOS_DATA {  qos_name: QOS_NETWORK_AGGREGATED_TRAFFIC, source: <xxxx.example.om>, target: <xxx.xxx.xxxx.example.com>, nim_origin: null, origin: <hubname>, modifier: nimsoft, host: ##.###.##.##, robot: <robot_hostname>, probe: baseline_engine, table_id: null, qos_def_id: null }
[Qos Baseline Processor #1, qos_processor] Cache miss: true
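
To gauge how widespread the lookup failures are, the "Could not find qos" lines can be tallied by QoS name. The following is a minimal Python sketch, assuming the log line layout shown above. The function name tally_missing_qos and the shortened sample lines are illustrative only and are not part of the product.

```python
import re
from collections import Counter

# Captures the qos_name field of a "Could not find qos" log line.
# Assumption: the line layout matches the qos_processor.log excerpt above.
MISSING_QOS = re.compile(
    r"Could not find qos: S_QOS_DATA \{\s*qos_name: ([^,\s]+),"
)


def tally_missing_qos(lines):
    """Count 'Could not find qos' occurrences per QoS name."""
    counts = Counter()
    for line in lines:
        match = MISSING_QOS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened, illustrative lines modelled on the excerpt above.
sample = [
    "[Qos Baseline Processor #1, qos_processor] No Qos Loaded",
    "[Qos Baseline Processor #1, qos_processor] Could not find qos: "
    "S_QOS_DATA {  qos_name: QOS_NETWORK_AGGREGATED_TRAFFIC, source: a, target: b }",
    "[Qos Baseline Processor #1, qos_processor] Not updating db: "
    "S_QOS_DATA {  qos_name: QOS_NETWORK_AGGREGATED_TRAFFIC, source: a, target: b }",
]

print(tally_missing_qos(sample))
```

To run it against a real log, pass an open file object (for example, the probe's qos_processor.log) instead of the sample list. If most or all baselined QoS names appear in the tally, that points to a systemic lookup problem and not to a single misconfigured metric.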

Environment

  • UIM 20.3.3
  • data_engine 20.31
  • baseline_engine 20.42
  • ppm 20.10
  • qos_processor 20.42/20.43

Cause

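- Corrupt or invalid QoS checksums after the database migration from an earlier version of MS SQL Server. As the log above shows, qos_processor could not find the matching s_qos_data row by checksum, so no baseline data was written.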

Resolution

The issue of empty baseline tables was resolved by updating the qos_processor probe to the 20.43T1 test build and setting use_ci_metric_id_to_load_qos = true.

The qos_processor probe is responsible for:

  • updating the origin in the database when the origin changes at the robot/hub
  • allowing custom enrichment of QoS metric properties
  • saving baseline data to the database and matching alarms to QoS data


We downloaded and deployed the qos_processor 20.43T1 test build, which adds extra logging and slightly changes the checksum validation (controlled by a configuration key). The new configuration key is added to qos_processor.cfg as shown below:

   use_ci_metric_id_to_load_qos = false 

Note that the enhanced version of the probe, qos_processor 20.43T1, is attached to this KB article.

The qos_processor probe uses a checksum to load the QoS whenever a new baseline QoS is generated. With the new parameter at its default (false), the probe loads the QoS by checksum and, if the checksum is not available, falls back to ci_metric_id to load the QoS from the s_qos_data table. If use_ci_metric_id_to_load_qos = true, the probe uses the ci_metric_id first and, if ci_metric_id is null, uses the checksum to load the QoS from the s_qos_data table.

   use_ci_metric_id_to_load_qos = true
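
The lookup order described above can be sketched in Python as follows. This is an illustration of the documented behavior, not the probe's actual implementation: the function load_qos, the in-memory rows list standing in for s_qos_data, and the sample values are all hypothetical. It also assumes that "not available" means the value is null (None).

```python
def load_qos(rows, checksum, ci_metric_id, use_ci_metric_id_to_load_qos=False):
    """Return the first s_qos_data-like row found by the lookup order, or None."""

    def find(field, value):
        # A null key can never match a row.
        if value is None:
            return None
        return next((row for row in rows if row.get(field) == value), None)

    if use_ci_metric_id_to_load_qos:
        # ci_metric_id first; checksum is used only when ci_metric_id is null.
        if ci_metric_id is not None:
            return find("ci_metric_id", ci_metric_id)
        return find("checksum", checksum)

    # Default: checksum first; ci_metric_id is used only when checksum is null.
    if checksum is not None:
        return find("checksum", checksum)
    return find("ci_metric_id", ci_metric_id)


# Hypothetical row whose stored checksum no longer matches the incoming one,
# which is the post-migration situation described in this article.
rows = [{"table_id": 22, "checksum": "STALE", "ci_metric_id": "M1"}]

print(load_qos(rows, checksum="FRESH", ci_metric_id="M1"))
print(load_qos(rows, checksum="FRESH", ci_metric_id="M1",
               use_ci_metric_id_to_load_qos=True))
```

Under these assumptions, the default order finds nothing because a checksum is present but matches no row, while the ci_metric_id-first order finds the row. That mirrors the change from "No Qos Loaded" to "Loaded S_QOS_DATA" in the logs.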

When baseline values are being inserted into the database, qos_processor.log contains messages similar to the following:

Jan 12 09:03:43:860 [Qos Baseline Processor #1, qos_processor] Use ci_metric_id to load the qos. If it is null then use checksum to load the qos.
Jan 12 09:03:43:860 [Qos Baseline Processor #1, qos_processor] Load single QoS query: select d.table_id, d.checksum, d.qos_def_id, d.ci_metric_id, d.qos, d.source, d.target, d.origin, d.host, d.robot, d.probe, d.nim_origin, d.modifier, d.samplerate from s_qos_data d where d.ci_metric_id = 'MC13B2xxxxxxxx4A64611148'
Jan 12 09:03:43:861 [Qos Baseline Processor #1, qos_processor] Loaded S_QOS_DATA {  qos_name: QOS_MEMORY_PHYSICAL, source: xxxxxx-xxx-xx, target: xxxxxxxx, nim_origin: <hubname>, origin: xxxxxx-xxxxxx, modifier: nimsoft, host: ##.###.###.##, robot: xxxxxx-xxxxxx, probe: cdm, table_id: 22, qos_def_id: 21 }
Jan 12 09:03:43:861 [Qos Baseline Processor #1, qos_processor] Putting into cache: S_QOS_DATA {  qos_name: QOS_MEMORY_PHYSICAL, source: xxxxxx-xxxx, target: xxxxxx-xxxx, nim_origin: <hubname>, origin: <hubname>, modifier: nimsoft, host: ##.###.###.##, robot: <robot_hostname>, probe: cdm, table_id: 22, qos_def_id: 21 }
Jan 12 09:03:43:863 [Qos Baseline Db Updater #1, qos_processor] QosBaselineUpdater updating 1 baselines.
Jan 12 09:03:43:863 [Qos Baseline Db Updater #1, qos_processor] insert  into BN_QOS_DATA_0021 (table_id, starttime, stoptime, time_interval, samplevalue) values (?, ?, ?, ?, ?)
Jan 12 09:03:43:863 [Qos Baseline Db Updater #1, qos_processor] Total time batching before execution: 0
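
A quick way to confirm that inserts are happening, and for which baseline tables, is to scan the log for the insert statements shown above. The sketch below is a minimal Python example under the assumption that the statements keep this layout. The function name baseline_insert_tables and the sample lines are illustrative only.

```python
import re
from collections import Counter

# Matches "insert into BN_QOS_DATA_<digits>", tolerating repeated spaces as in
# the log excerpt above.
INSERT_INTO_BN = re.compile(r"insert\s+into\s+(BN_QOS_DATA_\d+)", re.IGNORECASE)


def baseline_insert_tables(lines):
    """Count logged insert statements per BN_QOS_DATA_* table."""
    counts = Counter()
    for line in lines:
        match = INSERT_INTO_BN.search(line)
        if match:
            counts[match.group(1).upper()] += 1
    return counts


# Illustrative lines modelled on the excerpt above.
sample_log = [
    "[Qos Baseline Db Updater #1, qos_processor] QosBaselineUpdater updating 1 baselines.",
    "[Qos Baseline Db Updater #1, qos_processor] insert  into BN_QOS_DATA_0021 "
    "(table_id, starttime, stoptime, time_interval, samplevalue) values (?, ?, ?, ?, ?)",
    "[Qos Baseline Db Updater #1, qos_processor] Total time batching before execution: 0",
]

print(baseline_insert_tables(sample_log))
```

An empty result over a period in which baselines should have been generated suggests that the probe is still failing to load the QoS rows.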

Once use_ci_metric_id_to_load_qos was set to true and a few hours passed, the baseline tables (BN_QOS_DATA_*) were populated.

This assumes that dynamic monitoring and thresholding have already been configured.

Additional Information

As of UIM 20.4, the Performance Reports Designer (PRD) supports displaying baseline data in a chart or report.

Attachments

qos_processor_1674163536580.zip