Newly added metrics not added to database after hub 23.4.0 upgrade - older hubs not affected

Article ID: 281023

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

I have recently upgraded to UIM 23.4.0 and after that I upgraded my hubs to 23.4.0.

I enabled some new metrics to be monitored on some probes (e.g. processes, cdm).

These probes were already successfully monitoring other metrics on these servers, but I do not see the newly added metrics in Operator Console metric viewer/reports/dashboards.

The old metrics that were working before the upgrade are still working.

When I check S_QOS_DATA, the new metrics are there, but the corresponding ci_metric_id values are missing from CM_CONFIGURATION_ITEM_METRIC.
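
The table check described above can be expressed as an anti-join between the two tables. The following is a minimal sketch only: it assumes both tables can be joined on ci_metric_id, and the other S_QOS_DATA columns selected here (table_id, qos, source, target) are assumptions about the schema. The in-memory SQLite database and its rows are made-up stand-ins so the query can be run in isolation; against a real UIM database the query text would be run with that database's own client.

```python
import sqlite3

# Anti-join: QoS definitions whose ci_metric_id has no matching row in
# CM_CONFIGURATION_ITEM_METRIC. Columns other than ci_metric_id are
# assumptions about the UIM schema.
MISSING_METRICS_SQL = """
SELECT q.table_id, q.qos, q.source, q.target, q.ci_metric_id
FROM S_QOS_DATA q
LEFT JOIN CM_CONFIGURATION_ITEM_METRIC m
       ON m.ci_metric_id = q.ci_metric_id
WHERE q.ci_metric_id IS NOT NULL
  AND m.ci_metric_id IS NULL
ORDER BY q.table_id
"""


def find_missing_metrics(conn):
    """Return S_QOS_DATA rows whose ci_metric_id is absent from the metric table."""
    return conn.execute(MISSING_METRICS_SQL).fetchall()


def build_demo_db():
    """Create a tiny in-memory stand-in for the two tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE S_QOS_DATA (
            table_id INTEGER PRIMARY KEY,
            qos TEXT, source TEXT, target TEXT, ci_metric_id TEXT
        );
        CREATE TABLE CM_CONFIGURATION_ITEM_METRIC (
            ci_metric_id TEXT PRIMARY KEY
        );
        -- Metric enabled before the upgrade: present in both tables.
        INSERT INTO S_QOS_DATA VALUES (1, 'QOS_CPU_USAGE', 'srv01', 'srv01', 'M1');
        INSERT INTO CM_CONFIGURATION_ITEM_METRIC VALUES ('M1');
        -- Metric enabled after the upgrade: only present in S_QOS_DATA.
        INSERT INTO S_QOS_DATA VALUES (2, 'QOS_PROCESS_CPU', 'srv01', 'java', 'M2');
        """
    )
    return conn


if __name__ == "__main__":
    for row in find_missing_metrics(build_demo_db()):
        print(row)
```

Any rows returned are metrics that have a QoS definition but no registered ci_metric_id, which matches the symptom described in this article.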

The affected robots are running 23.4.0 under hub 23.4.0 as well.

Robots under older hubs are not affected.

I checked the ROBOT_NIS_CACHE queue and it is correctly configured.

 

 

Cause

This issue occurs on some servers with a large number (hundreds) of metrics already enabled and is related to a problem processing the niscache files.

A key indicator of this issue is that, on startup, the robot's controller log shows more than one group of 100 niscache files being posted:

From the controller log:

Mar 14 10:55:03:663 [139955100124928] 0 Controller: send_niscache: Thread created 
Mar 14 10:55:03:686 [139955100124928] 0 Controller: postNisCache - success - posted 100 files 
Mar 14 10:55:03:718 [139955100124928] 0 Controller: postNisCache - success - posted 100 files 
Mar 14 10:55:03:742 [139955100124928] 0 Controller: postNisCache - success - posted 94 files 
Mar 14 10:55:03:742 [139955100124928] 0 Controller: send_niscache: Thread exiting 
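
To check many robots for this pattern, the controller log can be scanned with a short script. The sketch below is illustrative only: it relies on the message wording shown in the excerpt above, the function names are hypothetical, and it reads "more than one group of 100 files" literally as at least two batches of exactly 100 files within one send_niscache run.

```python
import re

# Matches the batch message shown in the controller log excerpt.
POSTED = re.compile(r"postNisCache - success - posted (\d+) files")

SAMPLE_LOG = """\
Mar 14 10:55:03:663 [139955100124928] 0 Controller: send_niscache: Thread created
Mar 14 10:55:03:686 [139955100124928] 0 Controller: postNisCache - success - posted 100 files
Mar 14 10:55:03:718 [139955100124928] 0 Controller: postNisCache - success - posted 100 files
Mar 14 10:55:03:742 [139955100124928] 0 Controller: postNisCache - success - posted 94 files
Mar 14 10:55:03:742 [139955100124928] 0 Controller: send_niscache: Thread exiting
"""


def count_niscache_batches(log_lines):
    """Return one list of batch sizes per completed send_niscache run."""
    runs = []
    current = None  # batch sizes of the run in progress, if any
    for line in log_lines:
        if "send_niscache: Thread created" in line:
            current = []
        elif "send_niscache: Thread exiting" in line:
            if current is not None:
                runs.append(current)
            current = None
        else:
            match = POSTED.search(line)
            if match and current is not None:
                current.append(int(match.group(1)))
    return runs


def matches_symptom(batches):
    """True if a run posted at least two full batches of 100 files."""
    return sum(1 for size in batches if size == 100) > 1


if __name__ == "__main__":
    for batches in count_niscache_batches(SAMPLE_LOG.splitlines()):
        print(batches, "affected" if matches_symptom(batches) else "ok")
```

In practice the input would be the lines of the robot's controller log file rather than the embedded sample.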

 

Resolution

This issue is caused by a problem in discovery_server and will be resolved in UIM 23.4.1 (CU1).

Attached to this article is a test fix which should resolve the issue in the meantime - discovery_server 23.40T2.

After deploying this fix to the primary hub, restart any affected robots. All the metrics should then be correctly published and processed within 5 minutes.

 

 

Attachments

discovery_server_23.40T2.zip