
nas sync issues due to very high alarm counts


Article ID: 203532


Updated On:


DX Unified Infrastructure Management (Nimsoft / UIM)


Alarm counts are extremely high. How do I reduce the alarm counts?

Alarm counts are higher than expected for individual alarms.



NAS 9.20

DXIM robot 9.20HF15


Release : 9.2.0

Component : UIM NAS


- high alarm counts


In general, UIM Administrators must do their best to reduce alarm counts/alarm noise/unnecessary alarms.

Potential symptoms:

- Unexpected alarm behaviour, delays or timestamps
- nas GUI sync takes several seconds and may get worse over time as the number of alarms in database.db grows
- nas rules not processing in time or as expected
- Primary hub nas overloaded
- nas Status window does not show the alarms at first. It remains empty until you click the refresh button.

database.db is large, for example ~1.5 GB
nas transactionlog.db, for example ~600 KB

Monitoring Governance/Alarm Reduction

It is tempting to enable a lot of monitoring when you first deploy UIM, but over time this can cause havoc. Best practice is to enable alarm thresholds only for Key Performance Indicators (KPIs) that are associated with an upstream effect on the business in some way. Try starting with a maximum of 5 KPIs per application/technology. Ask Support whether we have any suggested KPIs for VMware, Citrix, NetApp, Nutanix, Exchange, etc. That is the base starting point: a small number of KPIs (key metrics).

Always ask why it is important to collect the data (QOS or alarms), how often it needs to be collected and why, and how long it should be stored. Keep all of these monitoring aspects to a minimum. Note that some probes have monitoring enabled 'right out of the box' for many metrics, but customers must always decide which ones MUST be kept enabled versus the nice-to-haves.

UIM suppresses like alarms and updates the alarm count. But should alarms keep being generated by the hundreds or thousands while the alarm counts climb without limit? This is not good practice: it can adversely affect nas performance, scalability, and nas housekeeping (maintenance), as well as alarm display and reliability, and it consumes system resources. For any alarm with a suppression count > 100, ask whether the nas will be able to handle more and more alarms with higher and higher counts. It will not, at least not without running into performance issues, nas sync issues, display issues in the alarm view, and even delays in processing. Take the necessary time to review and adjust your monitoring policy and the process for alarms and related ticket handling.

Why allow this to happen if no action will be taken to resolve the problem? For example, of what use are 8000+ Robot Inactive alarms whose counts keep increasing when nothing is being done about them? It's just noise, and it places an unnecessary load on the environment that usually worsens over time.
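
As a starting point for such a review, the "suppression count > 100" guideline can be applied to a list of open alarms. The following is a minimal sketch only, not a UIM tool: it assumes the alarms have already been exported to CSV by some means, and the column names (source, message, count), the sample rows, and the noisy_alarms helper are hypothetical names chosen for illustration.

```python
import csv
import io

# Guideline from this article: review any alarm whose suppression count exceeds 100.
SUPPRESSION_THRESHOLD = 100


def noisy_alarms(rows, threshold=SUPPRESSION_THRESHOLD):
    """Return (source, message, count) tuples with count > threshold, highest count first."""
    flagged = []
    for row in rows:
        count = int(row["count"])
        if count > threshold:
            flagged.append((row["source"], row["message"], count))
    return sorted(flagged, key=lambda item: item[2], reverse=True)


# Hypothetical export; the column names and values are assumptions, not a UIM format.
SAMPLE_EXPORT = """source,message,count
robot-a,Robot Inactive,8000
robot-b,CPU usage above threshold,42
robot-c,Disk free below threshold,150
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
for source, message, count in noisy_alarms(rows):
    print(f"{source}: {message} (count={count})")
```

Each alarm the sketch flags is a candidate for either fixing the underlying problem or adjusting (or disabling) the threshold that generates it.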