
Service classified as High Risk with no active alarms


Article ID: 215076


Updated On:

Products

DX Operational Intelligence

Issue/Introduction

We are using OI SaaS to monitor a few Java processes running on a single server, and we created a service in OI to aggregate the alerts. The last alert was received in April, which raises two questions:

1) Why is the service Risk still considered Red/High? What are the criteria for that?

2) If I filter alerts for the last 24 hours, for example, why does the list still show alerts from a month ago?

A few screenshots to illustrate:

Risk still severe today:

 

The filter at the top is set to May 6 to 7 (24h), but the list shows 11 alarms from March and April:

 

Environment

Release : 20.2

Component : CA DOI Foundations

Resolution

Several factors may play a part in this case.

1. The risk is based on the highest CI severity/significance, on a scale of 0 to 4. The significance of a CI is based on the number of inputs minus the number of outputs, with a minimum of 1. Check the service topology to see where the open alarms are located, review the related input/output connections, and understand what is contributing to the risk.

2. The alert count is based on alarms that were active (open) at any point during the last 24 hours. An alarm whose start time is older than the filter window still appears because it is still active.

3. These appear to be older Anomaly Alarms that were never closed. Clear/close these alarms manually, then monitor whether new Anomaly Alarms also fail to clear/close.

4. The inventory state always has a TTL value. If all the alarms are cleared and the inventory state is, for any reason, not cleared at that moment, it clears automatically when the inventory state TTL expires.
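
As a rough illustration of points 1 and 2, the sketch below models the significance rule (inputs minus outputs, with a minimum of 1) and the "active during the window" check. It is not the product's implementation: the `significance` and `active_in_window` functions, the `Alarm` class, and all dates are hypothetical and exist only to show why an alarm opened in April still appears in a May 6 to 7 filter while it remains open. How severity and significance are combined into the final 0 to 4 risk value is not described here, so that step is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def significance(num_inputs: int, num_outputs: int) -> int:
    """Point 1: inputs minus outputs, never lower than 1 (hypothetical helper)."""
    return max(1, num_inputs - num_outputs)


@dataclass
class Alarm:
    """Hypothetical alarm record; not the OI data model."""
    name: str
    opened: datetime
    closed: Optional[datetime] = None  # None means the alarm is still open


def active_in_window(alarm: Alarm, start: datetime, end: datetime) -> bool:
    """Point 2: True if the alarm was open at any time between start and end."""
    opened_before_window_end = alarm.opened <= end
    not_closed_before_window_start = alarm.closed is None or alarm.closed >= start
    return opened_before_window_end and not_closed_before_window_start


# Illustrative 24h filter window (the year is arbitrary).
window_start = datetime(2021, 5, 6)
window_end = datetime(2021, 5, 7)

# An anomaly alarm opened in April and never closed.
stale_alarm = Alarm("anomaly-april", opened=datetime(2021, 4, 10))

# An alarm opened in March and closed in early April.
closed_alarm = Alarm(
    "anomaly-march",
    opened=datetime(2021, 3, 20),
    closed=datetime(2021, 4, 2),
)

alarms = [stale_alarm, closed_alarm]
visible = [a.name for a in alarms if active_in_window(a, window_start, window_end)]
print(visible)

print(significance(5, 2))
print(significance(1, 4))
```

In this sketch only the never-closed April alarm passes the filter, which matches the behavior described above: closing the stale alarms manually is what removes them from the 24h view.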
