Persistent Alerts Clearing Automatically and Causing Multiple Tickets for the Same Issue

Article ID: 227117

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

Active alarms are cleared and re-raised on each new polling cycle, which creates hundreds of excess tickets for the same issue.

Cause

The response time threshold is set close to the response times actually being collected, so the measured response time repeatedly crosses the threshold from one polling interval to the next.

Because the DNS/response time metric takes precedence over the regex match, any alarm generated by the regex match is automatically cleared.
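
The interaction between the two checks can be illustrated with a small simulation. This is only a sketch of the behavior described above, not the probe's actual implementation: the `ProfileState` class, the function names, and all sample values except 12360 ms and the 11000 ms threshold (taken from the log excerpt in this article) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileState:
    """Hypothetical model of one monitoring profile (not the probe's real data structure)."""
    threshold_ms: int
    active: set = field(default_factory=set)
    events: list = field(default_factory=list)


def _raise(state, name):
    # Record an alarm only when it is not already active.
    if name not in state.active:
        state.active.add(name)
        state.events.append(f"alarm {name}")


def _clear(state, name):
    # Record a clear only when the alarm is currently active.
    if name in state.active:
        state.active.discard(name)
        state.events.append(f"clear {name}")


def evaluate_poll(state, response_ms, regex_in_alarm):
    """Apply the assumed precedence rule for one polling cycle."""
    if response_ms > state.threshold_ms:
        # Response time takes precedence: the regex alarm is cleared.
        _clear(state, "regexMatch")
        _raise(state, "responseTime")
    else:
        _clear(state, "responseTime")
        if regex_in_alarm:
            _raise(state, "regexMatch")


def simulate(threshold_ms, samples_ms):
    state = ProfileState(threshold_ms)
    for ms in samples_ms:
        # Assume the regex check is in an alarm state on every poll.
        evaluate_poll(state, ms, regex_in_alarm=True)
    return state.events


# Illustrative response times hovering around the 11000 ms threshold.
samples = [10900, 12360, 10800, 11500]
noisy = simulate(11000, samples)
quiet = simulate(15000, samples)
print(sum(e.startswith("alarm") for e in noisy), "alarms raised with threshold 11000")
print(sum(e.startswith("alarm") for e in quiet), "alarms raised with threshold 15000")
```

In this model, four polls near the threshold raise four separate alarms, each of which could open a ticket, while the higher threshold raises one.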

Error messages

Aug 19 08:47:30:719 [7952] url_response: Sent clear alarm for y+000000000#URLResp/WebAnalystErr1280/regexMatch

Aug 19 08:47:30:719 [7952] url_response: Sent alarm(3) URL response for 'WebAnalystErr1280' is 12360 ms, which exceeds the threshold (11000 ms)
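
To confirm the pattern in a longer log, the clear and breach messages can be counted with a short script. The sketch below assumes that every relevant line follows the wording of the two lines above; the regular expressions and the `summarize` function are illustrative and are not part of the probe.

```python
import re
from collections import Counter

# Patterns derived from the two sample lines above (assumed to be representative).
CLEAR_RE = re.compile(r"url_response: Sent clear alarm for (\S+)")
BREACH_RE = re.compile(
    r"url_response: Sent alarm\((\d+)\) URL response for '([^']+)' is (\d+) ms, "
    r"which exceeds the threshold \((\d+) ms\)"
)


def summarize(lines):
    """Count clear and breach messages and collect (profile, measured_ms, threshold_ms)."""
    counts = Counter()
    breaches = []
    for line in lines:
        if CLEAR_RE.search(line):
            counts["clear"] += 1
        match = BREACH_RE.search(line)
        if match:
            counts["breach"] += 1
            breaches.append((match.group(2), int(match.group(3)), int(match.group(4))))
    return counts, breaches


log_lines = [
    "Aug 19 08:47:30:719 [7952] url_response: Sent clear alarm for "
    "y+000000000#URLResp/WebAnalystErr1280/regexMatch",
    "Aug 19 08:47:30:719 [7952] url_response: Sent alarm(3) URL response for "
    "'WebAnalystErr1280' is 12360 ms, which exceeds the threshold (11000 ms)",
]
counts, breaches = summarize(log_lines)
for profile, measured, threshold in breaches:
    print(f"{profile}: {measured} ms is {measured - threshold} ms over the threshold")
```

Many clear and breach pairs with small overshoots suggest that the threshold sits too close to the normal response time.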

Environment

Release : 20.3

Component : UIM - SPECTRUMGTW, UIM - URL_response probe

Resolution

To avoid the alarm noise, set the threshold to a higher value, based either on the QoS trend for response time or on the actual web page timeout.
If the threshold cannot be increased, configure the RegEx alarm in a separate monitoring profile so that the precedence logic does not apply to it.
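
One way to derive a threshold from the QoS trend is to take a high percentile of recent response times, add some headroom, and cap the result at the web page timeout. The sketch below is a hypothetical heuristic, not a documented recommendation: the function name, the 95th percentile, the 20 percent headroom, and the sample values are all assumptions.

```python
import math


def suggest_threshold(samples_ms, timeout_ms, percentile=95, headroom_pct=20):
    """Return a candidate threshold in ms from recent response-time samples."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    # Nearest-rank percentile: smallest value with at least `percentile` % of samples at or below it.
    rank = max(1, math.ceil(len(ordered) * percentile / 100))
    base = ordered[rank - 1]
    # Integer arithmetic keeps the result exact.
    candidate = base * (100 + headroom_pct) // 100
    # A threshold above the page timeout could never be breached.
    return min(candidate, timeout_ms)


# Illustrative QoS samples in milliseconds (assumed values).
recent = [9800, 10400, 10900, 11500, 12360, 10100, 9900, 11200, 10700, 12000]
print(suggest_threshold(recent, timeout_ms=30000))
```

The result should still be checked against the actual QoS data and the timeout configured for the profile before it is applied.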

The threshold for the web page can be found in the probe configuration file, and the web page timeout can be found in the general properties of the url_response probe configuration.

      <alarm>
         active = yes
         max_samples = 5
         average = no
         threshold = 8000
         dns_resolution_time = 20
      </alarm>
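
For a quick check of the configured value, the alarm section can be read with a few lines of script. This is a minimal sketch that assumes the section looks exactly like the excerpt above (one `key = value` pair per line, no nested sections); `read_section` is a hypothetical helper, not a UIM utility.

```python
SAMPLE_CFG = """
      <alarm>
         active = yes
         max_samples = 5
         average = no
         threshold = 8000
         dns_resolution_time = 20
      </alarm>
"""


def read_section(cfg_text, section):
    """Return the key/value pairs of the first flat <section> block as strings."""
    values = {}
    inside = False
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if line == f"<{section}>":
            inside = True
        elif line == f"</{section}>":
            if inside:
                break  # stop at the end of the first matching block
        elif inside and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


alarm = read_section(SAMPLE_CFG, "alarm")
print("threshold:", int(alarm["threshold"]), "ms")
```

A real probe configuration file contains many sections and profiles, so a production script would need to locate the alarm block belonging to the profile in question.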

Additional Information

Note: One reason for the precedence logic is that if the web page does not load or times out, checking for a regex match cannot yield a meaningful result and only generates alarm noise.
