Raise alerts with logmon if value in log exceeds a defined threshold

Article ID: 281538

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

How do I create an alert when a threshold is exceeded in a log monitored with logmon? 

We have a log with a certain response time written in the last position in each line of the log. We would like to have an alarm when this value exceeds a certain threshold. 

This article provides guidance and a real-life example of how to set up logmon to achieve this.

Environment

  • DX UIM 20.4.x / 23.4
  • logmon probe, any version

Cause

  • Guidance

Resolution

Requirement: 

You have a log that looks like the one below where the last entry in each line is a "Response Time" in ms.

##.###.###.##  - - [13/Feb/2024:13:54:45 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.034
##.###.###.## - - [13/Feb/2024:13:54:46 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.040
##.###.###.## - - [13/Feb/2024:13:53:40 +0100] "POST /<path>/<service> /HTTP/1.0" 200 183 3.061
##.###.###.## - - [13/Feb/2024:13:53:46 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.042

You want an alarm whenever the last value on a line exceeds "3.000".
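To make the requirement concrete, here is a minimal Python sketch of the intended check, written outside logmon purely for illustration. The function name `exceeds_threshold` and the whitespace-splitting approach are assumptions for this example; logmon itself works on regex matches, not numeric comparison.

```python
THRESHOLD = 3.000  # threshold taken from the requirement above


def exceeds_threshold(line, threshold=THRESHOLD):
    """Return True if the last whitespace-separated field is a number above threshold."""
    fields = line.split()
    if not fields:
        return False
    try:
        return float(fields[-1]) > threshold
    except ValueError:
        # The last field is not numeric, so the line cannot exceed the threshold.
        return False


if __name__ == "__main__":
    sample = [
        '##.###.###.## - - [13/Feb/2024:13:54:46 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.040',
        '##.###.###.## - - [13/Feb/2024:13:53:40 +0100] "POST /<path>/<service> /HTTP/1.0" 200 183 3.061',
    ]
    for line in sample:
        print(exceeds_threshold(line), line)
```

A check like this can serve as a reference when validating that the watcher regex flags the same lines.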

Steps to Follow:

1. In this scenario, the following log receives new lines at every log interval.

##.###.###.## - - [13/Feb/2024:13:54:45 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.034
##.###.###.## - - [13/Feb/2024:13:54:46 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.040
##.###.###.## - - [13/Feb/2024:13:53:40 +0100] "POST /<path>/<service> /HTTP/1.0" 200 183 3.061
##.###.###.## - - [13/Feb/2024:13:53:46 +0100] "POST /<path>/<service> HTTP/1.0" 200 45263 0.042

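2. Create a profile of type "update" for this log file in logmon and add one watcher.

The regex below captures the last field on the line and matches only when that field falls in one of the ranges the pattern covers: 3.000 to 9.999, 400.000 or higher with three decimals, or an integer of 1000 or more. Note that it also matches a value of exactly 3.000 and does not match values from 10.000 to 399.999, so adjust the alternatives if your log can contain such values.

The watcher's match expression should contain the following regex, enclosed in slashes as "/<regex>/":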

.* ([3-9]\.\d{3}|[4-9]\d{2,}\.\d{3}|[1-9]\d{3,})$

You can verify the regex with regex101 (build, test, and debug regex).
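As a complement to an online tester, the following Python sketch probes the regex above with a few boundary values. `STRICT` is a hypothetical alternative pattern, not part of the original article, that assumes the value always has exactly three decimals and should match only when strictly greater than 3.000. Both patterns are exercised here with Python's `re` module only; whether the logmon regex engine treats them identically is an assumption you should verify in the probe.

```python
import re

# Pattern exactly as given in the article (without the enclosing slashes).
ARTICLE = re.compile(r".* ([3-9]\.\d{3}|[4-9]\d{2,}\.\d{3}|[1-9]\d{3,})$")

# Hypothetical alternative (assumption): three-decimal values strictly above 3.000.
STRICT = re.compile(
    r".* (3\.\d\d[1-9]|3\.\d[1-9]\d|3\.[1-9]\d\d|[4-9]\.\d{3}|[1-9]\d+\.\d{3})$"
)

BOUNDARY_VALUES = ["2.999", "3.000", "3.001", "9.999", "10.500", "399.999", "400.000"]


def matches(pattern, value):
    """Return True if a made-up log line ending in `value` matches `pattern`."""
    line = f'client - - "POST /x HTTP/1.0" 200 183 {value}'
    return pattern.search(line) is not None


def coverage(pattern):
    """Map each boundary value to whether the pattern matches it."""
    return {value: matches(pattern, value) for value in BOUNDARY_VALUES}


if __name__ == "__main__":
    for value in BOUNDARY_VALUES:
        print(value, "article:", matches(ARTICLE, value), "strict:", matches(STRICT, value))
```

Running a boundary table like this before deploying the watcher makes gaps in the alternation visible, for example values between 10.000 and 399.999.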

Create an "update" type of profile and set the file to monitor, then create the watcher with the following regex:

/.* ([3-9]\.\d{3}|[4-9]\d{2,}\.\d{3}|[1-9]\d{3,})$/

Then test whether it matches your log.

3. If a line in the log matches (that is, its last numeric value is over the threshold), an alarm is raised. You can test the profile to confirm this.

4. If needed, create a variable to extract the ResponseTime value and use that value in the alarm message.
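The idea of extracting the value and reusing it in the alarm text can be sketched in Python as follows. This assumes that the variable definition `$1` resolves to the first capture group of the watcher regex, which the configuration extract in this article suggests but which you should confirm against the logmon documentation. The names `WATCHER`, `MESSAGE`, and `alarm_text` are hypothetical and exist only for this illustration; the message text is copied from the extract.

```python
import re
from string import Template

# Watcher regex from the article (without the enclosing slashes).
WATCHER = re.compile(r".* ([3-9]\.\d{3}|[4-9]\d{2,}\.\d{3}|[1-9]\d{3,})$")

# Alarm message from the configuration extract; ${ResponseTime} is the variable.
MESSAGE = Template("Response Time is ${ResponseTime} which is greater than 3000")


def alarm_text(line):
    """Return the alarm message for a matching line, or None if the line does not match."""
    match = WATCHER.search(line)
    if match is None:
        return None
    # Assumption: the variable holds the first capture group of the match.
    return MESSAGE.substitute(ResponseTime=match.group(1))


if __name__ == "__main__":
    line = '##.###.###.## - - [13/Feb/2024:13:53:40 +0100] "POST /<path>/<service> /HTTP/1.0" 200 183 3.061'
    print(alarm_text(line))
```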

For reference, this is the logmon.cfg extract from this test:

   <Response time>
      active = yes
      interval = 5 sec
      scanfile = C:\<file_to_monitor>.txt
      schedules = 
      fileencoding = 
      scanmode = updates
      alarm = yes
      qos = yes
      message = no
      subject = 
      user = 
      reccur_directory = no
      reccur_directory_level = 10
      resetFile = no
      initialfileptr = 2
      resumefileptr = 4
      command_timeout_active = no
      command_timeout = 
      command_severity = 2
      command_timeout_alarm = 0
      alarmFOpenFail = no
      clearFOpenFailRestart = no
      monitor_exit_code = No
      max_alarm_sev = 5
      max_alarms = 44
      max_alarm_msg = 
      password = 
      <watchers>
         <Response Time Watcher>
            active = yes
            match = /.* ([3-9]\.\d{3}|[4-9]\d{2,}\.\d{3}|[1-9]\d{3,})$/
            level = critical
            subsystemid = 
            message = Response Time is ${ResponseTime} which is greater than 3000
            i18n_token = 
            restrict = 
            expect = no
            abort = no
            sendclear = no
            count = no
            separator = 
            suppid = 
            source = ${var}
            target = 
            qos = AD_SERVER
            runcommandonmatch = no
            alarm_on_first_match = no
            commandexecutable = 
            commandarguments = 
            pattern_threshold_severity = information
            pattern_threshold_message = 
            timeout = 1
            count_operator = eq
            count_threshold = 
            pattern_threshold = 
            expect_message = 
            expect_level = 
            regexfromexternalfile = no
            patternfilepath = 
            token = 
            variable_threshold = 
            variable_threshold_message = exceeeded
            variable_threshold_severity = information
            variable_threshold_supp = exceeeded
            <variables>
               <ResponseTime>
                  definition = $1
                  operator = re
                  threshold = /.* ([3-9]\.\d{3}|[4-9]\d{2,}\.\d{3}|[1-9]\d{3,})$/
               </ResponseTime>
            </variables>
         </Response Time Watcher>
      </watchers>
   </Response time>

Additional Information

Please note that Broadcom Technical Support does not develop or maintain regular expressions or customizations, so this is example material only.

The official documentation includes logmon Hints and Examples (broadcom.com).