
Monitoring a command output with logmon probe is not working


Article ID: 256293


Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

We are trying to configure the logmon probe to monitor the output of a script on a Linux box and generate an alarm depending on that output.

The command that is being monitored is: if /home/dbuser/file2.pl  |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi

/home/dbuser/file2.pl returns one of two possible output messages:

1. Everything is working OK

2. There is a problem

But with the current configuration, the alarm is always generated, even though the /home/dbuser/file2.pl script never returns the message "There is a problem" when executed from the terminal.

 

 

Environment

Release : 20.4

Cause

A command that returns a specific output when run from a terminal will not necessarily return the same output when it is executed by the logmon probe.
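
One way to see this effect from a terminal is to run the command with a stripped-down environment and with stderr merged into stdout. This is only a rough sketch: the environment logmon really uses is not described in this article, so the empty environment and the PATH value below are assumptions, and run_minimal_env is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: run a command line with an almost empty environment.
# ASSUMPTION: this only approximates how a probe might launch the command;
# it is not logmon's documented behavior.
run_minimal_env() {
    # env -i starts from an empty environment; only PATH is set explicitly.
    # 2>&1 merges stderr into stdout so error messages are not lost.
    env -i PATH=/usr/bin:/bin /bin/sh -c "$1" 2>&1
}

# Example usage with the script from this article:
# run_minimal_env '/home/dbuser/file2.pl'
```

If the output differs from what an interactive shell shows, the script probably depends on something in the login environment (variables, working directory, credentials).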

Resolution

1. Executed from the command line, the output for the command is:

[email protected]:/home/dbuser# if /home/dbuser/file2.pl  |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi
Everything is working OK
OK

2. To find out what output logmon is getting from the command, set the Log level to maximum detail and the Log size to at least 10000. <nimsoft>/probes/system/logmon/logmon.log will then show output like the following:

Dec  6 18:49:43:454 [140508670613248] logmon: [test_command_output] start scanning 'if /home/dbuser/file2.pl  |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi'
Dec  6 18:49:43:454 [140508670613248] logmon: [test_command_output] storing file-stats as 'test_command_output'
Dec  6 18:49:43:455 [140508670613248] logmon: Encoding is 'ISO-8859-1'
Dec  6 18:49:43:455 [140508670613248] logmon: lgm: Read File
Dec  6 18:49:43:462 [140508670613248] logmon: lgm: read the line: [CRITICAL]
Dec  6 18:49:43:462 [140508670613248] logmon: lgm: check format start..[0]
Dec  6 18:49:43:462 [140508670613248] logmon: lgm: format start
Dec  6 18:49:43:462 [140508670613248] logmon: lgm: FORMAT END START
Dec  6 18:49:43:462 [140508670613248] logmon: (scan) TEST01 offset 0
Dec  6 18:49:43:462 [140508670613248] logmon: [test_command_output] In WithI18n section [CRITICAL],[ERCPY],[ISO-8859-1],[1]
Dec  6 18:49:43:462 [140508670613248] logmon: [test_command_output] MATCH [TEST01] on line 0
Dec  6 18:49:43:462 [140508670613248] logmon: Converting it to windows system default encoding
Dec  6 18:49:43:462 [140508670613248] logmon: [test_command_output] test_command_output.TEST01: Alarm Message, severity=5, sid=1.1, msg='CRITICAL' suppKey =

3. As the log shows, logmon does not see OK in the script output, so the command returns CRITICAL, which is why the alarm is always generated.
    To understand what the script returns when it is executed by logmon, monitor only the script output (/home/dbuser/file2.pl) and look for "read the line" in the logs.

4. Save the changes, restart the probe, and review the logs again:

Dec 6 19:29:26:777 [140311232206592] logmon: [test_command_output] start scanning '/home/dbuser/file2.pl'
Dec 6 19:29:26:777 [140311232206592] logmon: [test_command_output] storing file-stats as 'test_command_output'
Dec 6 19:29:26:778 [140311232206592] logmon: Encoding is 'ISO-8859-1'
Dec 6 19:29:26:778 [140311232206592] logmon: lgm: Read File
Dec 6 19:29:26:782 [140311232206592] logmon: lgm: read the line: [ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock']
Dec 6 19:29:26:782 [140311232206592] logmon: lgm: check format start..[0]
Dec 6 19:29:26:782 [140311232206592] logmon: lgm: format start
Dec 6 19:29:26:782 [140311232206592] logmon: lgm: FORMAT END START
Dec 6 19:29:26:782 [140311232206592] logmon: (scan) TEST01 offset 0
Dec 6 19:29:26:782 [140311232206592] logmon: [test_command_output] In WithI18n section [ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'],[ERCPY],[ISO-8859-1],[-1]
Dec 6 19:29:26:782 [140311232206592] logmon: [test_command_output] NO MATCH [TEST01] offset now 0
Dec 6 19:29:26:782 [140311232206592] logmon: No Match Found (Return Code : -1)
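
With the log level at maximum the log is noisy, so it can help to print only the text logmon read. The filter below is a minimal sketch that assumes the "lgm: read the line: [...]" format shown in the excerpts above; extract_read_lines is a hypothetical helper name, not part of the probe.

```shell
#!/bin/sh
# Hypothetical helper: print only the text that logmon reported reading.
# ASSUMPTION: log lines look like "... lgm: read the line: [TEXT]",
# as in the excerpts in this article.
extract_read_lines() {
    # $1: path to logmon.log
    # Keep what is between the first "[" after "read the line: "
    # and the final "]" on the line; print only lines that matched.
    sed -n 's/.*lgm: read the line: \[\(.*\)]$/\1/p' "$1"
}

# Example usage:
# extract_read_lines <nimsoft>/probes/system/logmon/logmon.log
```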

5. The script was returning an error connecting to MySQL. That is why

    if /home/dbuser/file2.pl  |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi

    never found the word OK, so it always returned CRITICAL and the alarm was always generated.
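
The one-line command hides the real script output, which is why the MySQL error only became visible after monitoring the script directly. A possible alternative, sketched below, is a small wrapper that keeps the reason next to the CRITICAL result. The name classify_output is hypothetical, and the watcher's match expression would need to accept the longer "CRITICAL: ..." line; that part of the logmon configuration is assumed, not shown in this article.

```shell
#!/bin/sh
# Hypothetical wrapper: classify the output of a check command.
# Prints "OK" when the output contains "OK" (case-insensitive, like grep -i),
# otherwise prints "CRITICAL: " followed by the first line of the real output.
classify_output() {
    # Capture stdout and stderr together so errors are not lost.
    output=$("$@" 2>&1)
    if printf '%s\n' "$output" | grep -qi "OK"; then
        echo "OK"
    else
        first_line=$(printf '%s\n' "$output" | head -n 1)
        echo "CRITICAL: ${first_line:-no output}"
    fi
}

# Example usage with the script from this article:
# classify_output /home/dbuser/file2.pl
```

With a wrapper like this, the "read the line" entry in logmon.log would carry the underlying error text instead of the bare word CRITICAL.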

Additional Information


UIM - logmon - monitoring system commands

https://knowledge.broadcom.com/external/article?articleId=212305
