We are trying to configure the logmon probe to monitor the output of a script on a Linux box and depending on the output, generate an alarm.
The command that is being monitored is: if /home/dbuser/file2.pl |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi
/home/dbuser/file2.pl returns two possible output messages:
1. Everything is working OK
2. There is a problem
However, with this configuration an alarm is always generated, even though the /home/dbuser/file2.pl script never returns the message "There is a problem" when it is executed from the terminal.
1. Command line execution
When executed from the command line, the output for the command is:
root@cluster01:/home/dbuser# if /home/dbuser/file2.pl |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi
Everything is working OK
OK
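The logic of the monitored command can be sketched in isolation. In the shell sketch below, fake_script is a hypothetical stand-in for /home/dbuser/file2.pl (it is not part of the original setup), and grep -q is used instead of plain grep so that only the status word is printed. It illustrates that any output lacking "OK", not only "There is a problem", ends up as CRITICAL.

```shell
#!/bin/sh
# Hypothetical stand-in for /home/dbuser/file2.pl: prints the message it is given.
fake_script() {
    printf '%s\n' "$1"
}

# Same if/grep/echo logic as the monitored command.
# grep -q suppresses the matched line, so only the status word is printed.
check_status() {
    if fake_script "$1" | grep -qi "OK"; then
        echo "OK"
    else
        echo "CRITICAL"
    fi
}

check_status "Everything is working OK"   # prints OK
check_status "There is a problem"         # prints CRITICAL
# An unexpected error message is also reported as CRITICAL:
check_status "ERROR 2002 (HY000): Can't connect to local MySQL server"
```

Because the else branch catches everything that does not contain "OK", the word CRITICAL alone does not say whether the script reported a problem or failed to run properly.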
2. Viewing the output (DEBUG)
To view the output that logmon receives from the command, set the probe's loglevel to 5 and its logsize to at least 10000.
3. Debugging the command output via the log
<nimsoft>/probes/system/logmon/logmon.log will show output similar to the following:
logmon: [test_command_output] start scanning 'if /home/dbuser/file2.pl |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi'
logmon: [test_command_output] storing file-stats as 'test_command_output'
logmon: Encoding is 'ISO-8859-1'
logmon: lgm: Read File
logmon: lgm: read the line: [CRITICAL]
logmon: lgm: check format start..[0]
logmon: lgm: format start
logmon: lgm: FORMAT END START
logmon: (scan) TEST01 offset 0
logmon: [test_command_output] In WithI18n section [CRITICAL],[ERCPY],[ISO-8859-1],[1]
logmon: [test_command_output] MATCH [TEST01] on line 0
logmon: Converting it to windows system default encoding
logmon: [test_command_output] test_command_output.TEST01: Alarm Message, severity=5, sid=1.1, msg='CRITICAL' suppKey =
4. Analyzing the results
The log shows that logmon read the line CRITICAL. In other words, logmon is not seeing 'OK' in the script output, so the command returns CRITICAL, which is why the alarm is always generated.
To understand what the script itself returns when it is executed by logmon, change the profile to monitor the script output directly (/home/dbuser/file2.pl). Save the changes, restart the probe, and review the log again, looking for the "read the line" entries:
logmon: [test_command_output] start scanning '/home/dbuser/file2.pl'
logmon: [test_command_output] storing file-stats as 'test_command_output'
logmon: Encoding is 'ISO-8859-1'
logmon: lgm: Read File
logmon: lgm: read the line: [ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock']
logmon: lgm: check format start..[0]
logmon: lgm: format start
logmon: lgm: FORMAT END START
logmon: (scan) TEST01 offset 0
logmon: [test_command_output] In WithI18n section [ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'],[ERCPY],[ISO-8859-1],[-1]
logmon: [test_command_output] NO MATCH [TEST01] offset now 0
logmon: No Match Found (Return Code : -1)
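The article does not state why the script behaves differently under logmon than in the root terminal. One possible cause, and this is only an assumption, is that the probe runs the command with a different environment. The sketch below shows one way to approximate that from the terminal: run_clean is a hypothetical helper that runs a command line with an emptied environment and merges stderr into stdout, so that error messages become visible. DEMO_VAR is an illustrative variable only.

```shell
#!/bin/sh
# Hypothetical helper: run a command line with an empty environment and
# merge stderr into stdout. This only approximates a non-interactive caller;
# the probe's real environment is not documented in this article.
run_clean() {
    env -i /bin/sh -c "$1" 2>&1
}

# Illustration with a harmless command: a variable exported in the
# interactive shell is not visible inside the emptied environment.
DEMO_VAR="set in the interactive shell"
export DEMO_VAR
run_clean 'echo "DEMO_VAR is ${DEMO_VAR:-unset}"'

# Against the real script one would run:
# run_clean '/home/dbuser/file2.pl'
```

If the script fails in this stripped-down run but works in the normal terminal session, the difference is likely environmental; if it works in both, the cause lies elsewhere.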
5. The script was returning an error when trying to connect to MySQL. That is why the command
if /home/dbuser/file2.pl |grep -i "OK"; then echo "OK";else echo "CRITICAL";fi
never found the word OK, so it always returned CRITICAL and the alarm was generated.
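To avoid this kind of masking, the wrapper could separate the two expected messages from any unexpected output. The following is a minimal sketch, not part of the original configuration: classify_output is a hypothetical function, and the logmon watcher's match expression would have to be adjusted to the longer lines it prints.

```shell
#!/bin/sh
# Hypothetical helper: classify one line of script output into three
# states instead of two, keeping the raw text for diagnosis.
classify_output() {
    case "$1" in
        *"Everything is working OK"*) echo "OK" ;;
        *"There is a problem"*)       echo "CRITICAL: $1" ;;
        *)                            echo "UNKNOWN: $1" ;;
    esac
}

classify_output "Everything is working OK"
classify_output "There is a problem"
classify_output "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'"

# Against the real script one might call (first output line only):
# classify_output "$(/home/dbuser/file2.pl 2>&1 | head -n 1)"
```

With this approach the MySQL connection error would have appeared in the alarm text as an UNKNOWN line, rather than being hidden behind the single word CRITICAL.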