I have to execute a command using the logmon probe and check if a particular keyword/string is NOT found in the command output. If the keyword/string is NOT found, I need to generate an alert.
- Customer wanted to use logmon to check if a process/processes were running or not and if not, generate an alarm.
Here is an example of how you can test running a command on Linux/UNIX systems, parse the output, then generate an alarm when a process/string is not found in the command output.
You can use cron, or any other process that is already running, for the test.
Once you have tested the first scenario (process is running and found), you can kill the process to force the second watcher to match and generate the alarm when the process is NOT present, that is, when the process name/string is not found in the output.
On the robot where logmon is deployed,
vi test.sh and enter:
#!/bin/bash
if ps -e|grep "cron"
then
echo "Found cron process"
else
echo "cron Not Found"
fi
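The same check can be written so that the process name is a parameter and grep's own output is suppressed. The following is only a sketch: report_process is a hypothetical helper name, and it reads the process listing from standard input so that it can be tried with canned text as well as with live ps output. It prints the same two strings that the watchers in this example match on.

```shell
#!/bin/bash
# Hypothetical helper (not part of logmon): report_process NAME
# Reads a process listing on stdin and prints one of the two strings
# that the watchers in this example match on.
report_process() {
    local name="$1"
    # -q: print nothing, only set the exit status
    # -F: treat NAME as a fixed string, not a regular expression
    if grep -qF -- "$name"; then
        echo "Found $name process"
    else
        echo "$name Not Found"
    fi
}

# Typical use with live data on the robot:
#   ps -e | report_process cron
```

As in the original script, this is a substring match, so "cron" also matches "crond".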
Make sure the script is executable (for example, chmod +x test.sh), then set the logmon command to-> /opt/nimsoft/probes/system/logmon/test.sh
Set the Watcher_1 to the good Match result (process running):
Match expression-> /.*Found cron process.*/
Set Watcher_2 to the bad Match result (process not running):
Match expression-> /.*cron Not Found.*/
Set the severity to warning (just an example).
Sample alarms from this test:
Alternate approach/option
Note that if you need to deploy this type of profile on a large number of robots, you don't have to copy the script to each one. Instead, you can enter the entire command in the logmon command window, using semicolons to separate the script lines.
Then, once you create the logmon profile, you can create a logmon configuration package to distribute the profile section to other robots where logmon is deployed.
if ps -e|grep "cron";then echo "Found cron process";else echo "cron Not Found";fi
Lastly, note that right-clicking the logmon profile and selecting 'Test profile' produces 'No Results', so you cannot use that option to test this type of profile.
You can tell the profile is working by the generated alarm(s), and/or by checking the logmon probe log at loglevel 3 or 5. Use a logsize of 100000.
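Before waiting for alarms, the command output can also be checked from a shell on the robot. The sketch below is a rough stand-in for the two watchers, not logmon's own matching: classify_output is a hypothetical helper that uses grep to report which of the two example patterns appears in the text it reads on standard input. The severity labels are taken from this test setup.

```shell
#!/bin/bash
# Hypothetical helper: approximates the two watchers with grep.
# Reads command output on stdin and reports which pattern it contains.
classify_output() {
    local output
    output=$(cat)    # capture everything the command printed
    if printf '%s\n' "$output" | grep -q 'cron Not Found'; then
        echo "Watcher_2 would match (warning)"
    elif printf '%s\n' "$output" | grep -q 'Found cron process'; then
        echo "Watcher_1 would match (information)"
    else
        echo "no watcher would match"
    fi
}

# Typical use on the robot:
#   /opt/nimsoft/probes/system/logmon/test.sh | classify_output
```

If this reports that no watcher would match, the command output does not contain either string, and the profile would not generate an alarm for it.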
Here is an example of the logmon.cfx file for a logmon probe configuration package that could be distributed. This is just an example from this particular test setup.
<profiles> overwrite
<Test command_2> overwrite
active = yes
interval = 30 sec
scanfile = if ps -e|grep "cron";then echo "Found cron process";else echo "cron Not Found";fi
fileencoding =
scanmode = command
alarm = yes
qos = yes
message = no
subject =
user =
reccur_directory = no
reccur_directory_level = 10
resetFile = no
initialfileptr = 2
resumefileptr = 4
command_timeout_active = no
command_timeout =
command_severity = 2
command_timeout_alarm = 0
alarmFOpenFail = no
clearFOpenFailRestart = no
monitor_exit_code = No
max_alarm_sev = 5
max_alarms =
max_alarm_msg =
password =
<watchers> overwrite
<Watcher_1> overwrite
active = yes
match = /.*Found cron process.*/
level = information
subsystemid =
message =
i18n_token =
restrict =
expect = no
abort = no
sendclear = no
count = no
separator =
suppid =
source =
target =
qos =
runcommandonmatch = no
alarm_on_first_match = no
commandexecutable =
commandarguments =
pattern_threshold_severity = information
pattern_threshold_message =
timeout = 1
pattern_threshold =
expect_message =
expect_level =
regexfromexternalfile = no
patternfilepath =
token =
variable_threshold =
variable_threshold_message =
variable_threshold_severity = information
variable_threshold_supp =
</Watcher_1>
<Watcher_2> overwrite
active = yes
match = /.*cron Not Found.*/
level = warning
subsystemid =
message =
i18n_token =
restrict =
expect = no
abort = no
sendclear = no
count = no
separator =
suppid =
source =
target =
qos =
runcommandonmatch = no
alarm_on_first_match = no
commandexecutable =
commandarguments =
pattern_threshold_severity = information
pattern_threshold_message =
timeout = 1
pattern_threshold =
expect_message =
expect_level =
regexfromexternalfile = no
patternfilepath =
token =
variable_threshold =
variable_threshold_message =
variable_threshold_severity = information
variable_threshold_supp =
</Watcher_2>
</watchers>
</Test command_2>
</profiles>
Additional Tips/Troubleshooting
You must specify the full path to the command. For instance, instead of a command in this form:
if ps -e|grep "cron";then echo "Found cron process";else echo "cron Not Found";fi
Use the full path to the executable, as in this podman example:
(if /usr/bin/podman ps -e|grep "nova_compute";then echo "Found podman process";else echo "podman Not Found";fi)
This produced a "Found" alarm message because the podman process was up and running.
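To find the full path to put in the logmon command, you can resolve it from a shell on the robot. This is a minimal sketch: full_path_of is a hypothetical helper name, and it relies only on the standard command -v lookup. It fails for names that resolve to a shell builtin, alias, or function, since those have no file path.

```shell
#!/bin/bash
# Hypothetical helper: print the absolute path of a command, or return
# non-zero if it cannot be resolved to a file on disk.
full_path_of() {
    local path
    path=$(command -v "$1") || return 1
    case "$path" in
        /*) echo "$path" ;;   # absolute path to an executable
        *)  return 1 ;;       # builtin, alias, or function: no file path
    esac
}

# Typical use on the robot:
#   full_path_of podman
```

The path found in your own login shell may depend on your PATH setting, so verify that the resolved path exists on every robot that will receive the profile.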
If the command or script is more complicated/extensive, and/or you'd rather not include the entire command in the logmon command window, and/or you need to deploy the profile to multiple robots, you can create a logmon configuration package and include a script file and target location for the script file.
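As an illustration of a slightly more extensive script, the sketch below checks several process names against a single process listing and prints one line per name. report_processes is a hypothetical helper name, and the second process name (sshd) is only an example. Each name would need its own watcher, for example a match expression such as /.*sshd Not Found.*/.

```shell
#!/bin/bash
# Hypothetical helper: report_processes NAME...
# Reads one process listing on stdin and prints one status line per name,
# in the same "Found ..." / "... Not Found" form used in this example.
report_processes() {
    local listing name
    listing=$(cat)    # capture the listing once and reuse it for each name
    for name in "$@"; do
        if printf '%s\n' "$listing" | grep -qF -- "$name"; then
            echo "Found $name process"
        else
            echo "$name Not Found"
        fi
    done
}

# Typical use on the robot:
#   ps -e | report_processes cron sshd
```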