Monitor and validate the status of Linux server services

Article ID: 128018

Updated On: 10-02-2023

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

Customers may want to monitor Linux server services that do not have a dedicated underlying process.

Environment

- UIM 8.51 or higher

Cause

- Guidance on how to use DX UIM to monitor and validate the status of Linux server services

Resolution

Options include the use of one of the following probes:

1. processes

Use the processes probe to monitor the underlying process for a given service, if one exists. For instance, you can monitor the HTTP service by watching its httpd process.
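Before configuring the probe, it can help to confirm the exact process name on the target host. The following is a minimal sketch, assuming a Linux host with /proc mounted; process_running is a hypothetical helper name and is not part of UIM. Note that the kernel truncates the name in /proc/<pid>/comm to 15 characters.

```shell
#!/bin/sh
# process_running NAME: succeeds if any process has exactly this command name.
# Hypothetical helper for a manual check; not part of the processes probe.
process_running() {
    for f in /proc/[0-9]*/comm; do
        # A process may exit while we iterate, so ignore read errors.
        [ -r "$f" ] && [ "$(cat "$f" 2>/dev/null)" = "$1" ] && return 0
    done
    return 1
}

if process_running httpd; then
    echo "httpd is running"
else
    echo "httpd is not running"
fi
```

If the name is not found, the service may have no dedicated process, in which case one of the other options below may be a better fit.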

2. rsp

To retrieve remote data, the rsp probe runs commands on UNIX/Linux systems over SSH, using port 22.
Note: The probe supports only password-based and key-based authentication. Keyboard-interactive and authentication-less methods are not supported. If neither password-based nor key-based authentication is enabled on the UNIX-based remote server, the rsp probe cannot discover the remote host.

3. logmon

Use logmon to run a command (e.g., service iptables status), parse its output, and generate QoS metrics or alarms. Use a separate Watcher for each Linux service.

You must have permission to run the command, and you must specify its full path.

To check a service's status on Linux, use the systemctl status <service-name> command.

Here is an example of the command and its output for a Linux service that exists and is up and running:

# systemctl status sshd
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2022-12-01 19:57:42 UTC; 49s ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 1185 (sshd)
   CGroup: /system.slice/sshd.service
           └─1185 /usr/sbin/sshd -D

Dec 01 19:57:41 abcd-host systemd[1]: Starting OpenSSH server daemon...
Dec 01 19:57:42 abcd-host systemd[1]: Unit sshd.service cannot be reloaded because it is inactive.
Dec 01 19:57:42 abcd-host sshd[1185]: Server listening on 0.0.0.0 port 22.
Dec 01 19:57:42 abcd-host sshd[1185]: Server listening on :: port 22.
Dec 01 19:57:53 abcd-host sshd[1208]: Accepted password for root from 10.xxx.xxx.xx port 55183 ssh2
Dec 01 19:57:54 abcd-host sshd[1213]: Accepted password for root from 10.xxx.xxx.xx port 55187 ssh2


Here is an example of the command and its output for a Linux service that does NOT exist (iptables), followed for comparison by a service that is up and running (firewalld):

# systemctl status iptables
Unit iptables.service could not be found.

[root@abcd-host ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2022-12-01 19:57:38 UTC; 1min 42s ago
     Docs: man:firewalld(1)
 Main PID: 472 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─472 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid

Dec 01 19:57:37 abcd-host systemd[1]: Starting firewalld - dynamic firewall daemon...
Dec 01 19:57:38 abcd-host systemd[1]: Started firewalld - dynamic firewall daemon.
Dec 01 19:57:38 abcd-host firewalld[472]: WARNING: AllowZoneDrifting is enabled. This is considered an insecure configuration option. It will be removed in a future release. Please c...bling it now.
Hint: Some lines were ellipsized, use -l to show in full.
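When verifying these cases by hand, the exit code of systemctl status can be easier to interpret than the text. The sketch below maps the exit codes listed in the EXIT STATUS section of the systemctl(1) man page (0 for an active unit, 3 for a unit that is not active, 4 for no such unit) to short labels. The names describe_status and check_service are hypothetical helpers, not part of UIM or systemd, and the codes should be confirmed on your distribution.

```shell
#!/bin/sh
# describe_status CODE: translate a systemctl exit code into a short label.
# Codes follow the systemctl(1) EXIT STATUS table; verify on your system.
describe_status() {
    case "$1" in
        0) echo "active" ;;
        3) echo "not active" ;;
        4) echo "not found" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# check_service NAME: run systemctl status quietly and label the result.
# The full path matches the one used for logmon in this article.
check_service() {
    /usr/bin/systemctl status "$1" >/dev/null 2>&1
    describe_status "$?"
}

# Example usage (run manually on the monitored host):
#   check_service sshd
#   check_service iptables
```

This is only an aid for manual checks; the logmon approach described here parses the command output rather than the exit code.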


In logmon, you will need to enter the full path to the command, e.g.,

    /usr/bin/systemctl status sshd

 

You can parse the command output using a Watcher profile and regex. Regex format example: /.*<string>.*/

Status values reported in the command output can include 'loaded', 'active', and/or 'plugged'.

For example, to monitor whether the service is active and running, you could test a Watcher regex such as:

/.*active \(running\).*/


or require both keywords, in that order, such as:

/(.*active.*)(.*running.*)/
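It can be worth testing a pattern against sample output before putting it in a Watcher. The sketch below uses grep -E as a stand-in, on the assumption that logmon's regex engine treats these simple patterns the same way. The running line is copied from the sshd output above; the stopped line is an illustrative assumption of what a stopped service reports. The matches helper is hypothetical.

```shell
#!/bin/sh
# One line from the sshd example above, plus an illustrative stopped-service
# line (an assumption for this sketch, not taken from the article's output).
running_line='   Active: active (running) since Thu 2022-12-01 19:57:42 UTC; 49s ago'
stopped_line='   Active: inactive (dead)'

# matches PATTERN TEXT: succeeds when TEXT matches the extended regex PATTERN.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

for line in "$running_line" "$stopped_line"; do
    # Strict pattern: the literal text "active (running)".
    if matches 'active \(running\)' "$line"; then
        echo "strict: match"
    else
        echo "strict: no match"
    fi
    # Loose pattern: the bare keyword "active".
    if matches 'active' "$line"; then
        echo "loose: match"
    else
        echo "loose: no match"
    fi
done
```

The loose pattern also matches the stopped line, because "active" is a substring of "inactive". A pattern that includes the "(running)" part avoids that false positive.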

Additional Information

How to run a logmon command parse output and if keyword/string is NOT found generate alarm
https://knowledge.broadcom.com/external/article?articleId=252610