You are looking for information on using the icdx_service_monitor.py script for monitoring and alerting of ICDx services.
The original script is found in the following directory:
$SYMC_HOME/tools
The script requires Python 3 and imports the following modules:
import os
import sys
import time
import math
import getopt
import smtplib
import logging
import logging.handlers
import requests
from urllib3.exceptions import InsecureRequestWarning
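Before running the script, it may help to confirm that the modules above are importable on the ICDx host. The following is a minimal sketch, not part of the original script; the helper name missing_modules and the REQUIRED_MODULES list are illustrative assumptions based on the import list above. Of these, only requests and urllib3 are third-party packages; the rest ship with Python 3.

```python
import importlib.util

# Modules imported by the script, taken from the list above.
REQUIRED_MODULES = [
    "os", "sys", "time", "math", "getopt", "smtplib",
    "logging", "logging.handlers", "requests", "urllib3",
]


def missing_modules(names):
    """Return the module names that cannot be located by this interpreter."""
    missing = []
    for name in names:
        try:
            if importlib.util.find_spec(name) is None:
                missing.append(name)
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a module has no spec.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If requests is reported as missing, it can typically be installed with pip for the same Python 3 interpreter that will run the script.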
The ICDx service monitor script can help you ensure that the ICDx collectors and forwarders are running, and it can alert you if they halt for some reason.
The ICDx help output:
./icdx_service_monitor.py
Name:
ICDxServiceMonitor.py - view/monitor configured and running services
Synopsis:
ICDxServiceMonitor.py [OPTION]...
Description:
View configured and running services. Monitor running services for
halting.
-l, --list [all|collector|forwarder]
all: list all configured services
collector: list configured collector services
forwarder: list configured forwarder services
-r, --running [<uuid>|all|collector|forwarder]
<uuid>: print service info if service with uuid is running
all: print service info of all running services
collector: print service info of all running collectors
forwarder: print service info of all running forwarders
-w, --watch [<uuid>|list|start]
<uuid>: Add/remove the service defined by uuid to the watch list
list: Display the currently defined list
start: monitor services (send to background with & to keep process running)
-h, --help
Print this help information
To use the ICDx service monitor script, first copy the script to a user directory, then configure the copy for your environment using a text editor of your choice (this example uses nano):
cp $SYMC_HOME/tools/icdx_service_monitor.py ./
nano ./icdx_service_monitor.py
The available configuration options are documented within the script. The options that must be changed/confirmed are:
# Domain and port used in ICDx access URL
icdxDomain = "localhost"
icdxPort = "443"
# Protocol used, either "http" or "https" ("https" by default)
icdxProtocol = "https"
# Key generated from ICDx for API access
apiKey = ""
Information on the API key can be found here: Managing ICDx API Keys
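As a quick sanity check on these values, the sketch below combines the three connection settings into a base URL. This is an assumption about how the settings fit together, not the script's actual code; build_base_url is a hypothetical helper, and the API paths the script calls are not shown here.

```python
# Values mirror the configuration variables shown above.
icdxDomain = "localhost"
icdxPort = "443"
icdxProtocol = "https"


def build_base_url(protocol, domain, port):
    """Hypothetical helper: combine the three settings into a base URL."""
    if protocol not in ("http", "https"):
        raise ValueError('icdxProtocol must be "http" or "https"')
    if not str(port).isdigit():
        raise ValueError("icdxPort must be numeric")
    return f"{protocol}://{domain}:{port}"


print(build_base_url(icdxProtocol, icdxDomain, icdxPort))
```

With the default values this prints https://localhost:443, which should match the address used to reach the ICDx web interface from the host running the script.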
By default, the script logs to the local syslog. To receive email alerts, the ICDx server must be able to connect directly to a mail server that accepts its connections, and you must modify the following configuration in the script:
# Email information for failure alerts
# Note: alert mail will not be attempted if mailHost value is empty
mailHost = ''
mailPort = 25
mailFrom = '[email protected]'
rcptTo = [ '[email protected]' ]
subject = 'ICDx Service Alert Monitor'
Each email address can be either a bare address or use the 'Display Name <[email protected]>' format.
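To illustrate how settings like these are commonly used, here is a minimal sketch built on the standard smtplib and email modules. It is not the script's own mail code; build_alert and send_alert are hypothetical names, and the example.com addresses are placeholders. The empty-host check mirrors the note in the configuration comments that mail is not attempted when mailHost is empty.

```python
import smtplib
from email.message import EmailMessage
from email.utils import parseaddr


def build_alert(mail_from, rcpt_to, subject, body):
    """Assemble an alert message from the configured mail settings."""
    msg = EmailMessage()
    msg["From"] = mail_from
    msg["To"] = ", ".join(rcpt_to)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_alert(mail_host, mail_port, msg):
    """Send msg through mail_host; skip sending when no host is configured."""
    if not mail_host:
        return False
    with smtplib.SMTP(mail_host, mail_port, timeout=10) as smtp:
        smtp.send_message(msg)
    return True


# Placeholder addresses; both accepted address formats are shown.
msg = build_alert(
    "ICDx Monitor <monitor@example.com>",
    ["admin@example.com"],
    "ICDx Service Alert Monitor",
    "Service not running",
)
print(parseaddr(str(msg["From"])))   # display-name format
print(parseaddr("admin@example.com"))  # bare address format
print(send_alert("", 25, msg))       # empty host: nothing is sent
```

Running this sketch against a real mail host before enabling alerts is one way to confirm that the ICDx server is allowed to relay through it.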
Once the configuration has been updated and saved, you can start to use the script. The ICDx service monitor script uses the UUID of the collector or forwarder to watch the service. You can obtain the UUIDs of the services in the ICDx web interface, or by using the script itself.
To show the UUIDs of all the running services:
./icdx_service_monitor.py --running all
If you do not want to list "all" services, you can also specify "collector" or "forwarder" to display just those running services.
You can also list all configured services, if you need a UUID of a service that is not yet running:
./icdx_service_monitor.py --list all
As before, you can use "collector" or "forwarder" rather than "all".
Now that you know how to find the ICDx service UUIDs, add them to the list of services monitored by the script with the "--watch" argument:
./icdx_service_monitor.py --watch <UUID>
The --watch option adds the UUID to the saved list, or removes it if it is already there. You can view the UUIDs currently in the list with the "list" argument to "--watch":
./icdx_service_monitor.py --watch list
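The add-or-remove behavior of --watch <UUID> can be pictured as a simple toggle. The sketch below is an illustration only, assuming the watch list is held as a plain list of UUID strings; toggle_watch is a hypothetical name, and how the script actually stores its list is not covered here.

```python
def toggle_watch(watch_list, uuid):
    """Add uuid if absent, remove it if present; return the action taken."""
    if uuid in watch_list:
        watch_list.remove(uuid)
        return "removed"
    watch_list.append(uuid)
    return "added"


watched = []
uuid = "7f229b90-6bb1-11eb-edd2-000000000003"  # sample UUID for illustration
print(toggle_watch(watched, uuid), watched)  # first call adds the UUID
print(toggle_watch(watched, uuid), watched)  # second call removes it again
```

Because the same command both adds and removes, it is worth running --watch list afterwards to confirm the list is in the state you expect.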
When you have completed the list of service UUIDs to watch, you can start the watch process to continually monitor those services. The script is not daemonized, so you will need to manually run the script in the background (with the & symbol):
./icdx_service_monitor.py --watch start &
To stop the running script, you can use the Linux killall command:
killall icdx_service_monitor.py
The script outputs information to syslog depending on its log level. The log level is Informational by default. The output looks like the following:
<date> <hostname> [INFO] icdx_service_monitor.py:watchMonitor:433 "Starting ICDxServiceMonitor monitoring process"
<date> <hostname> [INFO] icdx_service_monitor.py:watchMonitor:466 "Running Services monitor is empty, populating list..."
<date> <hostname> [WARNING] icdx_service_monitor.py:monitorAlert:299 "7f229b90-6bb1-11eb-edd2-000000000003 (my Email .cloud collector) not running, attempting to restart (1/8)"
<date> <hostname> [WARNING] icdx_service_monitor.py:monitorAlert:299 "ec8b9990-a2ba-11ea-cd9d-000000000001 (my CloudSOC collector) not running, attempting to restart (1/8)"
<date> <hostname> [WARNING] icdx_service_monitor.py:monitorAlert:299 "7f229b90-6bb1-11eb-edd2-000000000003 (my Email .cloud) restarted after 1 attempts, resetting counters."
<date> <hostname> [WARNING] icdx_service_monitor.py:monitorAlert:299 "ec8b9990-a2ba-11ea-cd9d-000000000001 (my CloudSOC collector) not running, attempting to restart (2/8)"
<date> <hostname> [WARNING] icdx_service_monitor.py:monitorAlert:299 "ec8b9990-a2ba-11ea-cd9d-000000000001 (my CloudSOC collector) restarted after 1 attempts, resetting counters."
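If you want to process these syslog entries with your own tooling, the sketch below shows one way to pull the level, function, UUID, and restart counter out of a line. The regular expressions are assumptions derived only from the sample output above, and parse_monitor_line is a hypothetical helper; the actual format may vary with your syslog configuration.

```python
import re

# Patterns inferred from the sample output above (assumptions).
LINE_RE = re.compile(
    r'\[(?P<level>[A-Z]+)\] (?P<file>\S+?):(?P<func>\w+):(?P<lineno>\d+) '
    r'"(?P<message>.*)"$'
)
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
ATTEMPT_RE = re.compile(r"\((\d+)/(\d+)\)$")


def parse_monitor_line(line):
    """Return a dict of fields from one log line, or None if it does not match."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    uuid = UUID_RE.search(fields["message"])
    attempt = ATTEMPT_RE.search(fields["message"])
    fields["uuid"] = uuid.group(0) if uuid else None
    # (current attempt, maximum attempts) when the message carries a counter.
    fields["attempt"] = tuple(map(int, attempt.groups())) if attempt else None
    return fields


sample = (
    '<date> <hostname> [WARNING] icdx_service_monitor.py:monitorAlert:299 '
    '"ec8b9990-a2ba-11ea-cd9d-000000000001 (my CloudSOC collector) '
    'not running, attempting to restart (2/8)"'
)
print(parse_monitor_line(sample))
```

Lines without a UUID, such as the startup message, yield None for the uuid and attempt fields rather than failing.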
If the email settings are configured correctly, the configured recipients will receive similar messages by email in addition to the syslog entries.