We are performing maintenance on the database (MSSQL/MySQL/Oracle) server that hosts the DX UIM database.
The server will be offline for an extended period of time.
What is the best practice for managing the DX UIM environment during this maintenance window?
Environment
DX UIM - Any version
Database server
data_engine
Primary and secondary hubs
Maintenance window
Cause
Environment Maintenance
Resolution
Primary Hub Maintenance Window options
If the DX UIM database server will be offline for an extended period of time, you can either shut down the DX UIM primary hub as well or leave it running. The right choice depends on the specifics of your environment and the maintenance being performed; both options are valid.
Considerations for each scenario are as follows:
If the primary hub is shut down, the QoS data and alarms will be queued up on the remote hubs (or on the robots connected to the primary hub itself) until the primary hub becomes available again.
This means you should ensure that each remote hub has sufficient disk space to store the data collected from its robots. It also means that no alerts will be processed, no NAS auto-operators will be executed, and no integrations (e.g. with other products or ticketing systems) will be operational; the DX UIM environment will effectively be down.
Once the maintenance is complete and the primary hub is brought back online, the queued QoS data will be retrieved from the remote hubs through the defined queues and processed, and all queued alerts will be processed.
It may be advisable to disable any auto-ticketing or similar integrations/automations until the flood of alerts is processed, as many of the alerts may be out of date.
If the primary hub is left up and running, the alarms will still be processed by NAS, but not inserted into the database, and the data will queue up on the primary hub waiting for the database to become available.
This means you will want to ensure that the primary hub has sufficient disk space to store the data collected from all the other hubs during the maintenance window.
Alerts will not be displayed in Operator Console (and in fact Operator Console will likely be down) but NAS can still process them, execute auto-operators, send email alerts, etc.
Some critical alerts will likely be generated by the data_engine probe because it cannot insert data into the database.
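
In both scenarios, the hub that holds the queue (each remote hub, or the primary hub) needs enough free disk space for the whole maintenance window. The following is a rough sizing sketch only, not a DX UIM tool: the message rate, average message size, safety factor, and the path being checked are all assumptions that you would replace with figures measured in your own environment.

```python
import shutil


def estimate_queue_bytes(messages_per_second: float,
                         avg_message_bytes: float,
                         outage_hours: float,
                         safety_factor: float = 1.5) -> int:
    """Rough estimate of the disk space queued data may need during an outage.

    All inputs are assumptions supplied by the caller; none of them are
    read from DX UIM.
    """
    outage_seconds = outage_hours * 3600
    return int(messages_per_second * avg_message_bytes
               * outage_seconds * safety_factor)


def has_enough_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem containing 'path' has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Hypothetical figures: 200 messages/s, 512 bytes each, 8-hour window.
    required = estimate_queue_bytes(200, 512, 8)
    # "." is a placeholder; point this at the volume that holds the hub queues.
    status = "OK" if has_enough_space(".", required) else "INSUFFICIENT"
    print(f"Estimated queue size: {required / 1024 ** 3:.2f} GiB - {status}")
```

Run the check on each hub that will be queuing data, using the planned length of the maintenance window plus a margin for overruns.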