Recommended stop/start order when OS patching a Performance Management environment

Article ID: 144962

Products

CA Infrastructure Management
CA Performance Management - Usage and Administration
DX NetOps

Issue/Introduction

I’m writing a document covering how our Performance Management (PM) servers are updated during the monthly Linux patching schedule. We can stage the servers so that they are patched in sequence, and I am looking for a best practice.

I want to ensure that the database (DB), Data Aggregator (DA), and Console servers are staged properly. The Data Collectors (DCs) don’t really require a particular sequence, but I plan to patch one DC per data center per stage.

What is the recommended order to bring the environment down and back up for regularly scheduled system patching at the OS level?

Environment

All supported Performance Management releases

Resolution

Follow the same path an upgrade follows so that the servers are stopped, patched, and started in the correct order.

Following this path helps minimize data loss on the Data Collector servers.

Before performing any patching, ensure that successful backups of the Data Repository database and the Performance Center MySQL netqosportal and em databases have been taken and safely set aside. Without these backups, a server failure during patching could be unrecoverable.
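As a pre-flight aid, a patching runbook could refuse to proceed when the expected backups are missing or stale. The following is only a minimal sketch: the function name, the 24-hour threshold, and the example paths are hypothetical and not part of the product. It checks only that each backup exists, is non-empty, and is recent. It does not prove that a backup can be restored.

```python
import os
import time


def stale_or_missing_backups(paths, max_age_hours=24, now=None):
    """Return the backup paths that are missing, empty, or older than max_age_hours.

    paths: locations where the site stores its backups (hypothetical; adjust locally).
    now:   epoch seconds to compare against; defaults to the current time.
    """
    now = time.time() if now is None else now
    problems = []
    for path in paths:
        try:
            info = os.stat(path)
        except FileNotFoundError:
            # Backup was never written or has been moved.
            problems.append(path)
            continue
        age_hours = (now - info.st_mtime) / 3600
        if info.st_size == 0 or age_hours > max_age_hours:
            problems.append(path)
    return problems


if __name__ == "__main__":
    # Hypothetical backup locations, for illustration only.
    expected = ["/backups/netqosportal.sql", "/backups/em.sql"]
    bad = stale_or_missing_backups(expected)
    if bad:
        print("Do not patch yet; check these backups:", bad)
```

An empty return value means every listed backup passed these basic checks.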

The recommended order, which follows the recommended upgrade path, is as follows.
  1. Stop the Data Aggregator, which is required before stopping the Data Repository database. Stop the Data Repository database. Patch the Data Repository cluster node(s) and restart the database.
  2. Patch the Data Aggregator server before restarting the Data Aggregator. Then restart the Data Aggregator.
  3. Stop the four Performance Center (PC) services and MySQL. Patch the server, then restart the Performance Center services if a reboot isn't performed or if they did not start automatically after the reboot. Note that this step can be done first, in this position, or last. Regardless of the choice, make sure users are aware that while the Data Aggregator and Data Repository are both down, they will be unable to do much in the UI until both are restarted and resynchronized with Performance Center.
  4. Stop the Data Collector(s) one at a time, patch them, and restart them. The faster they are restarted, the smaller the resulting data gap in reports.
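For sites that stage servers in a patching tool, the order above can be expressed as a list of stages. The sketch below is illustrative only: the role labels, host names, and the function `build_patch_stages` are assumptions, not product names or commands. It places the Data Repository, Data Aggregator, and Performance Center in the order given above, then spreads the Data Collectors so that each stage contains at most one Data Collector per data center. It does not stop or start any service. Remember that the Data Aggregator is stopped before the Data Repository stage begins and is restarted only after its own server is patched.

```python
from collections import defaultdict
from itertools import zip_longest

# Role order taken from steps 1-3 above; Data Collectors are handled separately.
ROLE_ORDER = ["data_repository", "data_aggregator", "performance_center"]


def build_patch_stages(hosts):
    """Return a list of (stage_name, [hostnames]) in patching order.

    hosts: iterable of (hostname, role, data_center) tuples.
    Role labels are hypothetical names chosen for this sketch.
    """
    by_role = defaultdict(list)
    collectors_by_site = defaultdict(list)
    for name, role, site in hosts:
        if role == "data_collector":
            collectors_by_site[site].append(name)
        elif role in ROLE_ORDER:
            by_role[role].append(name)
        else:
            raise ValueError(f"unknown role for {name}: {role}")

    stages = [(role, by_role[role]) for role in ROLE_ORDER if by_role[role]]

    # Take one Data Collector from each data center per stage.
    rounds = zip_longest(*collectors_by_site.values())
    for number, group in enumerate(rounds, start=1):
        stages.append((f"data_collector_{number}", [h for h in group if h is not None]))
    return stages


if __name__ == "__main__":
    inventory = [
        ("dc-east-1", "data_collector", "east"),
        ("dr-1", "data_repository", "east"),
        ("pc-1", "performance_center", "east"),
        ("dc-west-1", "data_collector", "west"),
        ("da-1", "data_aggregator", "east"),
        ("dc-east-2", "data_collector", "east"),
    ]
    for stage, members in build_patch_stages(inventory):
        print(stage, members)
```

Because Performance Center can be patched first, in this position, or last, a site that prefers a different position would move "performance_center" within the plan rather than change the Data Repository and Data Aggregator order.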