Planning a restoration process for Aria Automation to prepare for unforeseen issues
Advice on how to minimize downtime in the event of a disaster
Environment
VMware Aria Automation 8.x
VMware Aria Orchestrator 8.x
Resolution
No single recommendation covers every scenario, but the following general advice helps ensure continued service:
Always make sure you have backups of all production machines. Snapshots and storage redundancy are not backups. Regularly test restoring from your backups to confirm that the restore works correctly.
This is general advice, not just for VMware / Broadcom products.
The product documentation includes this guide, which uses Site Recovery Manager (SRM) as an example:
As a best practice, back up the secondary nodes before the primary node, and take all backups within 40 seconds of each other (as close to simultaneous as possible). For more information, please see this article:
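The ordering and timing guidance above (secondary nodes first, all backups within 40 seconds) can be checked after each backup run. The following is a minimal sketch, not a VMware tool: the `NodeBackup` record, the node names, and the `check_backup_set` helper are hypothetical, and it assumes you can obtain each node's backup start time from your own backup software.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# 40-second window taken from the guidance above.
MAX_SPREAD = timedelta(seconds=40)


@dataclass(frozen=True)
class NodeBackup:
    """Hypothetical record of one node's backup (not a VMware data structure)."""
    node: str
    is_primary: bool
    started_at: datetime


def check_backup_set(backups, max_spread=MAX_SPREAD):
    """Return a list of problems; an empty list means the set follows the guidance."""
    problems = []
    primaries = [b for b in backups if b.is_primary]
    if len(primaries) != 1:
        problems.append(f"expected exactly one primary, found {len(primaries)}")
        return problems
    primary = primaries[0]

    # Secondary nodes should be backed up before (or together with) the primary.
    for b in backups:
        if not b.is_primary and b.started_at > primary.started_at:
            problems.append(f"{b.node} was backed up after primary {primary.node}")

    # All backups should start within the allowed window of each other.
    times = [b.started_at for b in backups]
    spread = max(times) - min(times)
    if spread > max_spread:
        problems.append(
            f"backups span {spread.total_seconds():.0f}s, "
            f"limit is {max_spread.total_seconds():.0f}s"
        )
    return problems


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 2, 0, 0)
    backup_set = [
        NodeBackup("node-b", False, t0),
        NodeBackup("node-c", False, t0 + timedelta(seconds=10)),
        NodeBackup("node-a", True, t0 + timedelta(seconds=25)),
    ]
    print(check_backup_set(backup_set))
```

A non-empty result suggests the backup set may not be consistent across nodes and is worth retaking before you rely on it for a restore.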
For production-down issues, an active support contract entitles you to raise a P1 case, and we will respond within 30 minutes to restore service as soon as possible.
If Aria Automation has an issue that cannot be easily resolved, the most general advice is to run the following script once on any node.
Please note that this script tears down the entire system on all nodes and then rebuilds it. It takes about 30 minutes to complete, and the GUI is unavailable during that time.