Monitoring Aria Automation image based backups for proper quiesced state
Log Message extraction using various methods
Aria Automation 8.x
Introduction
Guidelines for backing up VMware Aria Automation using an image- or snapshot-based backup system.
Note: The main purpose of this document is to validate that the snapshots are synchronized across all nodes. Currently, snapshots are created successfully even if they were not all taken within the time limit of ~40 seconds.
Documentation link: VMware Aria Automation 8.x Preparations for Backing Up
Image above shows high level diagram for Aria Automation snapshot quiesce backup
The Aria Automation appliance has VMware Tools installed and configured with “vmbackup” enabled, which is required to execute the freeze scripts. Additional information can be found in the following documentation links: VMware Tools Services, Exclude Specific File Systems from Quiesced Snapshots, and Enabling Quiescing for Linux VMs.
Aria Automation VMware Tools Configuration and Freeze scripts
1. In the VMware Tools configuration file (/etc/vmware-tools/tools.conf), script execution is enabled:
[vmbackup]
execScripts=true
Note: With this enabled, any scripts in the directory /etc/vmware-tools/backupScripts.d will be invoked.
2. On the Aria Automation appliances, the out-of-the-box script is:
90-freeze-data -> /opt/scripts/freezer-control.py
Note: freezer-control.py calls freezer-server.py, which validates snapshot consistency across all nodes. If all nodes freeze within the allotted time, the script returns 0 for success and the quiesced snapshot proceeds. If the sync fails, the script still returns an exit code of 0: the freeze ends and the snapshot is taken anyway. In other words, the script currently only logs the error and does not report a failed freeze when a “failed sync” occurs. The errors are recorded in the journal logs, and this document shows how to extract them from the appliances or from a logging system.
Simple flow diagram showing if the freeze script exits with 0 or 1.
The images below show timelines of real examples of snapshot execution for Aria Automation nodes.
Validating Aria Automation Freeze Sync
There are a few ways to get the logging information from the Aria Automation appliances: either directly from the appliance journal logs, or from a logging application such as VMware Log Insight to which the logs are forwarded. On an appliance, the journal can be queried with the following command:
journalctl --identifier=vmtoolsd --no-pager --output=json --since='YYYY-MM-DD HH:MM:SS'
Note: The only parameter passed into this command is the since date, which pulls the data from that start time until the current time. The returned entries are then searched for the following messages:
freeze synchronization failed
sync failed, making inconsistent snapshot
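The search described above can be sketched in Python as follows. This is a minimal illustration, not the code from the example zip file: it assumes the journalctl output has been captured as text with one JSON record per line (the layout produced by --output=json), and the function name find_sync_failures is hypothetical.

```python
import json

# The two failure messages listed above.
FAILURE_MARKERS = (
    "freeze synchronization failed",
    "sync failed, making inconsistent snapshot",
)


def find_sync_failures(journal_output):
    """Return the journal messages that contain one of the failure markers."""
    failures = []
    for line in journal_output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON record
        if not isinstance(entry, dict):
            continue
        message = entry.get("MESSAGE")
        if not isinstance(message, str):
            continue  # journald can emit non-text payloads as arrays or null
        lowered = message.lower()
        if any(marker in lowered for marker in FAILURE_MARKERS):
            failures.append(message)
    return failures


if __name__ == "__main__":
    sample = "\n".join([
        json.dumps({"MESSAGE": "freeze started"}),
        json.dumps({"MESSAGE": "sync failed, making inconsistent snapshot"}),
    ])
    print(find_sync_failures(sample))
```

An empty result means neither failure message was logged since the start time. The match is case-insensitive here as a precaution; that is a choice of this sketch, not something the original scripts are known to do.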
Note: The example scripts described in the following sections can be modified and plugged into a backup solution to provide a retry method or to mark the backup image as a failed sync. They can be found in the zip file Query for Aria Automation Journal logs examples.zip
Guest Scripting (PowerShell, Python & vRO)
Note: These are example scripts to extract the log messages directly from the Aria Automation appliances.
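As a rough illustration of the retry idea, the sketch below wraps a backup call and a sync check in a retry loop. Both callables are hypothetical placeholders: take_snapshot stands in for whatever triggers the backup and returns its start time, and sync_failed stands in for a log check like the ones in this document. Neither is part of any backup product API.

```python
import time


def snapshot_with_retry(take_snapshot, sync_failed, max_attempts=3, wait_seconds=0):
    """Take a snapshot, retrying while the freeze sync check reports a failure.

    take_snapshot: hypothetical callable that triggers the backup and
        returns the time it was started.
    sync_failed: hypothetical callable that takes that start time and
        returns True when a failed sync was logged.
    Returns a tuple (attempts_used, consistent).
    """
    for attempt in range(1, max_attempts + 1):
        started = take_snapshot()
        if not sync_failed(started):
            return attempt, True
        if attempt < max_attempts:
            time.sleep(wait_seconds)  # give the nodes time to settle before retrying
    return max_attempts, False
```

When the second element of the result is False, the backup solution could mark the last image as a failed sync instead of discarding it.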
This script is written in Python and uses the pyvmomi module from PyPI to connect to vCenter. It then uses StartProgramInGuest to execute the command line above and extracts the required logs for evaluation in the script.
This script is written in PowerShell and uses the PowerCLI module to connect to vCenter and then execute Invoke-VMScript.
This script is a vRO workflow and uses the Guest Script Manager workflow “Run Script in Guest”.
args = {
"host": "vcenterhost", # vCenter FQDN
"user": "username@domain", # vCenter Username
"password": "password", # vCenter Password
"port": 443, # vCenter Port
"disable_ssl_verification": True, # for self signed certs
"guestUsername": "root", # guest vm username
"guestPassword": "password", # guest vm password
"vmname": "vRA vm name", # vRA vm name
"startDate": "YYYY-MM-DD HH:MM:SS" # Date/time snapshot was invoked
}
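The sketch below shows one way the Python arguments above might feed into StartProgramInGuest. It is an illustration under assumptions rather than the contents of the example script: the output file path /tmp/vmtoolsd-journal.json and both function names are hypothetical, the service_instance and vm objects are assumed to come from an existing pyVmomi connection and VM lookup, and how the guest parses the quoted arguments string should be verified in your environment.

```python
import shlex


def build_journal_command(start_date, output_file="/tmp/vmtoolsd-journal.json"):
    """Return (program_path, arguments) for a guest ProgramSpec.

    The journalctl output is redirected to a file (hypothetical path)
    because StartProgramInGuest does not return the program's stdout.
    """
    journal_cmd = (
        "journalctl --identifier=vmtoolsd --no-pager --output=json "
        "--since=" + shlex.quote(start_date) + " > " + shlex.quote(output_file)
    )
    return "/bin/bash", "-c " + shlex.quote(journal_cmd)


def start_journal_query(service_instance, vm, args):
    """Start the journalctl query in the guest and return the guest process ID."""
    # Imported here so build_journal_command can be used without pyVmomi installed.
    from pyVmomi import vim

    auth = vim.vm.guest.NamePasswordAuthentication(
        username=args["guestUsername"], password=args["guestPassword"]
    )
    program_path, arguments = build_journal_command(args["startDate"])
    spec = vim.vm.guest.ProcessManager.ProgramSpec(
        programPath=program_path, arguments=arguments
    )
    content = service_instance.RetrieveContent()
    process_manager = content.guestOperationsManager.processManager
    return process_manager.StartProgramInGuest(vm=vm, auth=auth, spec=spec)
```

Once the guest process has finished, the output file would still need to be copied out of the guest, for example through the guest file manager's file transfer operations, before the messages can be evaluated.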
# vCenter FQDN
$vcHost = ""
# $vcHost = $args[0]
# vCenter Username
$vcUsername = ""
# $vcUsername = $args[1]
# vCenter Password
$vcPassword = ""
# $vcPassword = $args[2]
# Aria VM name
$vraHost = ""
# $vraHost = $args[3]
# Aria VM Guest Username
$guestUser = ""
# $guestUser = $args[4]
# Aria VM Guest Password
$guestPassword = ""
# $guestPassword = $args[5]
# Start Time format structure "YYYY-MM-DD HH:MM:SS"
# This should be the time snapshot started
$startTime = ""
# $startTime = $args[6]
In addition, the vCenter Server must be added to vRO using the “Add a vCenter Server instance” workflow, and it must show up in the inventory section.
The script that invokes the query requires two inputs: vmname and startdate.
Log Insight
Aria Automation Integration
To validate the Log Insight integration, SSH into one of the Aria Automation nodes and run the following at the command prompt:
vracli vrli
Output Example:
$ vracli vrli
{
"agentId": "0",
"environment": "prod",
"host": "fqdn.xxx.local",
"port": 9543,
"scheme": "https",
"sslVerify": false
}
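If this check is automated, the command output can be parsed rather than read by eye. The following is a small sketch that assumes the output has already been captured as text and looks like the example above; the function name parse_vrli_config and the rule "configured means host and port are present" are assumptions of this sketch.

```python
import json


def parse_vrli_config(cli_output):
    """Return the Log Insight settings as a dict, or None if none are configured."""
    start = cli_output.find("{")
    end = cli_output.rfind("}")
    if start == -1 or end < start:
        return None  # no JSON object in the output
    try:
        config = json.loads(cli_output[start:end + 1])
    except json.JSONDecodeError:
        return None
    # Treat the integration as configured only when a host and port are present.
    if not isinstance(config, dict) or not config.get("host") or not config.get("port"):
        return None
    return config
```

The returned host and port identify which Log Insight instance the node forwards to, which is the value needed when querying Log Insight for that node's logs.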
If the command does not return a configuration, see the documentation to configure it: Configure log forwarding to Log Insight
Direct API Scripting query into Log Insight
Currently, the getvRALogsFromLi.py script has a section in the main function where the credentials and parameters are passed in. Examples of the values that need to be replaced are shown below. The script has not yet been turned into a command-line script, because it is not yet clear where it will be executed. It uses the Python requests module to talk to Aria Log Insight and extract the logs for a given vRA node and date range. No way was found to extract the data without authentication, so you will need to assign a user the permissions to authenticate with Log Insight, most likely using vIDM as the provider.
args = {
# Log Insight host and port; identify which Log Insight host the vRA nodes point to
"host": "fqdn.com",
"port": 9543,
# Log Insight Service account & Password for backup service
"username": "<username>",
"password": "<password>",
"provider": "Local", # The provider can be "Local","ActiveDirectory" or "vIDM"
# Aria Automation Node
"vranode": "fqdn.local",
# Date Range to search
"starttime_str": "YYYY-MM-DD HH:MM:SS",
"endtime_str": "YYYY-MM-DD HH:MM:SS"
}
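To show how these arguments could map onto a query, the sketch below builds a login payload and an events URL using only the standard library. It is not taken from getvRALogsFromLi.py. The /api/v1/sessions and /api/v1/events paths and the constraint syntax are assumptions based on the Log Insight REST API v1 and should be checked against the API reference for your version. The timestamps are assumed to be UTC, and all function names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def to_epoch_ms(time_str):
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed to be UTC) to epoch milliseconds."""
    parsed = datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


def build_session_payload(args):
    """Body for the login request (assumed endpoint: POST /api/v1/sessions)."""
    return {
        "username": args["username"],
        "password": args["password"],
        "provider": args["provider"],
    }


def build_events_url(args, limit=1000):
    """Events query URL for one vRA node and time range (assumed path syntax)."""
    base = "https://{}:{}/api/v1/events".format(args["host"], args["port"])
    constraints = [
        ("hostname", "CONTAINS " + args["vranode"]),
        ("timestamp", ">=" + str(to_epoch_ms(args["starttime_str"]))),
        ("timestamp", "<=" + str(to_epoch_ms(args["endtime_str"]))),
    ]
    # Each constraint becomes a /field/operator-and-value path segment, URL-encoded.
    path = "".join(
        "/{}/{}".format(field, quote(value, safe="")) for field, value in constraints
    )
    return "{}{}?limit={}".format(base, path, limit)
```

The session ID returned by the login request would then be sent with the events request as the authorization token, and the returned events searched for the two failure messages.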
Troubleshooting log messages
SSH into one of the vRA nodes with root access
{
"agentId": "0",
"bufferFlushThreadCount": 16,
"environment": "cavadev",
"host": "fqdn.com",
"port": 9543,
"requestHttpCompress": false,
"requestImmediateRetries": 3,
"requestMaxSize": 256000,
"requestTimeout": 30,
"scheme": "https",
"sslVerify": false
}
journalctl --identifier=vmtoolsd --no-pager -f
This tails the vmtoolsd journal entries, so new messages appear as they are logged.
a. Go to the Explore Logs tab and set up a filter as shown, changing the hostname to your vRA node.
Example of a successful Freeze Sync:
Example of a failed Freeze Sync: