Failure in Aria Automation Snapshot Creation in Aria Suite Lifecycle due to Datastore Validation and Health Check Issues
Article ID: 381828
Products
VMware Aria Suite
Issue/Introduction
Creating a VMware Aria Automation snapshot from Aria Suite Lifecycle fails during the "Product health check" step. The health check reports that one of the Aria Automation nodes is not in a healthy state, which prevents the snapshot from being created.
The datastore validation pre-check passes, but the overall snapshot pre-check fails because the health check reports a problem on a specific Aria Automation node.
The following messages appear in the Precheck Report:
VMware Aria Automation node xxxxxx.xxxxx.xxx health is not ok.
Please make sure that VMware Aria Automation is in healthy state.
The following events appear in the /var/log/vrlcm/vmware_vrlcm.log file:
2024-11-08 06:25:30.642 INFO [http-nio-8080-exec-7] c.v.v.l.r.u.GenericPrevalidationReportCreator - -- Message id: SNAPSHOT_PRE_CHECK_HEALTH_CHECK, parameters: [ ], default: Product health check
2024-11-08 06:25:30.642 INFO [http-nio-8080-exec-7] c.v.v.l.r.u.GenericPrevalidationReportCreator - -- Translated message Product health check from key SNAPSHOT_PRE_CHECK_HEALTH_CHECK with params [ ]
2024-11-08 06:25:30.642 INFO [http-nio-8080-exec-7] c.v.v.l.r.u.GenericPrevalidationReportCreator - -- LOG Validation element: { "id" : null, "checkName" : "VMware Aria Automation health check", "checkType" : "ERROR", "status" : "FAILED", "recommendations" : [ "Please make sure that VMware Aria Automation is in healthy state." ], "resultDescription" : "VMware Aria Automation node xxxxxx.xxxxx.xxx health is not ok.", "elementType" : "CHECK", "childElements" : null, "localizedCheckNameId" : null, "localizedRecommendationsIds" : null, "recommendationParams" : null, "localizedDescriptionId" : null, "descriptionParameter" : null }
2024-11-08 06:25:30.643 INFO [http-nio-8080-exec-7] c.v.v.l.r.u.GenericPrevalidationReportCreator - -- Message id: null, parameters: [ ], default: VMware Aria Automation health check
2024-11-08 06:25:30.643 INFO [http-nio-8080-exec-7] c.v.v.l.r.u.GenericPrevalidationReportCreator - -- Message id: null, parameters: null, default: VMware Aria Automation node xxxxxx.xxxxx.xxx health is not ok.
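To find out which node the health check is flagging, the log can be filtered for the "health is not ok" message. The following is a minimal sketch, assuming the message wording matches the log lines above; the function name `unhealthy_nodes` is hypothetical and not part of any VMware tooling.

```shell
#!/bin/sh
# Print the unique node names that the pre-check reported as unhealthy.
# Reads log text on stdin so it can be tried on a copy of the log.
# Assumption: the message format is exactly as shown in this article.
unhealthy_nodes() {
  sed -n 's/.*VMware Aria Automation node \([^ ]*\) health is not ok.*/\1/p' | sort -u
}

# Example usage on the Aria Suite Lifecycle appliance:
# unhealthy_nodes < /var/log/vrlcm/vmware_vrlcm.log
```

Each node name is printed once, even though the same failure is logged on several lines.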
Cause
Snapshot creation fails because the Product health check reports an error on a specific Aria Automation node. The datastore validation pre-check itself passes; the health check fails because free space in the /home partition on the affected node is limited.
Resolution
Verify Health of Aria Automation Components:
Confirm whether Inventory Sync for Aria Automation is successful in Aria Suite Lifecycle. Typically, Inventory Sync will complete without any issues.
SSH into an Aria Automation node and run the following commands to verify that all nodes and pods are healthy:
kubectl get nodes
kubectl get pods -n prelude
vracli service status
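The output of the two kubectl commands above can be scanned for anything that is not healthy instead of being read by eye. The sketch below assumes the default kubectl column layout (STATUS is the second column for nodes and the third for pods) and treats pods in the Running or Completed state as healthy; both function names are hypothetical.

```shell
#!/bin/sh
# Read `kubectl get nodes --no-headers` output on stdin and print every node
# whose STATUS is not exactly "Ready". Exits non-zero if any node is printed.
not_ready_nodes() {
  awk 'NF && $2 != "Ready" { print $1; bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Read `kubectl get pods -n prelude --no-headers` output on stdin and print
# every pod whose STATUS is neither "Running" nor "Completed" (an assumption
# about which states count as healthy). Exits non-zero if any pod is printed.
unhealthy_pods() {
  awk 'NF && $3 != "Running" && $3 != "Completed" { print $1, $3; bad = 1 }
       END { exit (bad ? 1 : 0) }'
}

# Example usage on an Aria Automation node:
# kubectl get nodes --no-headers | not_ready_nodes
# kubectl get pods -n prelude --no-headers | unhealthy_pods
```

No output and a zero exit status mean every listed node or pod passed the check. This does not replace vracli service status, which should still be reviewed.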
Remove Old Snapshots and Unneeded Files:
Delete any old snapshots through Aria Suite Lifecycle, and clear unnecessary log bundles from the /home partition on the reported node.
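To locate candidates for cleanup, it can help to list the largest files under /home first. This is a minimal sketch using standard find, du, sort, and head; the function name `largest_files` is hypothetical, and which files are safe to delete remains a manual decision.

```shell
#!/bin/sh
# List the largest regular files under a directory, largest first.
# Sizes are disk usage in KiB as reported by `du -k`.
# $1: directory to scan (default /home); $2: number of files to show (default 10).
largest_files() {
  dir="${1:-/home}"
  count="${2:-10}"
  # -xdev keeps the scan on the same file system as the starting directory.
  find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "$count"
}

# Example usage on the reported node:
# largest_files /home 20
```

The listing only reports sizes; it does not delete anything.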
Verify Disk Usage:
Run the vracli disk-mgr command on each node to check free space. In this case, logs from Aria Suite Lifecycle pointed to limited space on the /home partition (18% free space). Typically, free space on every partition of an Aria Automation node should be more than 25%.
Once insufficient disk space is confirmed, identify and remove unnecessary files (e.g., older log bundles) to free up space from the /home directory.
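The 25% guideline can also be checked with a small script. The sketch below uses standard `df -P` output rather than vracli disk-mgr, whose output format is not shown in this article; it treats free space as 100 minus the Capacity column, which is an approximation because df rounds that value. The function name `low_free_space` is hypothetical.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print each mount point whose free space
# is at or below a threshold percentage.
# $1: threshold in percent (default 25, the guideline given in this article).
low_free_space() {
  threshold="${1:-25}"
  awk -v min="$threshold" 'NR > 1 {
    used = $5
    sub(/%/, "", used)          # strip the percent sign from the Capacity column
    free = 100 - used
    if (free <= min) printf "%s %d%% free\n", $6, free
  }'
}

# Example usage on each Aria Automation node:
# df -P /home | low_free_space
# df -P | low_free_space 25
```

A partition at 82% capacity would be reported as `/home 18% free`, matching the situation described above; no output means every listed partition is above the threshold.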
Run Disk Manager and Snapshot Validation:
Run the vracli disk-mgr command on all three nodes and verify from the output that every partition has sufficient free space.
Once Aria Automation reports a healthy state and sufficient disk space is available, rerun the snapshot pre-check validation in Aria Suite Lifecycle.
The snapshot validation should now succeed.