Aria/VCF Operations Cluster Stuck in "Running/Waiting for Analytics" State Due to Full Root Partition

Article ID: 410759

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

In certain scenarios, the Aria/VCF Operations cluster may enter a "Running/Waiting for Analytics" state and fail to come fully online. Investigation traced this issue to the root partition being completely full, which prevents the cluster from coming online.

Environment

Aria Operations 8.x
VCF Operations 9.x

Cause

Running the df -h command showed that the root (/) partition was 100% full. To identify which directories were consuming the space, the command du -sh * was run from the root directory (/).
This revealed that the /var/tmp/alerts directory was rapidly growing in size. Further investigation showed that a Log File plugin had been configured in Outbound Settings within Aria/VCF Operations, which was generating large volumes of alert logs stored under /var/tmp/alerts.
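The two checks described above can be sketched as small shell helpers. This is an illustrative sketch only: the function names root_usage_pct and largest_dirs are hypothetical and not part of the product, and the du options assume GNU coreutils on the appliance.

```shell
#!/bin/sh
# Hypothetical helpers for the diagnosis described above; not a VMware tool.

# Print the usage percentage (digits only) of the filesystem holding a path.
root_usage_pct() {
    # df -P forces one line per filesystem; field 5 is the capacity, e.g. "100%".
    df -P "${1:-/}" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }'
}

# List the largest entries one level below a directory, biggest first.
# The directory's own total is part of the listing.
largest_dirs() {
    # -x stays on one filesystem, -k reports KiB so that sort -n works.
    du -xk -d 1 "${1:-/}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example usage on the affected node (uncomment to run):
# echo "Root partition usage: $(root_usage_pct /)%"
# largest_dirs / 10
# largest_dirs /var/tmp 10
```

Staying on one filesystem with -x keeps separately mounted partitions out of the listing, so only directories that actually consume root partition space are reported.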

Resolution

To resolve the issue and restore cluster functionality, the following steps were taken:
  1. Free Up Root Partition Space:
    • SSH to the affected Analytics node
    • Move the /var/tmp/alerts folder to a location with more available disk space using:
      mv /var/tmp/alerts /storage/db/
  2. Restart Cluster:
    • Log into the Admin UI of the primary node as admin (https://<Primary_Node_FQDN>/admin)
    • Bring the cluster offline.
    • Once the cluster is offline, bring it back online.
  3. Update Log File Plugin Configuration:
    • Log into the Product UI as admin (https://<Primary_Node_FQDN>/ui)
    • In the Left Panel, navigate to Configuration > Outbound Settings.
    • Edit the respective Log File plugin configuration to store logs in a new directory:
      /storage/db/alerts
      This ensures that future alerts are stored in a directory with adequate space, preventing a recurrence of this issue.
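The move in step 1 can be guarded with a free-space check before running mv. The following is a minimal sketch, assuming a POSIX shell; the function name relocate_dir is hypothetical, and only the two paths in the usage comment come from this article.

```shell
#!/bin/sh
# Hypothetical helper: move a directory under a new parent only if the
# destination filesystem has more free space than the directory occupies.
relocate_dir() {
    src=$1
    dest_parent=$2

    [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
    [ -d "$dest_parent" ] || { echo "missing destination: $dest_parent" >&2; return 1; }

    # Refuse to merge into an existing directory of the same name.
    target="$dest_parent/$(basename "$src")"
    [ ! -e "$target" ] || { echo "already exists: $target" >&2; return 1; }

    # Size of the source and free space at the destination, both in KiB.
    need_kb=$(du -sk "$src" | awk '{ print $1 }')
    free_kb=$(df -Pk "$dest_parent" | awk 'NR==2 { print $4 }')

    if [ "$free_kb" -le "$need_kb" ]; then
        echo "not enough space in $dest_parent: need ${need_kb}K, have ${free_kb}K" >&2
        return 1
    fi

    mv "$src" "$dest_parent"/
}

# Usage with the paths from step 1 (uncomment to run):
# relocate_dir /var/tmp/alerts /storage/db
```

Checking the destination first avoids filling a second partition while trying to free the root partition.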

Additional Information