Step 1: Validate Cluster Imbalance
Perform the following checks to identify imbalance and potential FSDB corruption:
A: Check Metrics
- Log in to the Primary Node Admin UI as admin.
- Navigate to the cluster metrics dashboard.
- Ensure Objects in Process and Metrics in Process are roughly equal across all nodes.
B: Check Disk Usage
- SSH into each node as root.
- Run:
df -h
- Validate that the /storage/db partition usage is consistent across nodes.
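The disk-usage comparison above can be scripted from a workstation instead of being read off by eye. The following is a minimal sketch and not part of the vendor procedure: the helper names `storage_db_pct` and `usage_spread` are hypothetical, and the node names in the usage comment are placeholders. It assumes `awk` is available locally and that `df -P` (the POSIX output format) works on each node.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not vendor tooling.

# Read `df -P` output on stdin and print the Capacity value (without
# the % sign) of the filesystem mounted at /storage/db.
storage_db_pct() {
    awk '$6 == "/storage/db" { sub(/%/, "", $5); print $5 }'
}

# Read one number per line on stdin and print max minus min.
usage_spread() {
    awk '{ v = $1 + 0
           if (NR == 1 || v < min) min = v
           if (NR == 1 || v > max) max = v }
         END { if (NR > 0) print max - min }'
}

# Example (node names are placeholders):
#   for node in node1 node2 node3; do
#       ssh "root@$node" df -P | storage_db_pct
#   done | usage_spread
```

A large spread in percentage points between the fullest and emptiest node suggests the imbalance described in this step. What counts as "large" is a judgment call that this article does not define.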
Step 2: Detect Corrupted FSDB Files
- SSH into each node as root.
- Run:
find /storage/db/vcops/data/ -type f -not -regex '.*/[2][0][2][0-9]_[0-9][0-9]_.*.dat' -and ! -regex '.*/[2][0][1][7-9]_[0-9][0-9]_.*.dat' -and ! -name '*dtr' -and ! -name 'mps_*'

- A corrupted FSDB file might look like this:
/usr/lib/vmware-vcops/data/01/01234/12345678_10_01234.dat
Note the unrealistic (future) timestamp in the file name: 12345678_10.
- A valid file looks like this:
/usr/lib/vmware-vcops/data/01/01234/2025_10_01234.dat
Note the realistic timestamp in the file name: 2025_10.
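To make the filter in the find command easier to follow, the sketch below applies roughly the same exclusions to a single path. It is an approximation under stated assumptions: the function name `is_suspect_fsdb_file` is hypothetical, and the shell glob patterns are applied to the base name only, so they are slightly stricter than the regular expressions passed to find.

```shell
#!/bin/sh
# Sketch only: approximates the exclusions in the find command with
# shell glob patterns applied to the file's base name.

# Return 0 (true) if the path looks suspect, 1 if it matches one of
# the expected naming patterns.
is_suspect_fsdb_file() {
    name=${1##*/}                                # strip the directory part
    case "$name" in
        202[0-9]_[0-9][0-9]_*.dat) return 1 ;;   # prefix 2020-2029
        201[7-9]_[0-9][0-9]_*.dat) return 1 ;;   # prefix 2017-2019
        *dtr)                      return 1 ;;   # excluded by -name '*dtr'
        mps_*)                     return 1 ;;   # excluded by -name 'mps_*'
    esac
    return 0
}

# Example:
#   is_suspect_fsdb_file /path/to/file.dat && echo "suspect"
```

Any file the find command prints is one that fails all four exclusions, which is why a file name with an out-of-range timestamp prefix shows up in its output.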
Step 3: Remediate Corruption and Rebalance
If corrupted files are found:
- Log in to the Primary Node Admin UI.
- Take the Cluster OFFLINE via Admin UI.
- Power off all Analytics VMs from the vSphere Client.
- Take a SNAPSHOT of all nodes for backup.
- Power on all Analytics VMs from the vSphere Client.
- Manually delete the identified corrupted .dat and related .cache files from affected nodes.
Refer: Corrupted FSDB files due to unrealistic timestamp - Aria Operations
- Bring the Cluster ONLINE from Admin UI.
- Allow some time for the system to auto-rebalance.
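Before the manual deletion step above, it can help to review exactly which files would be removed. The following is a minimal sketch, not the vendor's method: the function name `print_delete_plan` is hypothetical, it only prints `rm` commands and never deletes anything, and it assumes the paths contain no whitespace. It does not attempt to locate the related .cache files; follow the referenced article for those.

```shell
#!/bin/sh
# Sketch only: prints what would be deleted; it never deletes anything.

# Read candidate paths (one per line) on stdin. For each path that
# exists as a regular file, print an rm command for an operator to
# review. Other paths are reported on stderr and skipped.
print_delete_plan() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue               # ignore blank lines
        if [ -f "$path" ]; then
            printf 'rm -v -- %s\n' "$path"
        else
            printf 'skip (not a regular file): %s\n' "$path" >&2
        fi
    done
}

# Example (the list file name is a placeholder holding the saved
# output of the find command from Step 2):
#   print_delete_plan < /tmp/fsdb_suspects.txt
```

Printing the plan rather than executing it keeps the destructive step in the operator's hands, after the snapshot has been taken.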
Step 4: Manual Rebalancing (if auto-rebalance does not work)
If cluster balance is not restored automatically:
- Log in to the Primary Node Product UI as admin.
- Navigate to: Administration > Control Panel > Cluster Management
- Expand Actions and click Rebalance.

- Select the Rebalance Disk option.

- Wait up to 24 hours for the rebalancing process to complete.
- Re-validate that the following are now consistent across all nodes:
- /storage/db usage
- Objects in Process
- Metrics in Process