1. When trying to access the Aria Operations for Logs UIs, you get the "This site can't be reached" error.
2. The df -h output in an SSH session on the affected nodes shows /storage/core at 100% full.
3. You observe large buckets under /storage/core/loginsight/cidata/store when you run the following command from that directory:
du -hscx * 2>/dev/null | sort -h
4. In ui-runtime.log, you see WARN log entries with the error "Insufficient disk space for compaction..."
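To confirm the last symptom quickly, a minimal sketch along the following lines can count the matching entries. The function name count_compaction_warnings is hypothetical, and because this article does not state where ui-runtime.log is located, the log path is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: count lines in a log file that contain the
# compaction error text quoted in the symptoms above.
# $1 = path to ui-runtime.log (location not stated in this article).
count_compaction_warnings() {
    # -F matches the text literally; -c prints the number of matching lines.
    grep -c -F "Insufficient disk space for compaction" "$1"
}
```

A non-zero count on a node, together with /storage/core at 100%, matches the symptoms described here.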
Aria Operations for Logs 8.18.x
/storage/core is 100% full on one or more nodes due to large buckets.
Workaround:
1. SSH into each node as the root user and run the following command to identify which node has /storage/core at 100%.
df -h /storage/core
2. Run the following command to list bucket sizes, with the largest at the bottom.
du -hscx /storage/core/loginsight/cidata/store/* 2>/dev/null | sort -h
3. Change directory to /usr/lib/loginsight/application/sbin/.
cd /usr/lib/loginsight/application/sbin/
4. Run the following commands to manually delete buckets 100 at a time, then check the usage of /storage/core.
./bucket-tools --delete oldBucketsCount=100
df -h /storage/core
Note: The bucket-tools command above removes the oldest 100 buckets. You can increase the number if progress is still too slow after five or more runs. The script also stops the loginsight service and prompts you to confirm the command. Press y to continue.
5. Repeat step 4 until /storage/core usage is at or below 97%.
6. Restart the loginsight service on all nodes by running the following command. The UIs should be accessible after this step.
service loginsight restart
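The usage check between deletion runs can be sketched as a small helper. This is only an illustration: the function names usage_pct and needs_cleanup are hypothetical, and the 97% default comes from step 5 above. It deliberately does not invoke bucket-tools, since that script prompts for confirmation and stops the loginsight service.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print the
# usage percentage of the first listed filesystem as digits only.
usage_pct() {
    # Line 1 is the header; field 5 of line 2 is the capacity, e.g. "100%".
    awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed (exit status 0) while usage is still above
# the threshold, meaning another deletion run is needed.
# $1 = current usage in percent, $2 = threshold (defaults to 97).
needs_cleanup() {
    [ "$1" -gt "${2:-97}" ]
}
```

For example, after each run of step 4 you could evaluate needs_cleanup "$(df -P /storage/core | usage_pct)" and stop deleting buckets once it no longer succeeds.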
1. /storage/core usage of 97% is normal. For more details, see /storage/core partition is running out of available disk space on an Aria Operations for Logs virtual appliance.
2. After the UIs are back up, make sure the environment is sized correctly by following Sizing the VMware Aria Operations for Logs Virtual Appliance, which includes KB 332393 to scale up to XL, XXL, and XXXL sizes. You can Add a Worker Node to a VMware Aria Operations for Logs Cluster if needed. Make sure you are following VMware Aria Operations for Logs Configuration Limits.
3. If the 100% /storage/core condition moves to another node after the workaround, go to the UI, then go to Management > System Monitor > Statistics > Syslog Events Incoming Rate (Per Second). Check whether the ingestion rate is significantly unbalanced among the nodes. Also check whether Cisco ACI or NSX DFW exists in your environment and conflicts with the load balancer or VIP in Aria Operations for Logs. Refer to the following KBs for more details:
Aria Operations for Logs load balancer incompatible with NSX Distributed Firewall Protection
Cisco ACI based environment conflicting with the VIP in Aria Operations for Logs