Data collection that appears erratic or halted can be caused by problems with either the analytics or collector process running on an Aria Operations node. To troubleshoot issues related to these processes, log in to the vCenter Server and perform the following checks on the infrastructure of your environment.
- Check the read/write latency for the Datastore objects that are connected to Aria Operations VMs and ensure that values are under 10 milliseconds on average. Higher latency might slow the data processing or storage rate and cause data collection to become erratic. To address this problem, consider moving to faster storage.
- Check the CPU and memory usage commitment for the hosts on which Aria Operations VMs are deployed. If the CPU or memory on any host is over-committed, verify that the CPU ready and memory ballooning metrics are acceptable. To accommodate the CPU usage pattern of Aria Operations, the ideal ratio of host CPU to virtual CPU is one to one.
- Verify that disks for the Aria Operations VMs do not contain any snapshots. Unneeded snapshots can affect I/O performance and result in slower data processing or storage rates.
- Verify that the load average displayed by the top command does not exceed the number of CPUs allocated to the Aria Operations VM. A load average higher than the CPU count indicates high CPU contention.
- To verify that there are no network issues, check the metrics for packets dropped by the Aria Operations VMs.
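The load average and dropped packet checks above can also be scripted from inside the guest operating system. The following is a minimal sketch, not a VMware-provided tool. It assumes a Linux guest with Python 3 available and a readable /proc/net/dev file. The function names load_exceeds_cpus and parse_dropped_packets are hypothetical, and the script treats any of the 1-, 5-, or 15-minute load averages above the CPU count as a sign of contention.

```python
import os


def load_exceeds_cpus(load_averages, cpu_count):
    """Return True if any load average value is greater than cpu_count."""
    return any(load > cpu_count for load in load_averages)


def parse_dropped_packets(proc_net_dev_text):
    """Parse /proc/net/dev text into {interface: (rx_dropped, tx_dropped)}."""
    dropped = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        # Receive columns occupy indexes 0-7 and transmit columns 8-15;
        # "drop" is the fourth column in each group.
        dropped[name.strip()] = (int(fields[3]), int(fields[11]))
    return dropped


def main():
    cpus = os.cpu_count() or 1
    loads = os.getloadavg()  # 1-, 5-, and 15-minute load averages
    print(f"CPUs: {cpus}, load averages: {loads}")
    if load_exceeds_cpus(loads, cpus):
        print("Load average exceeds the CPU count: possible CPU contention")

    # Assumption: a Linux guest that exposes interface counters here.
    path = "/proc/net/dev"
    if os.path.exists(path):
        with open(path) as handle:
            for name, (rx, tx) in parse_dropped_packets(handle.read()).items():
                print(f"{name}: rx dropped={rx}, tx dropped={tx}")


if __name__ == "__main__":
    main()
```

The dropped packet counters are cumulative since the interface came up, so compare two readings taken some time apart rather than relying on a single value.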
Always verify that your Aria Operations configuration conforms to the sizing guidelines.
If you do not discover why data collection has stopped after performing all checks, contact VMware support.