Implications of incorrect sizing in Aria/vRealize Operations

Article ID: 370132

Products

VMware Aria Suite

Issue/Introduction

This article explains why Aria Operations appliances must be kept correctly sized, to prevent cluster instability and the other issues that incorrect sizing can cause.

  • Analytics nodes consist of a primary node, a primary replica node, and data nodes.
  • The analytics cluster must be sized based on the size of the environment.
  • Cloud Proxies are not considered part of the analytics cluster, as they do not participate in any type of data calculation or processing.

If the sizing guideline provides several configurations for the same number of objects, use the configuration that has the fewest nodes. For example, if the number of objects is 120,000, configure the cluster with four extra-large nodes instead of 12 large nodes.
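
The selection rule above can be sketched as a small helper. This is an illustrative sketch only: the `ClusterOption` type and the `fewest_nodes` function are hypothetical names, not part of Aria Operations, and the two options are simply the ones quoted in the example above. In practice the candidate configurations would come from the sizing guideline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterOption:
    """One candidate configuration for a given object count (hypothetical type)."""
    node_size: str   # label used by the sizing guideline, e.g. "extra-large"
    node_count: int  # number of analytics nodes listed for that size


def fewest_nodes(options):
    """Return the candidate configuration that uses the fewest nodes."""
    if not options:
        raise ValueError("at least one candidate configuration is required")
    return min(options, key=lambda option: option.node_count)


# The two configurations from the example above for 120,000 objects.
options = [
    ClusterOption("large", 12),
    ClusterOption("extra-large", 4),
]

choice = fewest_nodes(options)
print(f"{choice.node_count} x {choice.node_size}")  # prints "4 x extra-large"
```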

Aria Operations has a high CPU requirement. In general, the more physical CPU resources that you assign to the analytics cluster, the better the performance. The cluster will perform better if the nodes stay within a single socket.

Analytics nodes, witness nodes, and Cloud Proxies have various hardware requirements for virtual machines and physical machines.

Environment

VMware Aria Operations 8.x
VMware vRealize Operations 8.x
VMware vRealize Operations Manager 7.x
VMware vRealize Operations Manager 6.x

Cause

Possible implications of an incorrectly sized Aria/vRealize Operations deployment are as follows:

  • Slow user interface performance.
  • Cluster services crashing and restarting.
  • Missing adapter collection cycles, leading to gaps in collected data.
  • Adapters stopping collection.
  • Report generation failing.
  • Missing data in reports.
  • Upgrade issues.
  • Problems expanding the cluster.
  • Problems installing new management packs.
  • When the cluster is undersized, the high load from collecting resources can cause other side effects, such as the cluster being unable to connect to platform services, or nodes crashing and generating heap dumps. This can affect any other feature of the product.
  • When the cluster nodes are out of disk space, Aria/vRealize Operations (as of version 6.6.x) automatically stops the cluster to prevent database degradation. 

Resolution

Review the documentation below before making any changes to the Aria Operations cluster.

Based on that documentation, you might need to add more nodes, or simply resize the existing analytics node(s).

Note:

Resizing Aria Operations nodes requires the cluster to be offline, so bring the cluster offline from the Admin UI before you resize.

Before making any changes to the Aria Operations cluster, take snapshots. For reference, see the KB article How to take a Snapshot of VMware Aria Operations.

If you are using Aria Lifecycle Manager, you can follow the steps in Scale up VMware Aria Suite products.

 

The main considerations for sizing are the CPU and memory allocated to each analytics node.

  • Ensure that all analytics nodes have the same amount of CPU and memory allocated to their VMs.
  • Note that HA/CA configurations halve the total number of metrics/objects that are supported, because the data is replicated.
  • For an HA/CA cluster, use the following calculation:

    ( <Total nodes in cluster> * <Multi-Node Maximum objects per node> ) / 2

  • For a non-HA/CA cluster, use the following calculation:

    <Total nodes in cluster> * <Multi-Node Maximum objects per node> 

  • For a single-node deployment, use the value provided under Single-Node Maximum objects.
  • Focus mainly on objects rather than metrics, as a cluster is much more likely to be undersized because of its object count.
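
The two cluster calculations above can be expressed as a short sketch. The function names are hypothetical, and the per-node maximum of 10,000 objects used in the usage lines is a placeholder chosen only for illustration, not a value from the sizing guideline. Substitute the Multi-Node Maximum objects figure for your node size and version. For a single-node deployment, compare the object count against the Single-Node Maximum objects value directly instead.

```python
def supported_objects(total_nodes, max_objects_per_node, ha_enabled):
    """Object capacity of a multi-node analytics cluster.

    max_objects_per_node is the "Multi-Node Maximum objects per node" value
    taken from the sizing guideline for the chosen node size.
    """
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    capacity = total_nodes * max_objects_per_node
    # HA/CA replicates data, which halves the supported object count.
    return capacity // 2 if ha_enabled else capacity


def is_undersized(current_objects, total_nodes, max_objects_per_node, ha_enabled):
    """True when the current object count exceeds the cluster's capacity."""
    limit = supported_objects(total_nodes, max_objects_per_node, ha_enabled)
    return current_objects > limit


# Placeholder per-node maximum, NOT a figure from the sizing guideline.
PER_NODE_MAX = 10_000

print(supported_objects(4, PER_NODE_MAX, ha_enabled=False))      # 40000
print(supported_objects(4, PER_NODE_MAX, ha_enabled=True))       # 20000
print(is_undersized(25_000, 4, PER_NODE_MAX, ha_enabled=True))   # True
```

With these placeholder numbers, a four-node cluster that fits 25,000 objects comfortably without HA/CA becomes undersized once HA/CA is enabled, which is why the halving needs to be accounted for before turning the feature on.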

For stability, it is very important that the cluster is not undersized. Some of the symptoms frequently seen with incorrect sizing are listed under Cause above.

 

The main purpose of the analytics nodes is to perform data processing. However, they can also collect data using adapters/integrations. In some cases it is beneficial to offload collection to Cloud Proxies rather than collect data using the analytics nodes. This frees up memory and CPU cycles for data processing, particularly if you are nearing the upper limit of the current sizing.

Additional Information

Information about VMware Aria Operations' maximum supported nodes in the analytics cluster and other information related to High Availability can be found in the Sizing Guideline KB.