By default, VMware offers Extra Small, Small, Medium, Large, and Extra Large configurations during installation. You can size the environment according to the existing infrastructure to be monitored. When the VMware Aria Operations instance outgrows its current size, you must expand the cluster by adding nodes of the same size.
VMware Aria Operations Node | Cloud Proxy (RC) | Cloud Proxy (CP) | |||||||
Extra Small | Small | Medium | Large | Extra Large | Standard | Large | Small | Large | |
Objects and Metrics | |||||||||
Single-Node Maximum Objects | 350 | 5,000 | 15,000 | 22,000 | 50,000 | 6,000 (4) | 32,000 (4) | 8,000 (4) | 40,000 (4) |
Single-Node Maximum Collected Metrics (1) | 70,000 | 800,000 | 2,500,000 | 4,000,000 | 10,000,000 | 1,200,000 | 6,500,000 | 1,200,000 | 6,000,000 |
Maximum number of nodes in a cluster | 1 | 2 | 8 | 16 | 12 | 200 | |||
Multi-Node Maximum Objects Per Node | N/A | 3,000 | 8,500 | 18,000 | 44,000 | N/A | |||
Multi-Node Maximum Metrics Per Node | 700,000 | 2,000,000 | 3,000,000 | 7,500,000 | |||||
Maximum number of objects in a cluster | 350 | 6,000 | 68,000 | 288,000 | 528,000 | ||||
Maximum number of metrics in a cluster | 70,000 | 1,400,000 | 16,000,000 | 40,800,000 | 63,000,000 | ||||
Maximum number of objects in an extended cluster (2) | N/A | 6,600 | 74,800 | 316,800 | 580,800 | ||||
Maximum number of metrics in an extended cluster (2) | 1,540,000 | 17,600,000 | 44,880,000 | 69,300,000 | |||||
Configuration | |||||||||
vCPU | 2 | 4 | 8 | 16 | 24 | 2 | 4 | 2 | 4 |
Default Memory (GB) | 8 | 16 | 32 | 48 | 128 | 4 | 16 | 8 | 32 |
Maximum Memory (GB) (2) | N/A | 32 | 64 | 96 | 256 | 8 | 32 | N/A | |
vCPU: physical core ratio for data nodes (3) | 1 vCPU to 1 physical core at scale maximums | ||||||||
Network latency (5) | < 5 ms | < 200 ms | < 500 ms | ||||||
Network latency for agents (to VMware Aria Operations node or RC/CP) (5) | < 20 ms | ||||||||
Network bandwidth (Mbps) (6) | N/A | 25 | 80 | 15 | 60 | ||||
Datastore latency | < 10 ms, with possible occasional peaks up to 15 ms | ||||||||
IOPS | See the Sizing Guide Worksheet for details | ||||||||
Disk Space | See the Sizing Guide Worksheet for details | ||||||||
Other maximums | |||||||||
Maximum number of telegraf agents per node | N/A | 500 | 3,000 | ||||||
Maximum number of vCenter on a single collector | N/A | 25 | 50 | 100 | 120 | 25 | 50 | 25 | 100 |
Maximum number of the Service Discovery objects | 3,000 | ||||||||
Maximum number of concurrent users per node (7) | 10 | N/A | |||||||
Maximum certified number of concurrent users (8) | 300 | ||||||||
Maximum number of concurrent API calls per client | 50 | ||||||||
Maximum number of concurrent API calls per node | 300 |
Continuous Availability (CA) allows the cluster nodes to be stretched across two fault domains, with the ability to experience up to one fault domain failure and to recover without causing cluster downtime. CA requires an equal number of nodes in each fault domain and a witness node, in a third site, to monitor split brain scenarios.
VMware Aria Operations Node | Small | Medium | Large | Extra Large |
---|---|---|---|---|
Maximum number of nodes in each Continuous Availability fault domain (*) | 1 | 4 | 8 | 6 |
* Each Continuous Availability cluster must have one witness node, which requires 2 vCPUs and 8 GB of memory.
Network requirement | Between fault domains | Between witness node and fault domains |
---|---|---|
Latency | < 10 ms, with peaks up to 20 ms during 20-second intervals | < 30 ms, with peaks up to 60 ms during 20-second intervals |
Packet loss | Peaks up to 2% during 20-second intervals | Peaks up to 2% during 20-second intervals |
Bandwidth | 10 Gbit/s | 10 Mbit/s |
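The Continuous Availability rules above (equal node counts in both fault domains, a per-size maximum per fault domain, and one witness node) can be sketched as a small validation helper. This is only an illustrative sketch: the function and dictionary names are hypothetical and are not part of any VMware tool or API. The limits are copied from the table above.

```python
# Maximum nodes per fault domain, copied from the Continuous Availability table.
# All names in this sketch are hypothetical helpers, not a VMware API.
CA_MAX_NODES_PER_FAULT_DOMAIN = {
    "small": 1,
    "medium": 4,
    "large": 8,
    "extra_large": 6,
}


def check_ca_layout(node_size, fd1_nodes, fd2_nodes):
    """Return a list of problems with a proposed CA layout (empty if none found)."""
    limit = CA_MAX_NODES_PER_FAULT_DOMAIN.get(node_size)
    if limit is None:
        return [f"node size '{node_size}' is not listed in the CA table"]
    problems = []
    if fd1_nodes != fd2_nodes:
        problems.append("fault domains must contain an equal number of nodes")
    if max(fd1_nodes, fd2_nodes) > limit:
        problems.append(
            f"at most {limit} {node_size} nodes are allowed per fault domain"
        )
    return problems


def total_vms(fd1_nodes, fd2_nodes):
    """Count the data nodes in both fault domains plus the single witness node."""
    return fd1_nodes + fd2_nodes + 1


print(check_ca_layout("medium", 4, 4))  # []
print(check_ca_layout("large", 8, 7))   # one problem: unequal fault domains
print(total_vms(4, 4))                  # 9
```

The helper checks only node counts; latency, packet loss, and bandwidth between sites still need to be measured against the network requirements listed above.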
To collect from larger vCenter Server instances with up to 65,000 objects, scale up a large Cloud Proxy (RC) to 8 vCPUs and 32 GB of RAM.
The collection process on a node supports adapter instances whose total number of objects does not exceed 3,000, 8,500, 18,000, and 44,000 on small, medium, large, and extra large multi-node VMware Aria Operations clusters, respectively. For example, a 4-node cluster of medium nodes supports a total of 34,000 objects. However, if a single adapter instance needs to collect 12,000 objects, a collector running on a medium node cannot support it, because a medium node can handle only 8,500 objects. In this situation, you can add a large cloud proxy (RC) and pin the adapter instance to it, or scale up to a configuration that supports more objects.
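The arithmetic in this example can be reproduced with a short script. It is a minimal sketch under stated assumptions: the helper names are hypothetical, the per-node limits are the multi-node values quoted above, and the Cloud Proxy (RC) limits are the single-collector maximums from the first table (without the scale-up option for larger vCenter Server instances).

```python
# Multi-node per-node object limits, as quoted in the paragraph above.
# Names are hypothetical; this is not a VMware API.
MULTI_NODE_MAX_OBJECTS_PER_NODE = {
    "small": 3_000,
    "medium": 8_500,
    "large": 18_000,
    "extra_large": 44_000,
}

# Cloud Proxy (RC) maximum objects, from the "Single-Node Maximum Objects" row.
CLOUD_PROXY_RC_MAX_OBJECTS = {
    "standard": 6_000,
    "large": 32_000,
}


def cluster_object_capacity(node_size, node_count):
    """Total objects a multi-node cluster of identical nodes can hold."""
    return MULTI_NODE_MAX_OBJECTS_PER_NODE[node_size] * node_count


def collectors_that_fit(adapter_objects):
    """List the collectors whose limit covers a single adapter instance."""
    candidates = []
    for size, limit in MULTI_NODE_MAX_OBJECTS_PER_NODE.items():
        if adapter_objects <= limit:
            candidates.append(f"{size} node")
    for size, limit in CLOUD_PROXY_RC_MAX_OBJECTS.items():
        if adapter_objects <= limit:
            candidates.append(f"{size} cloud proxy (RC)")
    return candidates


print(cluster_object_capacity("medium", 4))
# 34000
print(collectors_that_fit(12_000))
# ['large node', 'extra_large node', 'large cloud proxy (RC)']
```

The point the sketch illustrates is that cluster-wide capacity and per-collector capacity are separate checks: a medium cluster can hold 34,000 objects in total, yet no single medium node can host a 12,000-object adapter instance.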