This article explains how to use the sizing guidelines for VCF Operations 9.0 to determine the configuration to use during installation or after installation.
Notes:
You can check the recommended sizing by entering values in the Operations Sizing Tool or in the attached spreadsheet.
VCF Operations 9.0
VCF Operations offers five node sizes during installation: Extra Small, Small, Medium, Large, and Extra Large. Size the environment according to the existing infrastructure to be monitored. After the VMware Cloud Foundation Operations instance outgrows the existing size, you must expand the cluster by adding nodes of the same size.
|
VCF Operations Node |
Cloud Proxy |
Unified Cloud Proxy (UCP) |
|||||||
Extra Small |
Small |
Medium |
Large |
Extra Large |
Small |
Standard |
Small |
Standard |
||
Objects and Metrics |
||||||||||
Single-Node maximum objects |
700 |
10,000 |
30,000 |
44,000 |
100,000 |
16,000 (4) |
80,000 (4) |
16,000 (4) |
80,000 (4) |
|
Single-Node maximum collected metrics (1) |
140,000 |
1,600,000 |
5,000,000 |
8,000,000 |
20,000,000 |
2,400,000 |
12,000,000 |
2,400,000 |
12,000,000 |
|
Maximum number of nodes in a cluster |
1 |
2 |
8 |
16 |
12 |
1,000 (9) |
||||
Multi-Node maximum objects per node |
N/A |
6,000 |
17,000 |
36,000 |
88,000 |
N/A |
||||
Multi-Node maximum metrics per node |
1,400,000 |
4,000,000 |
6,000,000 |
15,000,000 |
||||||
Maximum number of objects in a cluster |
700 |
12,000 |
136,000 |
576,000 |
1,056,000 |
|||||
Maximum number of metrics in a cluster |
140,000 |
2,800,000 |
32,000,000 |
81,600,000 |
126,000,000 |
|||||
Maximum number of objects in an extended cluster (2) |
840 |
13,200 |
149,600 |
633,600 |
1,161,600 |
|||||
Maximum number of metrics in an extended cluster (2) |
168,000 |
3,080,000 |
35,200,000 |
89,760,000 |
138,600,000 |
|||||
Configuration |
||||||||||
vCPU |
2 |
4 |
8 |
16 |
24 |
2 |
4 |
4 |
8 |
|
Default memory (GB) |
8 |
16 |
32 |
48 |
128 |
8 |
32 |
16 |
48 |
|
Maximum memory (GB) (2) |
16 |
32 |
64 |
96 |
256 |
N/A |
||||
vCPU: physical core ratio for nodes (3) |
1 vCPU to 1 physical core at scale maximums |
|||||||||
Network latency (5) |
< 5 ms |
< 500 ms |
||||||||
Network latency for agents to nodes/Cloud Proxies (5) |
< 20 ms |
|||||||||
Network latency between nodes/Cloud Proxies and endpoints |
< 50 ms |
|||||||||
Network bandwidth (Mbps) (6) |
N/A |
15 |
60 |
80 |
200 |
|||||
Datastore latency |
< 10 ms, with possible occasional peaks up to 15 ms |
|||||||||
IOPS |
See the Sizing Guide Worksheet for details |
|||||||||
Disk space |
See the Sizing Guide Worksheet for details |
|||||||||
Log Forwarder |
||||||||||
Maximum logs per second traffic (eps) |
N/A |
20,000 |
40,000 |
|||||||
Maximum number of connections |
N/A |
300 |
600 |
|||||||
Other Maximums |
||||||||||
Maximum number of Telegraf agents per node |
N/A |
500 |
3,000 |
500 |
3,000 |
|||||
Maximum number of vCenter adapter instances on a single collector |
5 |
25 |
50 |
100 |
120 |
25 |
100 |
25 |
100 |
|
Maximum number of Service Discovery objects |
N/A |
3,000 |
||||||||
Maximum number of concurrent users per node (7) |
10 |
N/A |
||||||||
Maximum certified number of concurrent users (8) |
300 |
|||||||||
Maximum number of concurrent API calls per client |
50 |
|||||||||
Maximum number of concurrent API calls per node |
300 |
Continuous Availability (CA) allows the cluster nodes to be stretched across two fault domains, with the ability to experience up to one fault domain failure and to recover without causing cluster downtime. CA requires an equal number of nodes in each fault domain and a witness node, in a third site, to monitor split brain scenarios.
| | VCF Operations Node | | | |
|---|---|---|---|---|
| | Small | Medium | Large | Extra Large |
| Maximum number of nodes in each Continuous Availability fault-domain (*) | 1 | 4 | 8 | 6 |
* Each Continuous Availability cluster must have one witness node, which requires 2 vCPUs and 8 GB of memory.
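The node counts for a Continuous Availability cluster can be sketched as a small calculation. This is only an illustration under the rules stated above (two equal fault domains plus one witness node); the dictionary and function names are hypothetical and are not part of any VCF Operations API.

```python
# Maximum nodes per fault domain, copied from the table above.
MAX_NODES_PER_FAULT_DOMAIN = {
    "Small": 1,
    "Medium": 4,
    "Large": 8,
    "Extra Large": 6,
}
WITNESS_NODES = 1  # each CA cluster must have one witness node


def ca_total_nodes(size: str, nodes_per_fault_domain: int) -> int:
    """Total VMs in a CA cluster: two equal fault domains plus the witness."""
    limit = MAX_NODES_PER_FAULT_DOMAIN[size]
    if not 1 <= nodes_per_fault_domain <= limit:
        raise ValueError(f"{size} supports 1 to {limit} nodes per fault domain")
    return 2 * nodes_per_fault_domain + WITNESS_NODES


print(ca_total_nodes("Medium", 4))  # 9
```

The witness node is counted separately because it has its own, smaller resource requirement than the data nodes.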
| | Between fault-domains | Between witness node and fault-domains |
|---|---|---|
| Latency | < 10 ms, with peaks up to 20 ms during 20 sec intervals | < 30 ms, with peaks up to 60 ms during 20 sec intervals |
| Packet Loss | Peaks up to 2% during 20 sec intervals | Peaks up to 2% during 20 sec intervals |
| Bandwidth | 10 Gbits/sec | 10 Mbits/sec |
The collection process on a node supports adapter instances whose total number of objects does not exceed 700, 10,000, 30,000, 44,000, and 100,000 on Extra Small, Small, Medium, Large, and Extra Large multi-node VCF Operations clusters, respectively. For example, a 2-node cluster of Small nodes supports 2 x 6,000 objects, for a total of 12,000 objects. However, if a single adapter instance needs to collect 15,000 objects, the collector running on a Small node cannot support it, because the maximum object count supported by a Small node in a cluster is 6,000. The solution is either to use a Cloud Proxy and pin the adapter instance to it, or to scale up the cluster by using a node size that supports more objects.
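The cluster-versus-collector arithmetic in this example can be sketched as follows. This is a minimal illustration, not product code: the dictionary and function names are hypothetical, and the limits are copied from the multi-node rows of the sizing table.

```python
# Multi-node limits copied from the sizing table:
# size -> (maximum objects per node, maximum nodes in a cluster).
# Extra Small is omitted because it is limited to a single node.
MULTI_NODE_LIMITS = {
    "Small": (6_000, 2),
    "Medium": (17_000, 8),
    "Large": (36_000, 16),
    "Extra Large": (88_000, 12),
}


def cluster_capacity(size: str, nodes: int) -> int:
    """Total objects supported by a multi-node cluster of the given size."""
    per_node, max_nodes = MULTI_NODE_LIMITS[size]
    if not 2 <= nodes <= max_nodes:
        raise ValueError(f"{size} multi-node clusters support 2 to {max_nodes} nodes")
    return per_node * nodes


def adapter_fits_on_node(size: str, adapter_objects: int) -> bool:
    """True if one adapter instance fits on a single node's collector."""
    per_node, _ = MULTI_NODE_LIMITS[size]
    return adapter_objects <= per_node


print(cluster_capacity("Small", 2))            # 12000
print(adapter_fits_on_node("Small", 15_000))   # False
print(adapter_fits_on_node("Medium", 15_000))  # True
```

The two checks are kept separate on purpose: a cluster can have enough total capacity while a single large adapter instance still exceeds what one node's collector supports.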
By default, VCF Operations for logs virtual appliance uses the preset values for all configurations.
During deployment, you can change the appliance settings to meet the needs of the environment from which you intend to collect logs.
VCF Operations for logs provides preset virtual machine (VM) sizes that you can select to meet the ingestion requirements of your environment. These presets are certified size combinations of compute and disk resources, though you can add extra resources afterward. A small configuration is suitable only for demos.
To size virtual appliances to XL, XXL, and XXXL, see Vertical scaling in Aria Operations for Logs (Formerly vRealize Log Insight) 8.2 And Newer.
| | Node type | | |
|---|---|---|---|
| Preset Size | Small | Medium | Large |
| Log Ingestion Rate | 30 GB/day | 75 GB/day | 225 GB/day |
| Virtual CPUs | 4 | 8 | 16 |
| Memory | 8 GB | 16 GB | 32 GB |
| IOPS | 500 | 1,000 | 1,500 |
| Syslog Connections (Active TCP Connections) | 100 | 250 | 750 |
| Events per Second | 2,000 | 5,000 | 15,000 |
You can use a syslog aggregator to increase the number of syslog connections through which events are sent to VCF Operations for Logs. However, the maximum number of events per second is fixed and does not depend on the use of a syslog aggregator. A VCF Operations for logs instance cannot be used as a syslog aggregator. The sizing is based on the following assumptions.
Each virtual CPU is at least 2 GHz.
Each ESXi host sends up to 10 messages per second with an average message size of 170 bytes/message, which is roughly equivalent to 150 MB per day, per host.
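The per-host figure can be checked with a short calculation. The sketch below only restates the assumptions above; the function names are hypothetical, and the host count it derives is a rough illustration of the events-per-second ceiling, not a documented limit.

```python
# Assumptions stated in the text: each ESXi host sends up to 10 messages
# per second at an average of 170 bytes per message.
MESSAGES_PER_SECOND = 10
BYTES_PER_MESSAGE = 170
SECONDS_PER_DAY = 24 * 60 * 60


def host_bytes_per_day() -> int:
    """Daily log volume of one ESXi host under the stated assumptions."""
    return MESSAGES_PER_SECOND * BYTES_PER_MESSAGE * SECONDS_PER_DAY


def hosts_within_eps(events_per_second: int) -> int:
    """Hosts that fit under an events-per-second ceiling at 10 msg/s each."""
    return events_per_second // MESSAGES_PER_SECOND


print(host_bytes_per_day() / 1_000_000)  # 146.88 (decimal MB per day)
print(hosts_within_eps(2_000))           # 200
```

The result of about 147 MB per day is what the text rounds to 150 MB per day, per host.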
NOTE
For large installations, you must upgrade the virtual hardware version of the VCF Operations for logs virtual machine. VCF Operations for logs supports virtual hardware version 7 or later. Virtual hardware version 7 can support up to 8 virtual CPUs. Therefore, if you plan to provision 16 virtual CPUs, you must upgrade to virtual hardware version 8 or later for ESXi 7.x. You use the vSphere Client to upgrade the virtual hardware. If you want to upgrade the virtual hardware to the latest version, read and understand the information in the VMware knowledge base article Upgrading a virtual machine to the latest hardware version.
Use the Medium configuration, or larger, for the primary and worker nodes in a VCF Operations for logs cluster. The number of events per second increases linearly with the number of nodes. For example, in a cluster of 18 Large nodes (clusters must have a minimum of three nodes), the ingestion rate is 18 x 15,000, or 270,000 events per second (EPS), which is roughly 4 TB of events per day.
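The linear scaling can be sketched as a small calculation. This is a minimal illustration that assumes every node is a Large preset node; the constant and function names are hypothetical, and the per-node figures and node limits are taken from this article.

```python
# Per-node figures for the Large preset, copied from the sizing table above.
LARGE_EPS_PER_NODE = 15_000
LARGE_GB_PER_DAY_PER_NODE = 225
MIN_NODES, MAX_NODES = 3, 18  # cluster limits stated in this article


def cluster_throughput(nodes: int) -> tuple[int, int]:
    """Return (events per second, GB per day) for a cluster of Large nodes."""
    if not MIN_NODES <= nodes <= MAX_NODES:
        raise ValueError(f"clusters need {MIN_NODES} to {MAX_NODES} nodes")
    return nodes * LARGE_EPS_PER_NODE, nodes * LARGE_GB_PER_DAY_PER_NODE


eps, gb_per_day = cluster_throughput(18)
print(eps)         # 270000
print(gb_per_day)  # 4050, roughly 4 TB per day
```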
Use the Small configuration of the appliance in a proof-of-concept or test environment, but not in a production environment.
An estimator to help you determine sizing for VCF Operations for logs including calculation for network bandwidth and storage utilization is also available. This sizing estimator is intended for guidance only. Many environment inputs are site-specific, so the calculator necessarily uses estimations in some areas. See https://vrlisizer.broadcom.com.
NOTE
The overall performance of VCF Operations for logs might degrade if forwarders are defined against the text field with complex or multiple conditions involving regular expressions, for example text=~"Deleting the machine". In such cases, particularly when the overall load on the cluster is high, processing might be delayed, and disk blocks might accumulate on each node of the cluster.
| Item | Maximum |
|---|---|
| Node Configuration | |
| CPU | 16 vCPUs |
| Memory | 32 GB |
| Storage device (vmdk) | 2 TB - 512 bytes |
| Total addressable storage | 6 TB (+ OS drive). A maximum of 6 TB addressable log storage on Virtual Machine Disks (VMDKs) with a maximum size of 2 TB each. (1) |
| Number of syslog connections per node | 750 |
| Cluster Configuration | |
| Nodes | 18 (Primary + 17 Workers) |
| Virtual IP addresses | 60 |
| Ingestion | |
| Events per second | 15,000 eps per node |
| Syslog message length | 10 KB (text field) per log |
| Ingestion API HTTP POST request | 16 KB (text field); 4 MB per HTTP POST request |
| Integrations | |
| VCF Operations | 1 |
| vCenter | 15 per node |
| VMware SSO | 1 |
| Active Directory domains | 1 |
| Email servers | 1 |
| DNS servers | 2 |
| NTP servers | 4 |
| Forwarders | 10 |
| Index Partition Configuration | |
| Index partitions | 10 |
NOTE