Performance Insights - Configuration & Setup

VCF Operations

This article introduces the dependencies of Performance Insights and all of its configuration and setup options. VCF Ops 9.1 ships with Proactive Insights, a feature that automatically analyses clusters for performance bottlenecks in the background. It is enabled by default and scans clusters that match the following criteria:

1. Run ESXi 9.1
2. Belong to a VCF Instance that has Real Time Data Service configured

Performance Insights is also a configuration-driven service. This KB also introduces the configuration options for the service itself, such as disabling the feature altogether or excluding certain clusters from the scan.

You see a proactive insight on VCF Ops 9.1 that reads "Not enough data to run certain checks"

OR

You see a banner on the UI stating "vSAN proactive insights are not available on some clusters, as they are supported only on VCF clusters 9.1 or above with Real-time metrics component installed."

OR

You have certain dev/test clusters in the fleet that you want to exclude from proactive scans

Real Time Data Service

All 9.1 Proactive scans rely on metrics that come from Real Time Data Service and the vSAN Adapter.
The 9.1 release also includes a new check for the "Benchmarking for Throughput" workflow that relies on metrics from Real Time Data Service. Refer to the Real Time Data Service documentation to enable it. Once Real Time Data Service is configured and data starts flowing into Ops, the feature runs without the warnings for missing data.
The service relies on the following metrics from Real Time Data Service:

1. vSAN Metrics from ESXi Hosts

Performance → Network
- RDT Network Host Average Latency
- TCP Inbound Throughput
- TCP Outbound Throughput
Performance → DOM Owner
- Average Write Latency
  - Leaf Owner Write Latency
- Recovery Write IOPS
Performance → Middle Layer
- Average Write Latency
- Bypass Write Percentage

In addition, two stats are collected from the vSAN Adapter:

2. vSAN Metrics Collected against ESXi Hosts

3. vSAN Metrics Collected against Disk Groups (OSA Only)

If any of the above metrics are not being collected, the proactive insights engine shows an insight saying "Not enough data to run certain checks".
A default collection policy also lists the metrics that the Real Time Data Service should collect. Ensure that all the metrics defined above are checked in the policy.
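
When reviewing the collection policy, it can help to compare the metrics that are checked against the Real Time Data Service list above. The following is a minimal sketch of such a comparison in Python. The metric names are copied from this article, but the group keys, the `missing_metrics` function, and the shape of the `collected` input are hypothetical and are not part of VCF Operations. Listing "Leaf Owner Write Latency" under DOM Owner is also an assumption based on where it appears above.

```python
# Metric names are copied from the list in this article. The grouping keys
# and this helper are illustrative only; they are not a VCF Operations API.
REQUIRED_METRICS = {
    "Performance/Network": [
        "RDT Network Host Average Latency",
        "TCP Inbound Throughput",
        "TCP Outbound Throughput",
    ],
    "Performance/DOM Owner": [
        "Average Write Latency",
        "Leaf Owner Write Latency",  # assumed to belong to this group
        "Recovery Write IOPS",
    ],
    "Performance/Middle Layer": [
        "Average Write Latency",
        "Bypass Write Percentage",
    ],
}


def missing_metrics(collected):
    """Return {group: [metric, ...]} for required metrics absent from `collected`.

    `collected` maps a group name to the metric names checked in the policy.
    """
    missing = {}
    for group, names in REQUIRED_METRICS.items():
        present = set(collected.get(group, ()))
        absent = [name for name in names if name not in present]
        if absent:
            missing[group] = absent
    return missing
```

An empty result means every metric listed here is checked. Any entry that is returned is a candidate cause for the "Not enough data to run certain checks" insight.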

Proactive Insights Configuration

Proactive Insights is a configuration-driven service.
The service is on by default and scans every 9.1 cluster managed by Ops that runs in a VCF Instance with Real Time Data Service enabled.
However, you might not want to spend VCF Operations resources scanning clusters that run dev/test workloads, or you might want to disable the feature altogether. Controlling either setting requires creating and saving a file in Solutions Configuration.
Prepare such a file, for example:

proactive_monitoring_config.json

{
  "enabled": true,
  "disabledClusters": [
    "9aee963e-f370-41e1-####-############",
    "d25a060b-8e02-453a-####-############"
  ]
}
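
A file like the one above can also be generated rather than typed by hand, which avoids stray commas and malformed IDs. The following is a minimal sketch in Python. The `enabled` and `disabledClusters` keys come from this article. The `build_config` helper is hypothetical, and the UUID check assumes that cluster IDs are always UUID strings, as the masked samples suggest.

```python
import json
import uuid


def build_config(enabled, disabled_clusters):
    """Return the configuration as a JSON string, rejecting malformed input early."""
    if not isinstance(enabled, bool):
        raise TypeError("enabled must be a boolean")
    cleaned = []
    for cluster_id in disabled_clusters:
        # Assumption: cluster IDs are UUID strings. uuid.UUID raises
        # ValueError if the string is not a well-formed UUID.
        parsed = str(uuid.UUID(cluster_id))
        if parsed not in cleaned:  # drop duplicates, keep original order
            cleaned.append(parsed)
    return json.dumps({"enabled": enabled, "disabledClusters": cleaned}, indent=2)


if __name__ == "__main__":
    # Placeholder ID for illustration only; substitute real cluster IDs.
    print(build_config(True, ["11111111-2222-3333-4444-555555555555"]))
```

The printed JSON can then be pasted into the text area when the file is added in the UI.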

To add the file, go to Infrastructure Operations → Configurations → Configuration Files → Management Pack Configurations → Add.

In the UI, enter a file name that matches exactly: proactive_monitoring_config.json, set Containing Folder to User Defined, and select Free format.
Copy and paste the barebones structure into the text area. The screenshot below shows how the UI should look.

Setting enabled to false disables the whole feature. The disabledClusters array lists the IDs of the clusters that should be excluded from the scan. To find the IDs of such clusters,