Performance Insights - Configuration & Setup

Article ID: 420866

Updated On:

Products

VCF Operations

Issue/Introduction

  • This article introduces the dependencies of Performance Insights and all of its configuration and setup options. VCF Ops 9.1 ships with Proactive Insights, a feature that automatically analyses clusters for performance bottlenecks in the background. It is enabled by default and scans clusters that match both of the following criteria:
    1. Run ESXi 9.1
    2. Belong to a VCF Instance that has Real Time Data Service configured
  • Performance Insights is also a configuration-driven service. This KB also introduces the configuration options for the service itself, such as disabling the feature altogether or excluding specific clusters from the scan.
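
The two default criteria above can be read as a simple predicate. The following Python sketch is illustrative only: the function names are hypothetical, the inputs are assumed to be known already, and the version check treats 9.1 as a minimum, following the "9.1 or above" wording of the banner quoted under Cause.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '9.1.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def is_scanned_by_default(esxi_version: str, rtds_configured: bool) -> bool:
    """Return True when a cluster meets both default criteria.

    Assumption: '9.1' is treated as a minimum version, not an exact match.
    """
    return parse_version(esxi_version) >= (9, 1) and rtds_configured
```

For example, `is_scanned_by_default("9.0.2", True)` evaluates to `False`, because the version criterion is not met even though Real Time Data Service is configured.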

Environment

  • VCF Operations 9.1
  • VMware vSAN 9.1

Cause

  • You see a proactive insight on VCF Ops 9.1 that reads "Not enough data to run certain checks"

OR

  • You see a banner on the UI stating "vSAN proactive insights are not available on some clusters, as they are supported only on VCF clusters 9.1 or above with Real-time metrics component installed."

OR

  • You want to disable the Proactive Scan feature

OR

  • You have certain dev/test clusters in the fleet that you want to exclude from proactive scans

Resolution

Real Time Data Service

  • All 9.1 Proactive scans rely on metrics that come from Real Time Data Service and the vSAN Adapter.
  • The 9.1 release also includes a new check for the "Benchmarking for Throughput" workflow that relies on metrics from Real Time Data Service. Refer to Real Time Data Service to enable it. Once Real Time Data Service is configured and data starts flowing into Ops, the feature runs without the warnings for missing data.
  • The metrics that the service relies on from Real Time Data Service are:

1. vSAN Metrics from ESXi Hosts

  • Performance → Network
    • RDT Network Host Average Latency
    • TCP Inbound Throughput
    • TCP Outbound Throughput
  • Performance → DOM Owner
    • Average Write Latency
      • Leaf Owner Write Latency
    • Recovery Write IOPS
  • Performance → Middle Layer
    • Average Write Latency
    • Bypass Write Percentage
  • There are two stats that get collected from the vSAN Adapter:

2. vSAN Metrics Collected against ESXi Hosts

  • Performance → DOM Component Manager
    • Latency Average Write


3. vSAN Metrics Collected against Disk Groups (OSA Only)

  • Congestion → SSD Congestion

 

  • If any of the above metrics are not being collected, the proactive insights engine shows an insight saying "Not enough data to run certain checks".
  • There is also a default collection policy that lists the metrics that the Real Time Data Service should collect. Ensure that all the metrics defined above are checked in the policy.
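
The "Not enough data" rule above amounts to a set difference between the required metrics and the metrics actually being collected. The Python sketch below illustrates that idea. The metric names come from the lists above, but the (group, name) tuple layout, the `osa` flag, and the idea of passing in a pre-gathered collection are assumptions for illustration; this is not how the insights engine is implemented.

```python
from typing import Iterable, List, Set, Tuple

Metric = Tuple[str, str]  # (group, metric name): a hypothetical layout

# Host-level metrics named in this article (Real Time Data Service and vSAN Adapter).
HOST_METRICS: Set[Metric] = {
    ("Network", "RDT Network Host Average Latency"),
    ("Network", "TCP Inbound Throughput"),
    ("Network", "TCP Outbound Throughput"),
    ("DOM Owner", "Average Write Latency"),
    ("DOM Owner", "Leaf Owner Write Latency"),
    ("DOM Owner", "Recovery Write IOPS"),
    ("Middle Layer", "Average Write Latency"),
    ("Middle Layer", "Bypass Write Percentage"),
    ("DOM Component Manager", "Latency Average Write"),
}

# Disk group metric that the article marks as OSA only.
DISK_GROUP_METRICS_OSA: Set[Metric] = {("Congestion", "SSD Congestion")}


def missing_metrics(collected: Iterable[Metric], osa: bool = True) -> List[Metric]:
    """Return the required metrics that are absent from `collected`."""
    required = set(HOST_METRICS)
    if osa:
        # Assumption: the disk group metric is only expected on OSA clusters.
        required |= DISK_GROUP_METRICS_OSA
    return sorted(required - set(collected))
```

An empty result means every metric listed above is present; any returned entry is a candidate to check in the collection policy.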

Proactive Insights Configuration

  • Proactive diagnostics is a configuration-driven service.
  • The service is on by default and scans every 9.1 cluster managed by Ops that runs in a VCF Instance with Real Time Data Service enabled.
  • However, you might not want to spend VCF Operations resources on scanning clusters that run dev/test workloads, or you might want to disable the feature altogether. Controlling either setting requires creating and saving a file in Solutions Configuration.
  • Prepare such a file, for example:

proactive_monitoring_config.json

{
    "enabled": true,
    "disabledClusters": [
        "9aee963e-f370-41e1-####-############",
        "d25a060b-8e02-453a-####-############"
    ]
}
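
Because the UI accepts the content as free-form text, it can help to check the JSON before pasting it. The following Python sketch generates and validates a file of the shape shown above. The function names are hypothetical, and the validation rules (a boolean `enabled` and a list of strings for `disabledClusters`) are assumptions inferred from the example, not a documented schema.

```python
import json
from typing import Iterable


def build_config(enabled: bool, disabled_clusters: Iterable[str] = ()) -> str:
    """Render the configuration as indented JSON text."""
    return json.dumps(
        {"enabled": enabled, "disabledClusters": list(disabled_clusters)},
        indent=4,
    )


def validate_config(text: str) -> dict:
    """Parse the text and apply basic shape checks.

    Raises ValueError (json.JSONDecodeError is a subclass) on any problem,
    for example a missing opening or closing brace.
    """
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("top level must be a JSON object")
    if not isinstance(config.get("enabled"), bool):
        raise ValueError('"enabled" must be true or false')
    clusters = config.get("disabledClusters", [])
    if not isinstance(clusters, list) or not all(isinstance(c, str) for c in clusters):
        raise ValueError('"disabledClusters" must be a list of strings')
    return config
```

Calling `validate_config(build_config(False))` returns a dictionary with `enabled` set to `False` and an empty `disabledClusters` list, which corresponds to disabling the feature for all clusters.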

  • Navigate to the custom configurations page on the VCF Operations UI:

Infrastructure Operations → Configurations → Configuration Files → Management Pack Configurations → Add.

  • On the UI, enter the file name so that it matches exactly: proactive_monitoring_config.json. Set Containing Folder to User Defined and choose Free format.
  • Copy and paste the barebones structure into the text area. The screenshot below shows how the UI should look.

  • Setting enabled to false disables the whole feature. The disabledClusters array lists the IDs of the clusters that should be excluded from the scan. To find the IDs of such clusters:
    1. Navigate to the Inventory Management UI: Infrastructure Operations → Configurations → Others → Inventory Management
    2. Expand Object Types
    3. Find and click on vSAN Cluster (<cluster>). (<cluster> being the name of the cluster for which we need to disable the diagnostic scan)
    4. Once the right side UI updates, you should see all vSAN clusters in the fleet.
    5. Configure the grid to show the object ID. That is the ID of the cluster that you should copy and enter into the configuration file.
    6. The VCF Operations UI provides an easy Copy Cell option when you right-click the UUID.
    7. (Optional) If the number of clusters is very large, use the filter at the top right to filter by the name of the cluster.
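
Once an ID has been copied from the grid, it has to be added to the disabledClusters array without breaking the JSON. The Python sketch below shows one way to do that offline before pasting the result back into the UI. The function name is hypothetical, and the check that the ID is a canonical hyphenated UUID is an assumption based on the article referring to the object ID as a UUID.

```python
import json
import uuid


def exclude_cluster(config_text: str, cluster_id: str) -> str:
    """Return the config text with `cluster_id` added to disabledClusters.

    Raises ValueError if the ID is not a canonical hyphenated UUID.
    """
    cleaned = cluster_id.strip()  # copied cells may carry stray whitespace
    # uuid.UUID raises ValueError on malformed input; the comparison also
    # rejects accepted-but-unusual forms such as IDs without hyphens.
    if str(uuid.UUID(cleaned)) != cleaned.lower():
        raise ValueError(f"not a canonical UUID: {cleaned!r}")

    config = json.loads(config_text)
    disabled = config.setdefault("disabledClusters", [])
    # Compare case-insensitively so the same cluster is not listed twice.
    if cleaned.lower() not in {existing.lower() for existing in disabled}:
        disabled.append(cleaned)
    return json.dumps(config, indent=4)
```

The function leaves the `enabled` flag untouched, so it can be applied repeatedly, once per dev/test cluster, before the final text is saved in the configuration file.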