Following an Aria Ops upgrade from 8.14 to 8.18.x, the cluster Health Score behavior changed

Article ID: 418429


Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

After upgrading Aria Operations from 8.14 to 8.18.x, the Cluster Health Score becomes highly variable or "noisy."

Before the upgrade, the Health Score remained near 100% unless a cluster issue occurred, providing a reliable overall health overview.

The score is now too noisy to be consumed reliably by NOC monitoring tools that depend on this metric.

Environment

Aria Operations 8.18.x

Cause

The Health Score calculation in 8.18.x now includes contributions from alert conditions (including descendant objects) that were previously excluded, resulting in increased noise.

New or changed alert definitions introduced during the upgrade are causing additional alerts to contribute to the Health Score calculation.

Resolution

Option 1: Replace Monitoring Metric with Badge Health
 

Change the metric that the NOC/monitoring integration uses to Badge Health.

Badge Health calculates health based only on alerts that directly impact the object. It excludes alerts raised on descendant objects, which produces a less noisy, object-focused health value.

  1. Verify which metric your monitoring tools currently read.
  2. Change the integration to use Badge Health.
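
As a rough illustration of step 2, the sketch below shows how an integration might build a request for Badge Health and read the newest value from the response. It is not taken from this article: the Suite API path `/suite-api/api/resources/{id}/stats/latest`, the stat key `badge|health`, and the response layout are assumptions, and the host name and resource ID are placeholders. Verify each against the API documentation on your own instance before relying on it.

```python
from urllib.parse import urlencode

# Assumed stat key for Badge Health; confirm it in your environment.
BADGE_HEALTH_KEY = "badge|health"


def latest_stat_url(base_url, resource_id, stat_key=BADGE_HEALTH_KEY):
    """Build the URL for the latest value of one stat on one resource.

    The path below is an assumption about the Suite API layout.
    """
    query = urlencode({"statKey": stat_key})  # encodes "|" as %7C
    root = base_url.rstrip("/")
    return f"{root}/suite-api/api/resources/{resource_id}/stats/latest?{query}"


def extract_latest(payload, stat_key=BADGE_HEALTH_KEY):
    """Return the newest data point for stat_key, or None if it is absent.

    The payload shape (values -> stat-list -> stat) is an assumption.
    """
    for entry in payload.get("values", []):
        for stat in entry.get("stat-list", {}).get("stat", []):
            key = stat.get("statKey", {}).get("key")
            if key == stat_key and stat.get("data"):
                return stat["data"][-1]
    return None
```

The monitoring tool would send a GET request to the generated URL with its usual authentication and pass the decoded JSON body to `extract_latest`. A return value of `None` means the stat was not in the response, which is worth alerting on separately from a low score.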

 

Option 2: Create a Host-Specific Health Score Policy (Selective Alerts)
 

Create a separate Health Score policy for all hosts in which only a curated set of alert definitions contributes to the Health Score. Disable the remaining alert definitions in that policy.

  1. Export your current policy to a file.
  2. Edit the policy file to retain only the desired alert definitions.
  3. Import the modified policy, selecting the overwrite option.
  4. Validate the host Health Scores on representative hosts.
  5. Maintain the policy after subsequent upgrades (remove or adjust newly introduced alert definitions as required).

To simplify the maintenance process after upgrades, re-import the policy with the overwrite option to reapply the curated settings. Re-check the Health Scores after each Aria Operations upgrade because alert definitions can change.
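
When validating scores on representative hosts, or re-checking them after an upgrade, it can help to put a number on "noisy." The sketch below is one possible way to do that, assuming you have already collected a series of Health Score samples for a host by whatever means your tooling provides. The function name and the `floor` threshold of 90 are illustrative choices, not product settings.

```python
from statistics import mean, pstdev


def summarize_health(samples, floor=90.0):
    """Summarize a series of Health Score samples (0-100).

    Returns the mean, the population standard deviation (a simple
    noise measure), and how many samples fell below `floor`.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "mean": mean(samples),
        "stdev": pstdev(samples),
        "below_floor": sum(1 for s in samples if s < floor),
    }


if __name__ == "__main__":
    # Hypothetical sample series for one host, before and after the policy change.
    before = [100, 80, 100, 80]
    after = [100, 100, 100, 100]
    print("before:", summarize_health(before))
    print("after: ", summarize_health(after))
```

A stable score shows a standard deviation close to zero and few or no samples below the floor. Comparing the same hosts before and after importing the curated policy gives a simple check that the change had the intended effect.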