Adding data to existing metrics, need help with understanding impact on cardinality

Article ID: 405782

Updated On:

Products

Observability DX OpenExplore

Issue/Introduction

We are in the process of adding new labels that could increase cardinality, and want to ensure that the additional data does not slow down queries in our dashboards.

  • Proactively investigating changes to current metrics that are expected to increase cardinality.
  • Concerns about slower response times and performance issues.
  • Introducing new source fields for new time series data with a potentially highly variable field.
  • Adding new labels to Kubernetes metrics.
  • Is there a standard for how many data points per second (PPS) can be ingested before seeing a performance impact?

Resolution

It is difficult to anticipate the performance impact of new dimensions before the data is ingested. However, the options below can help you gather details on the possible impact before implementing changes in production.

If you have a Non-Production or Development Tenant, it is recommended to test changes there before implementing them in Production environments.

  • If there are critical or heavily used Dashboards, Charts and Alerts that you are concerned about, they can be exported and imported into your alternate tenant for real-world testing.

If you do not have an alternate tenant, you can use a subset of resources in your production tenant to compare real-world before and after impacts on a small scale.

  • Identify available non-production Kubernetes sources to implement the changes on first.
  • Clone or recreate your critical or known query-intensive Dashboards, Charts or Alerts, updating them to use only your non-production test sources.

In Observability, query performance impacts come from the number of unique time series (Cardinality), the number of data points queried to build the chart (Points Scanned) and the amount of time the query takes to return data to the chart (Duration).
Impacts of increasing your data point ingestion rate, Points Per Second (PPS), would be seen as delays in ingestion (Backlog) and increased resource usage, such as CPU and memory, on your Proxies.

  • These changes can be tracked in the out-of-the-box Dashboard, Tanzu Observability Service and Proxy Data.
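
As a rough way to reason about the effect of a new label before rolling it out, the sketch below counts unique time series in a sample of points with and without the new label, then derives an approximate PPS. It is a minimal illustration only: it assumes a series is identified by its metric name, source and label values, and that each series reports one point per interval. The sample data, the label names and the helpers count_series and estimate_pps are all hypothetical and are not part of any product API.

```python
from typing import Iterable, Mapping, Sequence


def count_series(points: Iterable[Mapping[str, str]], keys: Sequence[str]) -> int:
    """Count unique time series, i.e. distinct combinations of the given keys.

    Assumption: a series is identified by metric name, source and label values.
    """
    return len({tuple(p.get(k, "") for k in keys) for p in points})


def estimate_pps(series_count: int, interval_seconds: float) -> float:
    """Approximate points per second, assuming one point per series per interval."""
    return series_count / interval_seconds


# Hypothetical sample of points collected from non-production sources.
sample = [
    {"metric": "pod.cpu", "source": "node-1", "namespace": "dev", "pod": "api-1"},
    {"metric": "pod.cpu", "source": "node-1", "namespace": "dev", "pod": "api-2"},
    {"metric": "pod.cpu", "source": "node-2", "namespace": "dev", "pod": "api-3"},
    {"metric": "pod.cpu", "source": "node-2", "namespace": "qa", "pod": "api-4"},
]

current_keys = ["metric", "source", "namespace"]
proposed_keys = current_keys + ["pod"]  # "pod" is the hypothetical new label

before = count_series(sample, current_keys)
after = count_series(sample, proposed_keys)

print(f"series before: {before}, after: {after}")
print(f"PPS at a 60s interval: {estimate_pps(before, 60):.3f} -> {estimate_pps(after, 60):.3f}")
```

Running the same count against a larger sample from your test sources gives an approximate growth factor to compare with what the Dashboard reports after the change.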

 

Additional Information

Below are article links for these and other topics to help identify query performance impacts, along with suggestions for improving your queries.

Additional related topics for your review.