Customize recording rules for Prometheus golden configurations

Article ID: 345712


Products

VMware Telco Cloud Automation

Issue/Introduction

This article provides a script that makes modifying the Prometheus reference configuration more convenient, precise, and secure. The script automatically merges custom configurations with the golden configuration. The merged configuration is applied automatically to freshly installed Prometheus instances, and the script offers an option to update the configuration of existing Prometheus instances in each workload cluster.

Symptoms:

TCA uses BOMs to store certain data. One such piece of data is the default configuration for Prometheus, also known as the golden configuration, which is divided into recording rules and other components.

  • After an upgrade, the current recording rules might break with the new version of TKG. Users then need to manually update the referenced recording rules to ensure that Prometheus functions properly.
  • Additionally, users may add custom configurations based on their requirements.

Currently, there is no straightforward CLI, API, or other interface for changing the golden configuration of Prometheus. Moreover, merging custom configurations is complex because Prometheus configurations are extensive, and changing the golden configuration directly can lead to confusion.


Resolution

The "promconf" script is provided to manage the Prometheus configuration. Run it in the context of the target TKG management cluster. The script is attached to this KB article.

Usage: 
 
capv@mc-1-fss67-ktbr4 [ ~ ]$ python promconf --help
 
usage: promconf [-h] [--kubeconfig KUBECONFIG] [--version] [-y]
                {backup,restore,create,cascading-update} ...
 
Prometheus Configuration Improvement Tool
 
optional arguments:
  -h, --help            show this help message and exit
  --kubeconfig KUBECONFIG
                        Specify the path to kubeconfig file
  --version, -v         Get the version information
  -y, --yes             Skip confirmation prompts
 
subcommands:
  {backup,restore,create,cascading-update}
    backup              Backup Prometheus configuration in each workload
                        cluster
    restore             Restore Prometheus configuration in each workload
                        cluster
    create              Merge custom configuration and create corresponding
                        secret
    cascading-update    Cascading update Prometheus configuration in each
                        workload cluster with merged configuration

Add customized rules

  • Provide the recording rules configuration in the following format, as a list of custom groups. The merge algorithm works as follows:
    • The merged configuration starts as a copy of the default golden configuration.
    • A group declared in the custom configuration is appended if its name does not exist in the default golden configuration.
    • A group declared in the custom configuration overwrites the entire group in the default golden configuration if a group with the same name already exists there.
---
recording_rules_yml: |
  groups:
    - name: kube-apiserver.rules
      interval: 3m
      rules:
      - expr: |2
          (
            (
              sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
              -
              (
                (
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1d]))
                  or
                  vector(0)
                )
                +
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1d]))
                +
                sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1d]))
              )
            )
            +
            # errors
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1d]))
          )
          /
          sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
        labels:
          verb: read
        record: apiserver_request:burnrate1d
    - name: tkg-namespace-storage-rules
      interval: 12m
      rules:
      - record: tkg_namespace_storage_pvc_bound
        expr: sum by (namespace) ((kube_persistentvolumeclaim_status_phase{phase="Bound"}) > 0 or kube_pod_info < bool 0)
        labels:
          job: kubernetes-nodes-cadvisor
      - record: tkg_namespace_storage_pvc_count
        expr: sum by (namespace) ((kube_pod_spec_volumes_persistentvolumeclaims_info)> 0 or kube_pod_info < bool 0)
        labels:
          job: kubernetes-nodes-cadvisor
 
  • Save the above content in a YAML file with any name, for example, prometheus_recording_rules.yaml.
  • Create a merged configuration with the golden configuration using the following command:
     

    # promconf create -f prometheus_recording_rules.yaml
    
  • Check that the prometheus-custom-config-secret secret was created or updated successfully. The presence of prometheus-merged-config-secret usually indicates a successful configuration merge, because that secret is deleted if the merge fails.
# kubectl get secret prometheus-merged-config-secret -n tca-system
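
The merge rules listed in this section can be illustrated with a short Python sketch. It is not the promconf implementation; it only assumes that the groups under recording_rules_yml have already been parsed into lists of dictionaries, and the function name and sample group "node.rules" are hypothetical.

```python
import copy


def merge_recording_rule_groups(default_groups, custom_groups):
    """Merge custom rule groups into a copy of the default (golden) groups.

    - Start from a deep copy of the default groups.
    - A custom group whose name already exists replaces the whole default group.
    - A custom group with a new name is appended at the end.
    """
    merged = copy.deepcopy(default_groups)
    index_by_name = {group["name"]: i for i, group in enumerate(merged)}
    for group in custom_groups:
        name = group["name"]
        if name in index_by_name:
            # Overwrite the entire group, not individual rules.
            merged[index_by_name[name]] = copy.deepcopy(group)
        else:
            index_by_name[name] = len(merged)
            merged.append(copy.deepcopy(group))
    return merged


if __name__ == "__main__":
    # Hypothetical golden groups, used only to illustrate the behavior.
    golden = [
        {"name": "kube-apiserver.rules", "interval": "1m", "rules": []},
        {"name": "node.rules", "rules": []},
    ]
    custom = [
        {"name": "kube-apiserver.rules", "interval": "3m", "rules": []},
        {"name": "tkg-namespace-storage-rules", "interval": "12m", "rules": []},
    ]
    for group in merge_recording_rule_groups(golden, custom):
        print(group["name"], group.get("interval"))
```

Because a matching group is replaced as a whole, a custom group must repeat every rule from the golden group that should remain in effect.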

Newly deployed Prometheus addon

After a successful configuration merge, a fresh install of the Prometheus server through the TCA UI displays the merged configuration by default when the "golden configuration switch" is enabled. If no merged configuration exists, the UI continues to show the original golden configuration.

Cascading update of an existing Prometheus addon

  • For an already installed Prometheus service, replace the effective configuration in the Prometheus server with the merged configuration by running the following command:
# promconf cascading-update
  • Before running the command above, back up the existing configuration with the following command:
# promconf backup
    The backup command saves the Prometheus server secrets as YAML files in the /tmp/prometheusConfig directory. Backup file names have the following format:

<wc-name>-prometheus-tca-addon-secret.yaml

  • If any issues arise after the cascading update, restore the configuration to its state before the update with the following command:
# promconf restore
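
The backup, update, and restore sequence above could be wrapped in a small Python helper, sketched below. This is an illustration only: it assumes the script is invoked as "python promconf" as in the usage example, that the global -y option from the help output applies to each subcommand, and that the caller supplies its own health check. The names safe_cascading_update and is_healthy are hypothetical.

```python
import subprocess

# Assumption: promconf is invoked as "python promconf", as in the usage example.
PROMCONF = ["python", "promconf"]


def safe_cascading_update(is_healthy, run=subprocess.run):
    """Back up, apply the merged configuration, and restore if needed.

    is_healthy: hypothetical callable supplied by the caller; returns True
        when Prometheus works as expected after the update.
    run: command runner with the subprocess.run signature (injectable so the
        sequence can be exercised without a cluster).
    """
    # -y skips confirmation prompts (global option listed in the help output).
    run(PROMCONF + ["-y", "backup"], check=True)
    try:
        run(PROMCONF + ["-y", "cascading-update"], check=True)
        ok = is_healthy()
    except subprocess.CalledProcessError:
        # The update command itself failed; treat it like a failed check.
        ok = False
    if not ok:
        run(PROMCONF + ["-y", "restore"], check=True)
    return ok
```

If the backup step fails, the helper raises before any update is attempted, so the cascading update never runs without a backup in /tmp/prometheusConfig.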

Upgrade

When the TKG version changes, a new corresponding golden configuration is generated. In that case, re-run the following command to create or update prometheus-merged-config-secret:
 

# promconf create --file 


 









Attachments

promconf