TCA uses BOMs to store certain data. One such item is the default configuration for Prometheus, also known as the 'golden configuration', which is divided into recording rules and other components.
There is currently no straightforward CLI, API, or other interface for changing the Prometheus golden configuration. Merging custom configurations by hand is also complex because Prometheus configurations are extensive, and editing the golden configuration directly can lead to confusion.
capv@mc-1-fss67-ktbr4 [ ~ ]$ python promconf --help
usage: promconf [-h] [--kubeconfig KUBECONFIG] [--version] [-y]
                {backup,restore,create,cascading-update} ...

Prometheus Configuration Improvement Tool

optional arguments:
  -h, --help            show this help message and exit
  --kubeconfig KUBECONFIG
                        Specify the path to kubeconfig file
  --version, -v         Get the version information
  -y, --yes             Skip confirmation prompts

subcommands:
  {backup,restore,create,cascading-update}
    backup              Backup Prometheus configuration in each workload cluster
    restore             Restore Prometheus configuration in each workload cluster
    create              Merge custom configuration and create corresponding secret
    cascading-update    Cascading update Prometheus configuration in each workload cluster with merged configuration
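The usage line above places the global options (--kubeconfig, -y) before the subcommand. The following is a minimal argparse sketch with the same shape, written only to illustrate that ordering. It is not promconf's actual source, it omits --version, and the -f/--file option on create is taken from the create command used later in this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser shaped like the help text above (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="promconf",
        description="Prometheus Configuration Improvement Tool",
    )
    # Global options: these belong to the top-level parser, so they must
    # appear before the subcommand on the command line.
    parser.add_argument("--kubeconfig", help="Specify the path to kubeconfig file")
    parser.add_argument("-y", "--yes", action="store_true",
                        help="Skip confirmation prompts")

    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("backup")
    sub.add_parser("restore")
    create = sub.add_parser("create")
    # Assumption: create accepts -f/--file, as in the commands in this document.
    create.add_argument("-f", "--file")
    sub.add_parser("cascading-update")
    return parser


# The kubeconfig path below is a placeholder, not a real location.
args = build_parser().parse_args(
    ["--kubeconfig", "/path/to/kubeconfig", "-y",
     "create", "-f", "prometheus_recording_rules.yaml"]
)
print(args.command, args.kubeconfig, args.yes, args.file)
```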
---
recording_rules_yml: |
  groups:
    - name: kube-apiserver.rules
      interval: 3m
      rules:
        - expr: |2
            (
              (
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1d]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1d]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1d]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1d]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
          labels:
            verb: read
          record: apiserver_request:burnrate1d
    - name: tkg-namespace-storage-rules
      interval: 12m
      rules:
        - record: tkg_namespace_storage_pvc_bound
          expr: sum by (namespace) ((kube_persistentvolumeclaim_status_phase{phase="Bound"}) > 0 or kube_pod_info < bool 0)
          labels:
            job: kubernetes-nodes-cadvisor
        - record: tkg_namespace_storage_pvc_count
          expr: sum by (namespace) ((kube_pod_spec_volumes_persistentvolumeclaims_info)> 0 or kube_pod_info < bool 0)
          labels:
            job: kubernetes-nodes-cadvisor
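Before merging a custom rules file, it can help to check that every group has a unique name and that every recording rule carries both record and expr, which Prometheus requires. The sketch below is a hypothetical pre-check and is not part of promconf. It assumes the content of recording_rules_yml has already been parsed into a Python dict by a YAML parser of your choice, and the function name find_rule_problems is made up for illustration.

```python
def find_rule_problems(rules_doc: dict) -> list:
    """Return human-readable problems found in a parsed recording-rules document."""
    problems = []
    seen_names = set()
    for i, group in enumerate(rules_doc.get("groups", [])):
        name = group.get("name")
        if not name:
            problems.append(f"group #{i}: missing 'name'")
        elif name in seen_names:
            problems.append(f"group '{name}': duplicate name")
        else:
            seen_names.add(name)
        for j, rule in enumerate(group.get("rules", [])):
            # A recording rule needs both the target series name and the expression.
            for key in ("record", "expr"):
                if not rule.get(key):
                    problems.append(f"group '{name}', rule #{j}: missing '{key}'")
    return problems


# Illustrative input: the second rule is deliberately missing 'record'.
example = {
    "groups": [
        {
            "name": "tkg-namespace-storage-rules",
            "interval": "12m",
            "rules": [
                {"record": "tkg_namespace_storage_pvc_bound", "expr": "vector(1)"},
                {"expr": "vector(0)"},
            ],
        }
    ]
}

for problem in find_rule_problems(example):
    print(problem)
```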
Merge the custom configuration with the golden configuration using the following command:
# promconf create -f prometheus_recording_rules.yaml
# kubectl get secret prometheus-merged-config-secret -n tca-system
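This document does not describe how promconf combines the two configurations internally. As a rough mental model only, the sketch below shows one plausible merge policy for recording rule groups: a custom group replaces a golden group with the same name, and any other custom group is appended. The policy, the function name merge_rule_groups, and the sample group names are all assumptions made for illustration, and both inputs are assumed to be already-parsed dicts.

```python
import copy


def merge_rule_groups(golden: dict, custom: dict) -> dict:
    """Return a new rules document; the inputs are left unmodified.

    Assumed policy: a custom group replaces a golden group with the same
    name, and custom groups with new names are appended at the end.
    """
    merged = copy.deepcopy(golden)
    groups = merged.setdefault("groups", [])
    index_by_name = {g["name"]: i for i, g in enumerate(groups)}
    for group in custom.get("groups", []):
        group_copy = copy.deepcopy(group)
        if group["name"] in index_by_name:
            groups[index_by_name[group["name"]]] = group_copy
        else:
            index_by_name[group["name"]] = len(groups)
            groups.append(group_copy)
    return merged


# Hypothetical sample data; the group names are placeholders.
golden = {"groups": [
    {"name": "group-a", "rules": [{"record": "a:rate", "expr": "vector(1)"}]},
    {"name": "group-b", "rules": []},
]}
custom = {"groups": [
    {"name": "group-b", "rules": [{"record": "b:rate", "expr": "vector(2)"}]},
    {"name": "group-c", "rules": []},
]}

merged = merge_rule_groups(golden, custom)
print([g["name"] for g in merged["groups"]])
```

Working on copies keeps the golden document untouched, which matches the goal of never editing the golden configuration directly.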
After a successful configuration merge, when performing a fresh install of the Prometheus server through the TCA UI, enabling the "golden configuration switch" will display the merged configuration by default. Otherwise, it will continue to show the original golden configuration.
# promconf cascading-update
# promconf backup
Backup file names use the following format:
<wc-name>-prometheus-tca-addon-secret.yaml
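When scripting around the backup files, the naming pattern above can be built and parsed with two small helpers. This is a minimal sketch: the helper names and the sample cluster name wc-1 are hypothetical, and only the file name suffix comes from this document.

```python
BACKUP_SUFFIX = "-prometheus-tca-addon-secret.yaml"


def backup_file_name(wc_name: str) -> str:
    """Build the backup file name for a workload cluster name."""
    return f"{wc_name}{BACKUP_SUFFIX}"


def workload_cluster_from_backup(file_name: str) -> str:
    """Recover the workload cluster name from a backup file name."""
    if not file_name.endswith(BACKUP_SUFFIX):
        raise ValueError(f"not a Prometheus backup file name: {file_name!r}")
    return file_name[: -len(BACKUP_SUFFIX)]


# "wc-1" is a placeholder workload cluster name.
name = backup_file_name("wc-1")
print(name)
print(workload_cluster_from_backup(name))
```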
# promconf restore
When the TKG version changes, a new corresponding golden configuration is generated. In that case, re-run the following command to create or update prometheus-merged-config-secret:
# promconf create --file