TCA uses BOMs to store certain data. One such piece of data is the default configuration for Prometheus, also known as the "golden configuration", which is divided into recording rules and other components.
Currently there is no straightforward CLI, API, or other interface for changing the golden configuration of Prometheus. Merging custom configurations by hand is also difficult because Prometheus configurations are large, and editing the golden configuration directly can lead to confusion about which settings are defaults and which are customizations.
capv@mc-1-fss67-ktbr4 [ ~ ]$ python promconf --help
usage: promconf [-h] [--kubeconfig KUBECONFIG] [--version] [-y]
                {backup,restore,create,cascading-update} ...

Prometheus Configuration Improvement Tool

optional arguments:
  -h, --help            show this help message and exit
  --kubeconfig KUBECONFIG
                        Specify the path to kubeconfig file
  --version, -v         Get the version information
  -y, --yes             Skip confirmation prompts

subcommands:
  {backup,restore,create,cascading-update}
    backup              Backup Prometheus configuration in each workload
                        cluster
    restore             Restore Prometheus configuration in each workload
                        cluster
    create              Merge custom configuration and create corresponding
                        secret
    cascading-update    Cascading update Prometheus configuration in each
                        workload cluster with merged configuration
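The usage line above suggests that the global options (--kubeconfig, -y) come before the subcommand. As a minimal sketch, the helper below assembles such a command line for use from a script. The function name promconf_command and its parameters are hypothetical; only the flags and subcommands shown in the help output are used, and the `python promconf` invocation is copied from the prompt above.

```python
from typing import List, Optional

# Subcommands taken from the help output above.
SUBCOMMANDS = {"backup", "restore", "create", "cascading-update"}


def promconf_command(
    subcommand: str,
    *subcommand_args: str,
    kubeconfig: Optional[str] = None,
    assume_yes: bool = False,
) -> List[str]:
    """Build an argv list with global options placed before the subcommand.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown promconf subcommand: {subcommand!r}")

    argv = ["python", "promconf"]
    if kubeconfig is not None:
        argv += ["--kubeconfig", kubeconfig]
    if assume_yes:
        argv.append("-y")  # skip confirmation prompts
    argv.append(subcommand)
    argv.extend(subcommand_args)
    return argv


if __name__ == "__main__":
    # The kubeconfig path here is a placeholder.
    print(" ".join(promconf_command("backup", kubeconfig="/path/to/kubeconfig", assume_yes=True)))
```

The resulting list can be passed to subprocess.run on a host where promconf is available.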
---
recording_rules_yml: |
  groups:
    - name: kube-apiserver.rules
      interval: 3m
      rules:
        - expr: |2
            (
              (
                sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
                -
                (
                  (
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1d]))
                    or
                    vector(0)
                  )
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1d]))
                  +
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1d]))
                )
              )
              +
              # errors
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1d]))
            )
            /
            sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
          labels:
            verb: read
          record: apiserver_request:burnrate1d
    - name: tkg-namespace-storage-rules
      interval: 12m
      rules:
        - record: tkg_namespace_storage_pvc_bound
          expr: sum by (namespace) ((kube_persistentvolumeclaim_status_phase{phase="Bound"}) > 0 or kube_pod_info < bool 0)
          labels:
            job: kubernetes-nodes-cadvisor
        - record: tkg_namespace_storage_pvc_count
          expr: sum by (namespace) ((kube_pod_spec_volumes_persistentvolumeclaims_info)> 0 or kube_pod_info < bool 0)
          labels:
            job: kubernetes-nodes-cadvisor
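Before handing a file like the one above to promconf, it can help to sanity-check the rule groups. The sketch below assumes the groups have already been parsed into a Python dict (for example with a YAML library of your choice) and checks only the basics of a Prometheus recording-rule file: a groups list, a unique name per group, and record and expr on every rule. The function name check_recording_groups is hypothetical, and requiring a rules list in every group is a strictness choice of this sketch. It is not a substitute for whatever validation promconf or Prometheus performs.

```python
from typing import Any, Dict, List


def check_recording_groups(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    groups = config.get("groups")
    if not isinstance(groups, list):
        return ["top-level 'groups' must be a list"]

    problems: List[str] = []
    seen = set()
    for gi, group in enumerate(groups):
        name = group.get("name") if isinstance(group, dict) else None
        if not name:
            problems.append(f"group #{gi}: missing 'name'")
            continue
        if name in seen:
            problems.append(f"group {name!r}: duplicate group name")
        seen.add(name)

        rules = group.get("rules")
        if not isinstance(rules, list):
            problems.append(f"group {name!r}: 'rules' must be a list")
            continue
        for ri, rule in enumerate(rules):
            # A recording rule needs both the target series name and the expression.
            for key in ("record", "expr"):
                if not isinstance(rule, dict) or not rule.get(key):
                    problems.append(f"group {name!r} rule #{ri}: missing {key!r}")
    return problems


if __name__ == "__main__":
    # Parsed form of one group from the example file above.
    parsed = {
        "groups": [
            {
                "name": "tkg-namespace-storage-rules",
                "interval": "12m",
                "rules": [
                    {
                        "record": "tkg_namespace_storage_pvc_count",
                        "expr": "sum by (namespace) ((kube_pod_spec_volumes_persistentvolumeclaims_info)> 0 or kube_pod_info < bool 0)",
                        "labels": {"job": "kubernetes-nodes-cadvisor"},
                    }
                ],
            }
        ]
    }
    print(check_recording_groups(parsed))
```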
Merge the custom configuration with the golden configuration using the following command:
# promconf create -f prometheus_recording_rules.yaml
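The merged configuration is stored in the prometheus-merged-config-secret secret. Verify that it exists:
# kubectl get secret prometheus-merged-config-secret -n tca-system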
After a successful merge, enabling the "golden configuration switch" during a fresh install of the Prometheus server through the TCA UI displays the merged configuration by default. Otherwise, the UI continues to show the original golden configuration.
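How promconf merges custom rules into the golden configuration is not described here. Purely as an illustration of what merging recording-rule groups can involve, the sketch below uses one possible policy: a custom group replaces a golden group with the same name, and any other custom group is appended. The function name merge_rule_groups and this policy are assumptions, not promconf's documented behavior.

```python
import copy
from typing import Any, Dict, List

Group = Dict[str, Any]


def merge_rule_groups(golden: List[Group], custom: List[Group]) -> List[Group]:
    """Merge rule groups by name (assumed policy: custom overrides golden)."""
    merged = copy.deepcopy(golden)  # never mutate the golden input
    index = {group["name"]: pos for pos, group in enumerate(merged)}

    for group in custom:
        group_copy = copy.deepcopy(group)
        name = group_copy["name"]
        if name in index:
            # Same-named group: the custom definition wins.
            merged[index[name]] = group_copy
        else:
            index[name] = len(merged)
            merged.append(group_copy)
    return merged


if __name__ == "__main__":
    golden = [{"name": "kube-apiserver.rules", "interval": "3m", "rules": []}]
    custom = [
        {"name": "kube-apiserver.rules", "interval": "5m", "rules": []},
        {"name": "tkg-namespace-storage-rules", "interval": "12m", "rules": []},
    ]
    for group in merge_rule_groups(golden, custom):
        print(group["name"], group["interval"])
```

Keeping the golden input untouched and producing a separate merged result mirrors the idea of storing the merge in its own secret instead of editing the golden configuration in place.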
To roll out the merged configuration to the Prometheus instance in each workload cluster, run:
# promconf cascading-update
To back up the Prometheus configuration in each workload cluster, run:
# promconf backup
Backup file names have the following format:
<wc-name>-prometheus-tca-addon-secret.yaml
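For scripts that need to locate or list these backups, the naming pattern above can be handled with two small helpers. This is a sketch that assumes <wc-name> is the workload cluster name; the function names are hypothetical and the cluster name in the example is a placeholder.

```python
from typing import Optional

# Fixed suffix taken from the backup file name format above.
BACKUP_SUFFIX = "-prometheus-tca-addon-secret.yaml"


def backup_file_name(wc_name: str) -> str:
    """Return the backup file name for a workload cluster name."""
    if not wc_name:
        raise ValueError("workload cluster name must not be empty")
    return f"{wc_name}{BACKUP_SUFFIX}"


def workload_cluster_from_backup(file_name: str) -> Optional[str]:
    """Return the workload cluster name encoded in a backup file name, or None."""
    if file_name.endswith(BACKUP_SUFFIX) and len(file_name) > len(BACKUP_SUFFIX):
        return file_name[: -len(BACKUP_SUFFIX)]
    return None


if __name__ == "__main__":
    name = backup_file_name("wc-example")  # placeholder cluster name
    print(name)
    print(workload_cluster_from_backup(name))
```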
To restore the Prometheus configuration in each workload cluster, run:
# promconf restore
When the TKG version changes, a new golden configuration is generated for that version. In that case, re-run the following command to create or update prometheus-merged-config-secret:
# promconf create --file