In a cluster running Tanzu Mission Control Self-Managed (TMC SM), the prometheus-server pod is in CrashLoopBackOff.
The exact name of the prometheus-server pod varies by environment:
kubectl get pods -n <tmc sm namespace> | grep prometheus
NAME READY STATUS
prometheus-server 1/2 CrashLoopBackOff
Describing the prometheus-server pod shows that its prometheus container is being terminated with the reason OOMKilled:
kubectl describe pod -n <tmc sm namespace> <prometheus server pod name>
prometheus:
State: CrashLoopBackOff
Reason: OOMKilled
Tanzu Mission Control Self-Managed (TMC SM) 1.4.2
The default memory limit for prometheus-server is 1Gi.
Depending on the size and workload of the environment, prometheus-server may need more memory than this limit allows. When the container exceeds the limit it is OOMKilled, and it then restarts and fails repeatedly.
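To confirm this cause on a running cluster, both values involved can be read directly from the pod. The following is a minimal sketch and not part of the product documentation: the function names are hypothetical, and the container name prometheus is assumed from the describe output above.

```shell
# Hypothetical helpers; the container name "prometheus" is an assumption
# taken from the describe output shown in this article.

# Usage: last_termination_reason <tmc sm namespace> <prometheus server pod name>
# Prints the reason the prometheus container was last terminated, if any.
last_termination_reason() {
  kubectl get pod -n "$1" "$2" \
    -o jsonpath='{.status.containerStatuses[?(@.name=="prometheus")].lastState.terminated.reason}'
}

# Usage: memory_limit <tmc sm namespace> <prometheus server pod name>
# Prints the memory limit currently set on the prometheus container.
memory_limit() {
  kubectl get pod -n "$1" "$2" \
    -o jsonpath='{.spec.containers[?(@.name=="prometheus")].resources.limits.memory}'
}
```

If the first function prints OOMKilled and the second prints the default of 1Gi, the pod matches the condition described here.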
The values YAML file used for the initial installation of Tanzu Mission Control Self-Managed (TMC SM) has a dedicated parameter for the prometheus-server memory limit. Increase this parameter to resolve the issue.
Note: Changes made directly to the pods, or to the Kubernetes object that manages them (Deployment, StatefulSet, DaemonSet, etc.), are automatically reverted to the defaults or to the values configured in the YAML file used to install TMC SM.
prometheus:
  memoryLimit: <desired value in Gi>Gi
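Before editing the values file, the new value can be sanity-checked. The helper below is an illustrative sketch only: valid_memory_limit is a hypothetical name, and the rule that the value must be a whole number of Gi above the 1Gi default is an assumption drawn from this article, not a product requirement.

```shell
# Hypothetical helper. Assumption: the value is a whole number followed by
# "Gi" and should be larger than the 1Gi default described in this article.
# Usage: valid_memory_limit <value>, for example: valid_memory_limit 4Gi
valid_memory_limit() {
  # Require the Gi suffix and strip it to get the numeric part.
  case "$1" in
    *Gi) n="${1%Gi}" ;;
    *) return 1 ;;
  esac
  # Reject an empty numeric part or anything that is not all digits.
  case "$n" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # Succeed only when the value is above the 1Gi default.
  [ "$n" -gt 1 ]
}
```

The function returns success for a value such as 4Gi and failure for 1Gi, 4G, or an empty string, so it can guard a script that edits the values file.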
More configurable parameters can be found in the following TMC SM documentation:
Configuration Key Values for Installing Tanzu Mission Control Self-Managed