For workload clusters in TCA 3.2, the tca-monitor-operator pod is observed in CrashLoopBackOff because the container is OOMKilled.
Use the command below to check the pod status:
kubectl get pod -A | grep tca-monitor
Describing the pod shows the following:
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: OOMKilled
Affected versions: 3.1.x, 3.2
This is caused by the large number and size of secrets, which consume all of the memory allowed for this container until it is eventually OOMKilled.
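As a rough diagnostic (a sketch, assuming kubectl access to the affected cluster; the secret count at which OOM occurs is not documented), the number of secrets the operator must process can be counted:

```shell
# Count secrets across all namespaces; a large count here is consistent
# with the OOM cause described above. Requires kubectl cluster access.
kubectl get secrets -A --no-headers | wc -l
```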
Resolved in TCA 3.3
The workaround is to pause the monitor package and increase the memory limit of the monitor deployment. Follow the steps below:
kubectl patch pkgi -n tca-system tca-monitor-operator --type merge -p '{"spec": {"paused": true}}'
Use the "kubectl edit" command to edit the tca-monitor-operator deployment and set the memory limit:
kubectl edit deploy -n tca-system tca-monitor-operator
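Inside the editor, raise the memory limit on the operator container. The values below are illustrative assumptions only; the KB does not prescribe a specific limit, so size it to the cluster's secret count:

```yaml
# Example (hypothetical values) for the container spec in the
# tca-monitor-operator deployment:
resources:
  limits:
    memory: 1Gi    # raised from the limit that was being exhausted
  requests:
    memory: 256Mi
```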
The paused state does not impact the monitoring performed by the monitor-operator. The only required action comes when TCA 3.3 or a later release is available and users want to update the workload cluster to the versions supported by the new TCA release: before updating the workload clusters, unpause this package so that it is updated together with them.
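To unpause, the same patch used in the workaround can be reapplied with paused set to false (a sketch mirroring the pause command; run it before updating the workload clusters):

```shell
# Re-enable lifecycle management of the monitor package so it is
# upgraded together with TCA. Requires kubectl access to the cluster.
kubectl patch pkgi -n tca-system tca-monitor-operator --type merge \
  -p '{"spec": {"paused": false}}'
```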