This issue is the same as the one described in the KB "How to Increase Multus-cni DaemonSet resource limits when Pods creation is failing intermittently". That KB applies after a Kubernetes 1.24.10 workload cluster has been upgraded to 1.25, or to Kubernetes workload clusters of version 1.25 and above created in TCA 3.0 and TCA 3.1.
This document describes how to update the Multus resource limits on a Kubernetes 1.24.10 workload cluster before the upgrade. When the cluster is then upgraded, the increased Multus resource limits are applied on the new nodes, which avoids the Multus high memory utilisation issue.
NOTE: The issue is only observed on workload clusters with both the Multus and Calico add-ons.
2.3, 3.1
Users have experienced OOM-killed calico-ipam processes when using multus+calico in certain clusters (likely those at higher scale). This causes intermittent failures when creating containers. The calico-ipam plugin was being OOM-killed in the multus-cni DaemonSet pod because the 50Mi memory limit was too low.
The log shows the limit being hit: "memory: usage 51200kB, limit 51200kB". The memory request and limit on Multus therefore need to be increased.
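As a quick sanity check, the figure in the log can be converted to Mi to confirm that it matches the 50Mi limit. The following is only a sketch: the function name kb_to_mi is hypothetical, and the log line is the one quoted above.

```shell
#!/bin/sh
# Convert a kB figure from the OOM log into Mi (1 Mi = 1024 kB).
# kb_to_mi is a hypothetical helper name used for illustration only.
kb_to_mi() {
  echo $(( $1 / 1024 ))
}

log_line='memory: usage 51200kB, limit 51200kB'

# Extract the number that follows "limit " in the log line.
limit_kb=$(printf '%s\n' "$log_line" | sed -n 's/.*limit \([0-9][0-9]*\)kB.*/\1/p')

echo "limit: $(kb_to_mi "$limit_kb")Mi"
```

Since 51200 / 1024 = 50, usage had reached the full 50Mi limit at the time of the OOM kill.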
10. Click DEPLOY CHANGES at the bottom.
11. Wait for the add-on status to change to Provisioned.
Note: A change made via the command line is overwritten by any update made in the TCA UI, so edit the Multus add-on in the UI as soon as possible after the upgrade.
1. Log in to the TCA-CP where the Management cluster is deployed as the admin user.
2. Run the commands below to switch to root and SSH to the management cluster:
su -
ssh capv@<management cluster endpoint IP>
3. Get the current Multus values.yaml and save it as multus.yaml:
kubectl -n <workload cluster name> get secret multus-tca-addon-secret -o "jsonpath={@.data.values\.yaml}"|base64 -d > multus.yaml
4. Append the resources section to multus.yaml:
cat <<EOF >> multus.yaml
resources:
  limits:
    cpu: 300m
    memory: 150Mi
  requests:
    cpu: 200m
    memory: 100Mi
EOF
5. Apply the new values.yaml to the Multus secret:
VALUES_YAML=`base64 -w0 multus.yaml`
kubectl patch secret -n <workload cluster name> multus-tca-addon-secret --patch '{"data":{"values.yaml":"'$VALUES_YAML'"}}'
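Before and after patching, it can help to confirm that multus.yaml has only one resources section and that the secret now holds exactly the edited file. The sketch below is an assumption, not part of the documented procedure: the helper names has_single_resources_block and values_match are hypothetical, and the commented kubectl call simply reuses the command from step 3.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, not part of TCA or kubectl.

# Succeeds when the file has exactly one top-level "resources:" key,
# i.e. step 4 was not run twice against the same multus.yaml.
has_single_resources_block() {
  [ "$(grep -c '^resources:' "$1")" -eq 1 ]
}

# Succeeds when the base64 string (as stored in the secret) decodes
# to exactly the contents of the local file.
values_match() {
  printf '%s' "$1" | base64 -d | cmp -s - "$2"
}

# Example against the cluster, reusing the command from step 3:
# encoded=$(kubectl -n <workload cluster name> get secret multus-tca-addon-secret -o "jsonpath={@.data.values\.yaml}")
# has_single_resources_block multus.yaml && values_match "$encoded" multus.yaml && echo "secret matches multus.yaml"
```

If has_single_resources_block fails, remove the duplicate section from multus.yaml and repeat step 5.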
There is no action needed.
The Multus add-on will be upgraded to version 4.0.1+vmware.2-tkg.1, so resource requests and limits can be applied successfully to the Multus DaemonSet.
This document should be used only on Kubernetes 1.24.10 workload clusters with the Multus and Calico add-ons. A Kubernetes 1.24.10 workload cluster can be created using TCA 2.3, or TCA 3.1.x, which supports multi-tkg.