storage-quota-webhook in CrashLoopBackOff state with OOMKilled: increase the memory limit of the Supervisor Cluster's storage-quota-webhook pod

Article ID: 385180

Updated On:

Products

VMware vCenter Server 8.0
VMware vSphere Kubernetes Service

Issue/Introduction

Volume operations performed at large scale on a Supervisor cluster or TKG cluster(s) fail because the storage-quota-webhook pod, which validates each volume create/expand request, crashes when it exceeds the memory allotted to it.
If volume operations such as volume creation or volume expansion are performed at large scale on a Supervisor cluster or TKG clusters and take a long time to complete or fail continuously, the storage-quota-webhook pod may be crashing due to insufficient memory. To determine whether this is the case, follow these steps:
- SSH into the vCenter appliance:

ssh root@<VCSA_IP>

- Print the credentials used to login to the Supervisor control plane:

/usr/lib/vmware-wcp/decryptK8Pwd.py

- SSH into the Supervisor control plane using the IP and credentials from the previous step:

ssh root@<SUPERVISOR_IP>

- Check whether a storage-quota-webhook-* pod has crashed due to an Out of Memory (OOM) error:

kubectl -n kube-system \
describe pods -l control-plane=storage-quota-webhook | \
grep -F OOMKilled

- If OOMKilled is in the output from the above command, then the pod was terminated due to lack of sufficient memory.
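
As a complement to the grep check above, the same information can be read in a more structured form with a kubectl JSONPath query. The following is a minimal sketch, not part of the official procedure: the function names are hypothetical helpers, and it assumes the webhook container is the first (index 0) container in the pod.

```shell
#!/bin/sh
# Hypothetical helper: print one line per storage-quota-webhook pod with
# its name, restart count, and the reason its container last terminated
# (empty if it has not terminated). Assumes the webhook container is at
# index 0 of the pod's container statuses.
webhook_last_termination() {
  kubectl -n kube-system get pods -l control-plane=storage-quota-webhook \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}

# Hypothetical helper: exit 0 if any pod's last termination reason was
# OOMKilled, exit 1 otherwise.
webhook_was_oomkilled() {
  webhook_last_termination |
    awk -F '\t' '$3 == "OOMKilled" { found = 1 } END { exit !found }'
}
```

After defining both functions on the Supervisor control plane, running `webhook_last_termination` shows the restart count next to the termination reason, and `webhook_was_oomkilled && echo "OOMKilled detected"` gives a yes/no answer that can be used in a script.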

Environment

vCenter Server 8.0.3

Cause

The storage-quota-webhook pod running on the Supervisor cluster has a default memory limit of 200Mi (mebibytes). As the number of volume requests that the webhook monitors and validates in parallel grows, processing them requires more memory. This burst demand exceeds the 200Mi hard limit, and the container is terminated with OOMKilled.
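
To confirm the limit that is currently in effect, the deployment can be queried directly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the container name `manager` is taken from the patch command in the Workaround section. The second helper shows the unit arithmetic: `Mi` in Kubernetes means mebibytes (1024 × 1024 bytes), so 200Mi is slightly more than 200 MB.

```shell
#!/bin/sh
# Hypothetical helper: print the memory limit currently set on the
# "manager" container of the storage-quota-webhook deployment.
webhook_memory_limit() {
  kubectl -n kube-system get deployment storage-quota-webhook \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="manager")].resources.limits.memory}'
}

# Hypothetical helper: convert a whole number of mebibytes (Mi) to bytes.
mi_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}
```

For example, `mi_to_bytes 200` prints 209715200, the byte value of the default 200Mi limit.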

Resolution

Currently there is no resolution.

Workaround

The memory limit of the storage-quota-webhook pod can be increased. The following steps describe how to increase the limit to 400Mi:

- SSH into the vCenter appliance:

ssh root@<VCSA_IP>

- Print the credentials used to login to the Supervisor control plane:

/usr/lib/vmware-wcp/decryptK8Pwd.py

- SSH into the Supervisor control plane using the IP and credentials from the previous step:

ssh root@<SUPERVISOR_IP>

- Increase the memory limit to 400Mi for the storage-quota-webhook pod using the following command:

kubectl -n kube-system \
patch deployments storage-quota-webhook \
-p '{"spec":{"template":{"spec":{"containers":[{"name":"manager","resources":{"limits":{"memory":"400Mi"}}}]}}}}'
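
Once the patch is applied, it can be useful to confirm that the deployment rolled out and that the new limit is in place. The following is a minimal sketch rather than a documented step: `verify_webhook_patch` is a hypothetical helper name, and the 120-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: wait for the patched deployment to finish rolling
# out, then compare the memory limit on the "manager" container with the
# expected value (default: 400Mi). Exits 0 only if both checks pass.
verify_webhook_patch() {
  expected="${1:-400Mi}"

  # Wait for the new pod to become ready; the timeout is an assumption.
  kubectl -n kube-system rollout status deployment/storage-quota-webhook \
    --timeout=120s || return 1

  # Read back the limit from the deployment spec.
  actual=$(kubectl -n kube-system get deployment storage-quota-webhook \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="manager")].resources.limits.memory}')

  [ "$actual" = "$expected" ]
}
```

Running `verify_webhook_patch 400Mi && echo "limit applied"` on the Supervisor control plane reports success only when the rollout has completed and the limit reads 400Mi.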
