Pods in CrashLoopBackOff State due to OOMKilled in vSphere Kubernetes Service

Article ID: 439887

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • The vmware-system-appplatform-operator-mgr and vmware-system-nsop-controller-manager pods are in a CrashLoopBackOff state.

  • Describing the pod shows that one or more containers were terminated with the reason OOMKilled:

    kubectl describe pod -n <namespace> <pod name>
            finishedAt: "YYYY-MM-DDTHH:MM:SSZ"
            reason: OOMKilled
            startedAt: "YYYY-MM-DDTHH:MM:SSZ"
        name: manager
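
To see the termination reason for every container of a pod at a glance, a jsonpath query can be used instead of reading the full output. The following is a minimal sketch, not part of the original procedure: the function name last_termination_reasons is hypothetical, and it assumes kubectl is already connected to the Supervisor cluster and that the previous termination is recorded under lastState, as is typical for a pod that keeps restarting.

```shell
# Hypothetical helper (an illustration, not a VMware-provided tool):
# print each container's name and the reason of its last termination,
# one tab-separated pair per line.
last_termination_reasons() {
  namespace="$1"
  pod="$2"
  kubectl get pod -n "$namespace" "$pod" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# Example call (placeholders as in the article):
#   last_termination_reasons <namespace> <pod name>
```

A container that was killed for exceeding its memory limit would be expected to show OOMKilled next to its name.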

Environment

vSphere Kubernetes Service

Cause

The default memory limits of the affected system pods are too low for the amount of resources a large vSphere Supervisor environment requires. Note that during upgrades or node recreations, Kapp-controller automatically reverts any changes made to these defaults.
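
Because changed limits can be reverted, it may help to check which memory limits are currently in effect. The sketch below is one way to do that, under the assumption that kubectl is connected to the Supervisor cluster; the function name show_memory_limits is hypothetical.

```shell
# Hypothetical helper (an illustration, not a VMware-provided tool):
# print the memory limit of every container in a workload's pod template.
# kind is a workload kind such as "statefulset" or "deployment".
show_memory_limits() {
  kind="$1"
  name="$2"
  namespace="$3"
  kubectl get "$kind" "$name" -n "$namespace" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.resources.limits.memory}{"\n"}{end}'
}

# Example calls using the workloads named in this article:
#   show_memory_limits statefulset vmware-system-appplatform-operator-mgr vmware-system-appplatform-operator-system
#   show_memory_limits deployment vmware-system-nsop-controller-manager vmware-system-nsop
```

Running the same check again after an upgrade or node recreation would show whether the values have been reset to the defaults.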

Resolution

Increase the memory limits of the affected StatefulSet and Deployment on the Supervisor cluster:

  • Increase the limits of the StatefulSet:
    kubectl edit statefulset.apps/vmware-system-appplatform-operator-mgr -n vmware-system-appplatform-operator-system

  • Increase the limits of the Deployment:
    kubectl -n vmware-system-nsop patch deployments vmware-system-nsop-controller-manager -p '{"spec":{"template":{"spec":{"containers":[{"name":"manager","resources":{"limits":{"memory":"1500Mi"}}}]}}}}'
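
The patch body used above can also be generated by a small helper, which makes it easier to try a different value or to reapply the change after it has been reverted. This is a sketch only: the function name memory_limit_patch is hypothetical, and the container name and memory value appropriate for any workload other than the Deployment above are assumptions that should be confirmed in the environment.

```shell
# Hypothetical helper (an illustration, not a VMware-provided tool):
# build a patch body of the same shape as the one in the command above.
# $1 = container name, $2 = memory limit (for example 1500Mi).
memory_limit_patch() {
  container="$1"
  memory="$2"
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","resources":{"limits":{"memory":"%s"}}}]}}}}' \
    "$container" "$memory"
}

# Example: reproduce the Deployment patch from this article, then wait for the rollout.
#   kubectl -n vmware-system-nsop patch deployments vmware-system-nsop-controller-manager \
#     -p "$(memory_limit_patch manager 1500Mi)"
#   kubectl -n vmware-system-nsop rollout status deployment/vmware-system-nsop-controller-manager
```

Once the rollout completes, the pods should leave the CrashLoopBackOff state if the new limit is sufficient for the environment.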