CoreDNS pods are in a CrashLoopBackOff state due to an OOMKilled error

Article ID: 421757

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • CoreDNS pods repeatedly restart in a CrashLoopBackOff state, with the container's last termination reason reported as OOMKilled (a quick check follows this list).
  • The affected CoreDNS pods are scheduled on 2 specific worker nodes.
  • No memory pressure is observed on the nodes themselves.
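
To confirm the termination reason, inspect the pod status. A minimal check, assuming the standard k8s-app=kube-dns label on the CoreDNS pods in kube-system:

    kubectl -n kube-system get pods -l k8s-app=kube-dns
    kubectl -n kube-system describe pod <coredns-pod-name> | grep -A 5 "Last State"

When the container's memory limit has been exceeded, the Last State block reports Reason: OOMKilled.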

Environment

vSphere Kubernetes Service

Cause

Memory usage during CoreDNS pod initialization spikes significantly above the standard operational baseline. Because the spike exceeds the container's configured memory limit, the container is OOMKilled and the pod enters CrashLoopBackOff, even though the nodes themselves show no memory pressure.
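
The limit that triggers the kill can be inspected directly on the Deployment. A quick check, assuming the coredns Deployment name in kube-system used throughout this article:

    kubectl -n kube-system get deployment coredns -o jsonpath='{.spec.template.spec.containers[0].resources}'

This prints the CPU and memory requests and limits currently applied to the CoreDNS container.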

Resolution

Scale down and then scale back up the replicas of the CoreDNS deployment using the following commands (a verification step follows the list):

  • Scale down: kubectl scale deployment coredns -n kube-system --replicas=0
  • Scale up: kubectl scale deployment coredns -n kube-system --replicas=2
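
To verify recovery, the replacement pods can be watched until they are Ready. A minimal check, again assuming the standard k8s-app=kube-dns label:

    kubectl -n kube-system get pods -l k8s-app=kube-dns -w

Both replicas should reach Running with READY 1/1, and the RESTARTS count should stop increasing.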

Note: A rollout restart can hang if the new Pods fail their readiness probes, because the Deployment controller will not terminate old Pods until their replacements are healthy. If a rollout is stuck due to a configuration error (such as an invalid image), scaling the Deployment to zero and back up forces a clean slate. Be aware, however, that this causes temporary DNS downtime in the cluster.
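
For comparison, this is what the rollout-based restart looks like, with a bounded wait on its progress; the 120s timeout is an arbitrary example value:

    kubectl -n kube-system rollout restart deployment coredns
    kubectl -n kube-system rollout status deployment coredns --timeout=120s

If the status command times out, the rollout is stuck and the scale-down/scale-up approach above applies.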

Additional Information

  • By default, the CoreDNS Corefile does not have the log plugin enabled, so CoreDNS does not log the DNS queries it serves.
  • To debug OOM issues, query logging can be enabled by editing the CoreDNS ConfigMap.
  • Edit the CoreDNS ConfigMap: kubectl -n kube-system edit configmap coredns

    Inside the Corefile block, add the keyword 'log' after this line:

        .:53 {

    So it becomes:

        .:53 {
            log
            errors
            health {
                lameduck 5s
            }
            ready
            ...
        }
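
Once the ConfigMap change is picked up (CoreDNS reloads the Corefile automatically if the reload plugin is present; otherwise restart the pods), the query log can be followed. A minimal example, again assuming the standard k8s-app=kube-dns label:

    kubectl -n kube-system logs -l k8s-app=kube-dns -f

Each served query is then written to the pod logs, which helps correlate query volume with memory growth leading up to an OOMKilled event. The log plugin adds overhead, so remove it once debugging is complete.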