Pods in TKGI cluster get stuck in Pending state and do not come up

Article ID: 317029

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

Symptoms:
  • Pods from a DaemonSet are stuck in the Pending state

  • Multiple DaemonSets are affected by the same issue

  • The DaemonSet pods that are not running all belong to the same worker node

  • The pods are not scheduled and do not receive an IP address

  • If you delete a pod, it is recreated and returns to the Pending state

  • Describing the pod shows the following event:

Warning FailedScheduling <unknown> default-scheduler 0/3 nodes are available: 1 Insufficient pods, 2 node(s) didn't match node selector.


Environment

VMware Tanzu Kubernetes Grid Integrated Edition 1.x

Cause

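This issue occurs when the worker node reaches its --max-pods limit. By default, Kubernetes allows no more than 110 pods per worker node, so the scheduler reports "Insufficient pods" for that node. Because each DaemonSet pod targets one specific node, the remaining nodes are rejected as not matching the node selector, and the pod stays in the Pending state.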
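
To see the pod limit each node advertises, one option is to read the allocatable pod count from the node status. The following is a minimal sketch, assuming kubectl is already pointed at the affected cluster; the function name node_pod_capacity is hypothetical and only used for illustration.

```shell
# Print each node name with the maximum number of pods it reports as allocatable.
# Assumption: kubectl is configured for the affected TKGI cluster.
# node_pod_capacity is a hypothetical helper name, not part of kubectl.
node_pod_capacity() {
  kubectl get nodes --no-headers \
    -o custom-columns=NODE:.metadata.name,MAX_PODS:.status.allocatable.pods
}

# Usage:
#   node_pod_capacity
```

A node that reports 110 here is running with the default limit described above.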

Resolution

To resolve the issue, identify the worker node that has reached the --max-pods limit and reschedule some of its pods to other nodes.

Run the following command against each worker node and check whether it reports 110 non-terminated pods:


kubectl describe node <Node_id> | grep -i pods
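
Checking nodes one at a time can be slow on larger clusters. As a rough sketch, the same count can be derived for every node at once by grouping pods by node name and ignoring pods in the Succeeded or Failed phase, which mirrors what the describe output counts as non-terminated. The helper names count_pods_per_node and nodes_at_limit are hypothetical, and the limit of 110 is the default from the Cause section; adjust it if your nodes report a different value.

```shell
# Count non-terminated pods per node.
# Input (stdin): one pod per line in the form "<nodeName> <phase>".
# Pods that are not yet scheduled show "<none>" as the node and are skipped.
count_pods_per_node() {
  awk '$1 != "<none>" && $2 != "Succeeded" && $2 != "Failed" { n[$1]++ }
       END { for (node in n) print node, n[node] }' | sort
}

# Print only the nodes whose pod count is at or above the limit.
# The first argument is the limit; it defaults to 110 (an assumption
# taken from the default --max-pods value).
nodes_at_limit() {
  limit="${1:-110}"
  count_pods_per_node | awk -v limit="$limit" '$2 + 0 >= limit + 0 { print $1 }'
}

# Usage (assumes kubectl is configured for the affected cluster):
#   kubectl get pods --all-namespaces --no-headers \
#     -o custom-columns=NODE:.spec.nodeName,PHASE:.status.phase | nodes_at_limit 110
```

Once a full node is identified, one possible way to free capacity is to cordon it with kubectl cordon <Node_id>, delete a few pods that are managed by a Deployment or ReplicaSet so that they are recreated on other nodes, and then run kubectl uncordon <Node_id>. Treat this as a suggestion rather than a prescribed procedure, and confirm that the other nodes have room before deleting anything.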