Pods in TKGI cluster get stuck in Pending state and do not come up

Article ID: 317029

Updated On: 08-27-2021

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

Symptoms:

Pods from a DaemonSet get stuck in the Pending state.
Multiple DaemonSets are affected by the same issue.
In every case, the DaemonSet pods fail to run on the same worker node.
The pods are not scheduled and do not get an IP address.
If you delete a pod, its replacement goes into the same Pending state.
Running kubectl describe pod on an affected pod shows the following event:

Warning FailedScheduling <unknown> default-scheduler 0/3 nodes are available: 1 Insufficient pods, 2 node(s) didn't match node selector.

Environment

VMware Tanzu Kubernetes Grid Integrated Edition 1.x

Cause

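The issue occurs because the worker node has reached its --max-pods limit. The kubelet default for --max-pods is 110, and Kubernetes recommends running no more than 110 pods per node. In the event above, "1 Insufficient pods" refers to the node that has reached this limit; the other two nodes are rejected because each DaemonSet pod targets one specific node.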
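To confirm this cause on a suspect node, you can compare the pod capacity the node advertises with the number of pods currently bound to it. The following is a minimal sketch, not an official TKGI procedure; the function names node_pod_limit and node_pod_count are hypothetical helpers introduced here for illustration, and the node name is passed as the first argument.

```shell
# Hypothetical helpers (not part of kubectl or TKGI).

# Print the pod capacity the node advertises as allocatable,
# which reflects the kubelet --max-pods setting.
node_pod_limit() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.pods}'
}

# Count the non-terminated pods bound to the node across all namespaces.
# Pods in the Succeeded or Failed phase are excluded.
node_pod_count() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=$1,status.phase!=Succeeded,status.phase!=Failed" \
    | wc -l | tr -d ' '
}
```

For a node named worker-1 (an example name), running node_pod_limit worker-1 and node_pod_count worker-1 and getting the same number from both would suggest that the node has no pod slots left.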

Resolution

To resolve the issue, identify the worker node that has reached the --max-pods limit and reschedule some of its pods to other nodes.

Run the following command and check whether any worker node shows 110 non-terminated pods:

kubectl describe node <Node_id> | grep -i pods
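If the cluster has many nodes, checking them one at a time can be slow. The sketch below shows one possible way to tally pods per node in a single pass and then move a pod off the full node. It rests on assumptions: pods_per_node and move_pod_off_node are hypothetical helper names, and the pod being moved is assumed to be managed by a controller such as a Deployment, so that a replacement is created automatically after deletion.

```shell
# Hypothetical helpers (not part of kubectl or TKGI).

# List the number of non-terminated pods per node, busiest node first.
# Pods not yet assigned to a node are grouped under "<none>".
pods_per_node() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector 'status.phase!=Succeeded,status.phase!=Failed' \
    -o custom-columns='NODE:.spec.nodeName' \
    | sort | uniq -c | sort -rn
}

# Move one controller-managed pod off a full node.
# Arguments: node name, namespace, pod name.
# The node is cordoned first so the replacement pod is placed elsewhere.
move_pod_off_node() {
  kubectl cordon "$1" &&
    kubectl delete pod "$3" --namespace "$2"
}
```

After enough pods have been moved and the Pending DaemonSet pods are running, kubectl uncordon with the node name makes the node schedulable again. Deleting a pod that is not managed by a controller removes it permanently, so check the pod's owner before using this approach.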
