Pods remain in Pending state with "Insufficient memory" and "FailedScheduling"

Article ID: 429387

Updated On:

Products

VMware Telco Cloud Automation

Issue/Introduction

  • System or application pods may remain in a Pending state.
  • The kubectl describe pod command reveals a FailedScheduling event with messages indicating that although numerous nodes exist in the cluster, none are suitable candidates.
  • Symptoms:
    • Pod status is Pending.
    • Events report: 0/xxx nodes are available: 1 Insufficient memory, xxx node(s) didn't match Pod's node affinity/selector.
    • High memory reservation on specific worker nodes.

Environment

TCA 3.2
TKG 2.5.2

Cause

The issue is caused by resource saturation combined with strict scheduling constraints:

  • Node Affinity/Selectors: The Pods are configured with nodeAffinity or nodeSelector that restricts them to a specific node or a very small subset of nodes. This explains why the scheduler ignores the majority of the available nodes.
  • Resource Reservation (Requests): Kubernetes schedules pods based on Requests, not actual consumption. On the candidate node(s), a single large application pod may have a very high memory request (e.g., ~92GB), consuming nearly 100% of the Node Allocatable capacity.
  • Insufficient Allocatable Memory: Once the large pod is scheduled, the remaining "Allocatable" memory is less than what the system DaemonSets require, preventing them from starting.
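
The request-based fit check described above can be illustrated with a minimal sketch. It is not the scheduler's actual implementation; the function name and the node figures (94 GiB allocatable, one pod requesting 92 GiB) are hypothetical values chosen to mirror the scenario in this article.

```python
from typing import Sequence

GIB = 1024 ** 3

def fits(allocatable_bytes: int, scheduled_requests: Sequence[int], new_request: int) -> bool:
    """Return True if a new pod's memory request fits in what is left on the node.

    Only requests are compared, which mirrors how memory is treated at
    scheduling time; actual usage on the node plays no part.
    """
    free = allocatable_bytes - sum(scheduled_requests)
    return new_request <= free

# Hypothetical node: 94 GiB allocatable, one application pod requesting 92 GiB.
allocatable = 94 * GIB
scheduled = [92 * GIB]

print(fits(allocatable, scheduled, 1 * GIB))  # True: 2 GiB is still unreserved
print(fits(allocatable, scheduled, 3 * GIB))  # False: reported as "Insufficient memory"
```

Even if the large pod is idle and using little memory, the second check fails, because its 92 GiB request remains reserved for as long as the pod is scheduled on the node.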

Resolution

To resolve this issue, increase the available "Allocatable" memory or relax the scheduling constraints.

Option 1: Resource Right-Sizing

Review the memory requests of the large application pods saturating the node.

  1. Identify the pod consuming the reservation:

    kubectl get pods -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MEM_REQ:.spec.containers[*].resources.requests.memory'

  2. If the application does not functionally require the full reservation (e.g., 92GB), update the Deployment/StatefulSet YAML to reduce resources.requests.memory to a value that reflects actual requirements.
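
As a possible aid for step 1, the sketch below ranks the pods on one node by their total memory request, using the JSON that `kubectl get pods -A -o json` returns. The helper names (`to_bytes`, `memory_requests`, `load_pods_from_cluster`) are hypothetical, and the quantity parsing is simplified: it sums regular containers only and ignores init containers, pod overhead, milli-units, and exponent notation.

```python
import json
import subprocess

# Binary and decimal suffixes used in Kubernetes memory quantities
# (simplified: milli-units and exponent notation are not handled).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}

def to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '92Gi' or '512Mi' to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)

def memory_requests(pod_list: dict, node_name: str) -> list:
    """Return (bytes, 'namespace/name') pairs for pods on node_name, largest first."""
    rows = []
    for pod in pod_list.get("items", []):
        if pod["spec"].get("nodeName") != node_name:
            continue
        # Sum the memory requests of the regular containers; a container
        # without a memory request counts as zero.
        total = sum(
            to_bytes(c.get("resources", {}).get("requests", {}).get("memory", "0"))
            for c in pod["spec"]["containers"]
        )
        rows.append((total, f'{pod["metadata"]["namespace"]}/{pod["metadata"]["name"]}'))
    return sorted(rows, reverse=True)

def load_pods_from_cluster() -> dict:
    """Fetch all pods as parsed JSON; requires kubectl access to the cluster."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-A", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `memory_requests(load_pods_from_cluster(), "<node-name>")` with the name of the saturated worker node should list the largest reservation first, which is the pod to review for right-sizing.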

Option 2: Vertical Scaling of Node Pool

If the application legitimately requires the high memory reservation:

  1. Increase the memory allocated to the Worker Node VMs.

  2. Ensure that the new memory size is at least: Required Application Memory + System Overhead (Kubelet/OS) + DaemonSet Requests.
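
The sizing rule in step 2 can be worked through as simple arithmetic. The figures below (4 GiB of system overhead and three DaemonSet pods totalling 1 GiB) are hypothetical placeholders; substitute the values observed in your own cluster.

```python
def required_node_memory_gib(app_request_gib, system_overhead_gib, daemonset_requests_gib):
    """Minimum worker node memory in GiB: application + system overhead + DaemonSets."""
    return app_request_gib + system_overhead_gib + sum(daemonset_requests_gib)

# Hypothetical inputs: a 92 GiB application request, 4 GiB reserved for the
# kubelet and OS, and three DaemonSet pods requesting 0.5, 0.25 and 0.25 GiB.
needed = required_node_memory_gib(92, 4, [0.5, 0.25, 0.25])
print(f"Size each worker node VM to at least {needed} GiB")  # 97.0 GiB
```

Treat the result as a lower bound; leaving some headroom above it avoids hitting the same limit when another DaemonSet is added later.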

Option 3: Adjust Scheduling Constraints

If the Pending pods do not strictly need to be on the saturated node:

  1. Review the nodeAffinity or nodeSelector in the Pod spec.

  2. Relax the constraints to allow these pods to schedule on other available nodes in the cluster.
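
To see how a constraint narrows the candidate set, the sketch below applies nodeSelector semantics (every key/value pair must match a node label exactly) to a set of node labels. The node names and labels are hypothetical, and the richer nodeAffinity operators (In, NotIn, Exists, and so on) are not covered.

```python
def candidate_nodes(nodes: dict, node_selector: dict) -> list:
    """Return the names of nodes whose labels satisfy every nodeSelector entry.

    nodes maps a node name to its label dictionary.
    """
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in node_selector.items())
    )

# Hypothetical cluster with three worker nodes.
nodes = {
    "worker-1": {"kubernetes.io/os": "linux", "example.com/pool": "cnf"},
    "worker-2": {"kubernetes.io/os": "linux", "example.com/pool": "general"},
    "worker-3": {"kubernetes.io/os": "linux", "example.com/pool": "general"},
}

# Strict selector: only the saturated node qualifies.
print(candidate_nodes(nodes, {"example.com/pool": "cnf"}))
# Relaxed selector: every Linux node becomes a candidate.
print(candidate_nodes(nodes, {"kubernetes.io/os": "linux"}))
```

With the strict selector, the pod depends entirely on free memory on `worker-1`; relaxing it gives the scheduler two further nodes to choose from.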

Additional Information

For more information on how Kubernetes calculates available memory, see Reserve Compute Resources for System Daemons.