Pods remain in Pending state with "Insufficient memory" and "FailedScheduling"

VMware Telco Cloud Automation

System or application pods may remain in a Pending state.
The kubectl describe pod command reveals a FailedScheduling event with messages indicating that although numerous nodes exist in the cluster, none are suitable candidates.
Symptoms:
- Pod status is Pending.
- Events report: 0/xxx nodes are available: 1 Insufficient memory, xxx node(s) didn't match Pod's node affinity/selector.
- High memory reservation on specific worker nodes.
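
The symptoms above can be confirmed from the command line. The following is a minimal sketch that wraps standard kubectl subcommands in small helper functions. The function names and the example pod, namespace, and node arguments are illustrative assumptions, not names from this environment.

```shell
# Hypothetical helper names; pass your own pod, namespace, and node names.

# List every pod currently in the Pending phase, across all namespaces.
list_pending_pods() {
  kubectl get pods -A --field-selector=status.phase=Pending
}

# Show only the FailedScheduling events recorded for one pod.
# Usage: show_scheduling_events <pod-name> <namespace>
show_scheduling_events() {
  kubectl get events -n "$2" \
    --field-selector "involvedObject.name=$1,reason=FailedScheduling"
}

# Print the node description, which includes Capacity, Allocatable,
# and the "Allocated resources" summary of memory requests.
# Usage: show_node_reservation <node-name>
show_node_reservation() {
  kubectl describe node "$1"
}
```

For example, `show_node_reservation worker-1` (a hypothetical node name) should show whether memory requests on that node are close to 100% of its allocatable memory.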

TCA 3.2
TKG 2.5.2

The issue is caused by resource saturation combined with strict scheduling constraints:

Node Affinity/Selectors: The Pods are configured with nodeAffinity or nodeSelector that restricts them to a specific node or a very small subset of nodes. This explains why the scheduler ignores the majority of the available nodes.
Resource Reservation (Requests): Kubernetes schedules pods based on Requests, not actual consumption. On the candidate node(s), a single large application pod may have a very high memory request (e.g., ~92GB), consuming nearly 100% of the Node Allocatable capacity.
Insufficient Allocatable Memory: Once the large pod is scheduled, the remaining "Allocatable" memory is less than what the system DaemonSets require, preventing them from starting.
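
As a rough worked illustration of the three points above: Kubernetes derives a node's allocatable memory from its capacity minus the amounts reserved for the kubelet, the operating system, and the eviction threshold. The 92 GB request is the example from this article. The 93 GB allocatable figure and the 2 GB DaemonSet request are hypothetical values chosen only to show the arithmetic, and the large pod is assumed to be the only workload already on the node.

$$\text{Allocatable} = \text{Capacity} - \text{kube-reserved} - \text{system-reserved} - \text{eviction threshold}$$

$$\text{Free for scheduling} = \text{Allocatable} - \sum \text{Requests} = 93\,\text{GB} - 92\,\text{GB} = 1\,\text{GB} < 2\,\text{GB}$$

Under these assumptions the DaemonSet pod's 2 GB request exceeds the 1 GB that remains, so the scheduler reports Insufficient memory for that node even if the large pod's actual usage is far below its request.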

To resolve this issue, increase the available "Allocatable" memory or relax the scheduling constraints.

Review the memory requests of the large application pods saturating the node.

Identify the pod consuming the reservation:

kubectl get pods -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MEM_REQ:.spec.containers[*].resources.requests.memory'
If the application does not functionally require the full reservation (e.g., 92GB), update the Deployment/StatefulSet YAML to reduce resources.requests.memory to a value that reflects actual requirements.
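
As an imperative alternative to editing the YAML by hand, the request can be lowered with `kubectl set resources`. This is a sketch only: the function name and all argument values are hypothetical, and the right request size depends on the application. Changing the pod template normally triggers a rollout of the workload, and if the workload is deployed from a Helm chart or other managed source, the same change should also be made there so that it is not overwritten later.

```shell
# Hypothetical helper: lower the memory request of one container in a
# Deployment or StatefulSet.
# Usage: reduce_memory_request <kind> <name> <namespace> <container> <new-request>
# Example (all values are placeholders):
#   reduce_memory_request deployment big-app demo app 64Gi
reduce_memory_request() {
  kubectl set resources "$1" "$2" -n "$3" -c "$4" --requests="memory=$5"
}
```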

If the application legitimately requires the high memory reservation:

Increase the memory allocated to the Worker Node VMs.
Ensure that the new memory size is at least: $Required\,Application\,Memory + System\,Overhead\,(Kubelet/OS) + DaemonSet\,Requests$ .
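
After resizing the worker node VMs, it may help to confirm that the node reports the larger capacity and allocatable memory. The sketch below uses the standard `custom-columns` output of kubectl; the function name and the node name in the example are assumptions.

```shell
# Hypothetical helper: print capacity and allocatable memory for one node.
# Usage: show_node_memory <node-name>
# Example (placeholder node name):
#   show_node_memory worker-1
show_node_memory() {
  kubectl get node "$1" \
    -o custom-columns='NAME:.metadata.name,CAPACITY:.status.capacity.memory,ALLOCATABLE:.status.allocatable.memory'
}
```

The ALLOCATABLE value, not CAPACITY, is the figure the scheduler compares against the sum of pod memory requests.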

If the Pending pods do not strictly need to be on the saturated node:

Review the nodeAffinity or nodeSelector in the Pod spec.
Relax the constraints so that these pods can be scheduled on other available nodes in the cluster.
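
One way to carry out this review is to print the constraints from the pod spec and then list the nodes that currently satisfy a given label selector. This is a minimal sketch; the function names, the pod and namespace names, and the `pool=high-mem` label are hypothetical.

```shell
# Hypothetical helper: print the nodeSelector and nodeAffinity of a pod.
# Usage: show_pod_constraints <pod-name> <namespace>
show_pod_constraints() {
  kubectl get pod "$1" -n "$2" \
    -o custom-columns='NODE_SELECTOR:.spec.nodeSelector,NODE_AFFINITY:.spec.affinity.nodeAffinity'
}

# Hypothetical helper: list the nodes that match a label selector.
# Usage: list_matching_nodes <label-selector>
# Example (placeholder label):
#   list_matching_nodes pool=high-mem
list_matching_nodes() {
  kubectl get nodes -l "$1"
}
```

If only one or two nodes match the selector, the pod has no fallback when those nodes are saturated, which is consistent with the "didn't match Pod's node affinity/selector" count in the scheduling event.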

For more information on how Kubernetes calculates allocatable memory, see Reserve Compute Resources for System Daemons in the Kubernetes documentation.
