Node Unresponsive / Not Ready due to Resource Pool limits

Article ID: 418572

Products

VMware vCenter Server

Issue/Introduction

A Kubernetes node (VM) deployed in a vSphere environment, often orchestrated through Telco Cloud Automation (TCA) or Tanzu Kubernetes Grid (TKG), enters an Unresponsive or Not Ready state. This occurs when the VM cannot acquire the CPU or memory it requires because the vSphere Resource Pool (RP) in which it resides is configured with strict limits that are being met or exceeded.

Environment

vCenter Server 7.x, 8.x

TCA 3.x

Cause

When the Resource Pool is defined with fixed limits and the aggregate demand from the VMs within it (especially during high-load periods or auto-scaling events) exceeds the Resource Pool's configured Reservation or Limit, the vSphere Admission Control mechanism restricts resource availability. The VMs are then forced to contend for the limited resources, which results in CPU and memory throttling and in the VMs' vCPUs being placed in "adoption mode" (a state indicating severe resource starvation). The node becomes unresponsive as a result.
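
The condition described above can be illustrated with a small sketch. It is not a vSphere API call; `PoolSnapshot`, `headroom`, and `is_starved` are hypothetical names, and the figures are made-up sample values. The only convention borrowed from vSphere is that a limit of -1 means "unlimited".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PoolSnapshot:
    """Hypothetical snapshot of one resource dimension (CPU in MHz or memory in MB)."""
    limit: int            # configured limit; -1 means unlimited
    vm_demand: List[int]  # current demand of each VM in the pool


def headroom(pool: PoolSnapshot) -> Optional[int]:
    """Return the capacity left under the limit, or None if the pool is unlimited."""
    if pool.limit < 0:
        return None
    return pool.limit - sum(pool.vm_demand)


def is_starved(pool: PoolSnapshot) -> bool:
    """True when aggregate VM demand meets or exceeds a fixed limit."""
    room = headroom(pool)
    return room is not None and room <= 0


# Sample values only: three nodes demanding more CPU than the pool allows.
cpu = PoolSnapshot(limit=8000, vm_demand=[3000, 3000, 2500])
print(headroom(cpu), is_starved(cpu))
```

With these sample values the aggregate demand (8500 MHz) is 500 MHz above the limit, which is the situation in which the nodes start contending for resources.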

Resolution

  1. Identify the Target Resource Pool (RP): Locate the Resource Pool associated with the failing Kubernetes cluster (the RP defined during the TCA/TKG cluster deployment).
  2. Right-click the Resource Pool and select Edit Resource Settings.
  3. Ensure that Expandable Reservation is enabled for both CPU and Memory. (This allows the RP to use unreserved resources from its parent (the cluster) if its own guaranteed resources are depleted.)
  4. Enable Scalable Shares. (This lets the shares allocated to the RP scale with the number of VMs it contains.)
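
The settings from the steps above can also be checked with a script. The following is a minimal sketch, not an official procedure. It assumes the pool configuration is exposed with the attribute names `cpuAllocation`, `memoryAllocation`, `expandableReservation`, and `scaleDescendantsShares`, and that the enabled value for Scalable Shares is `"scaleCpuAndMemoryShares"`. These names should be verified against the vSphere API reference for the version in use. The function `audit_pool_config` is hypothetical, and the example object is a stand-in, not data read from vCenter.

```python
from types import SimpleNamespace
from typing import List


def audit_pool_config(config) -> List[str]:
    """Return the settings that differ from the resolution steps.

    `config` is duck-typed: any object exposing cpuAllocation, memoryAllocation
    and scaleDescendantsShares attributes (assumed names, see the note above).
    """
    findings = []
    if not config.cpuAllocation.expandableReservation:
        findings.append("CPU: Expandable Reservation is not enabled")
    if not config.memoryAllocation.expandableReservation:
        findings.append("Memory: Expandable Reservation is not enabled")
    # Assumed value for "Scalable Shares enabled"; anything else is reported.
    if getattr(config, "scaleDescendantsShares", None) != "scaleCpuAndMemoryShares":
        findings.append("Scalable Shares is not enabled")
    return findings


# Stand-in object for illustration only; real values would come from vCenter.
example = SimpleNamespace(
    cpuAllocation=SimpleNamespace(expandableReservation=False),
    memoryAllocation=SimpleNamespace(expandableReservation=True),
    scaleDescendantsShares="disabled",
)
for finding in audit_pool_config(example):
    print(finding)
```

An empty list means the pool already matches steps 3 and 4. The sketch only reads settings; any change should still be made through Edit Resource Settings as described above.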

Additional Information

For more information, refer to the resource pool configuration settings Tech Doc.