Problem:
After migrating the application from Kubernetes cluster version 1.25 to 1.30, a significant slowdown was observed in the application’s processing queue. Tasks that previously completed within expected timeframes were delayed, causing downstream bottlenecks and degrading overall system responsiveness. The upgrade also changed the node pool architecture: many smaller worker nodes (16 vCPUs × ~25 nodes) in 1.25 were replaced with fewer, oversized nodes (32 vCPUs × ~5 nodes) in 1.30 due to NSX load balancer limitations. Despite maintaining the same number of replicas, throughput dropped noticeably after the migration.
How to Check in esxtop:
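Interactively, run esxtop on the affected ESXi host, press c for the CPU panel and V to show only virtual machine worlds, then read the %RDY and %CSTP columns for the worker VMs. For a longer observation window, esxtop batch mode (esxtop -b -d 5 -n 60 > esxtop.csv) can be captured and parsed offline. The script below is a minimal sketch of that offline check: the capture file name, the VM name worker-node-01, and the "% Ready" / "% CoStop" column labels are assumptions and can vary by ESXi version.

#!/usr/bin/env python3
"""Summarise % Ready and % CoStop for one VM from an esxtop batch capture."""
import csv
import statistics

ESXTOP_CSV = "esxtop.csv"        # hypothetical batch-mode capture file
VM_NAME = "worker-node-01"       # hypothetical 32 vCPU worker VM

def vm_cpu_columns(header, vm_name):
    """Return the indexes of the Group Cpu '% Ready' / '% CoStop' columns for the VM."""
    cols = {}
    for i, name in enumerate(header):
        if "Group Cpu" in name and vm_name in name:
            if "% Ready" in name:
                cols["ready"] = i
            elif "% CoStop" in name:
                cols["costop"] = i
    return cols

with open(ESXTOP_CSV, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    cols = vm_cpu_columns(header, VM_NAME)
    if not cols:
        raise SystemExit(f"No Group Cpu columns found for {VM_NAME}")
    ready, costop = [], []
    for row in reader:
        if "ready" in cols and row[cols["ready"]]:
            ready.append(float(row[cols["ready"]]))
        if "costop" in cols and row[cols["costop"]]:
            costop.append(float(row[cols["costop"]]))

# Sustained high %RDY together with non-zero %CSTP on the large worker VMs
# indicates CPU scheduling contention on the host.
if ready:
    print(f"%RDY  avg={statistics.mean(ready):.1f}  max={max(ready):.1f}")
if costop:
    print(f"%CSTP avg={statistics.mean(costop):.1f}  max={max(costop):.1f}")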
Root Cause:
The performance regression was caused by oversized worker VMs. The 32 vCPU nodes introduced severe CPU scheduling contention at the ESXi layer, evidenced by CPU Ready times of ~54% and non-zero Co-Stop values. Because ESXi must keep a large VM’s vCPUs scheduled roughly in step, the hypervisor could not allocate CPU cycles efficiently: vCPUs spent time queued for physical cores (Ready) or were deliberately held back (Co-Stop). Smaller nodes (14–19 vCPUs) on the same hosts did not exhibit this behavior, confirming node size as the root cause.
Resolution:
To resolve the issue, create a new node pool of smaller VMs, ideally ≤16 vCPUs per worker node, and scale out horizontally rather than up. After resizing, monitor CPU Ready times in vCenter/ESXi to ensure they drop below 5% (ideally <2%). This configuration alleviates CPU contention, improves pod scheduling, restores throughput, and resolves the observed application slowdown. VMware’s vSphere 8.0 Performance Best Practices recommend right-sizing VMs to avoid oversized CPU allocations.
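Note that vCenter performance charts report CPU Ready as a millisecond summation per sampling interval, not as a percentage. A small conversion makes the chart values comparable to the <5% / <2% targets above; the sketch below assumes the real-time chart (20-second interval) and divides by vCPU count on the assumption that the VM-level counter aggregates all vCPUs (other chart roll-ups use longer intervals, e.g. 300 s for daily).

"""Convert a vCenter 'CPU Ready' summation value (ms) to a percentage."""

def cpu_ready_percent(ready_ms: float, interval_s: int = 20, vcpus: int = 1) -> float:
    """ready_ms: CPU Ready summation for one sample.
    vcpus: divide to get a per-vCPU figure (assumes the VM-level counter
    is the sum across all vCPUs)."""
    return ready_ms / (interval_s * 1000) / vcpus * 100

# Example: a 16 vCPU worker showing 3200 ms Ready per 20 s sample
# -> 1.0% per vCPU, comfortably under the <5% (ideally <2%) target.
print(f"{cpu_ready_percent(3200, vcpus=16):.1f}%")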