Troubleshooting CPU Throttling Related Issues On TKG Clusters Deployed By TCA/TKGm

Article ID: 387150

Updated On: 01-31-2025

Products

VMware Telco Cloud Automation

Issue/Introduction

This knowledge base article outlines common CPU-related issues encountered in TKG environments and provides troubleshooting steps to diagnose and resolve them.

Environment

2.x, 3.x

Resolution

  1. Connect to the cluster showing high CPU usage and identify the affected pod(s):

    kubectl top pods -n <namespace>: This command shows current CPU and memory usage for pods in a specific namespace. It requires the Metrics API, typically provided by metrics-server. Identify pods with consistently high CPU usage; these are the candidates for throttling.

    kubectl describe pod <pod-name> -n <namespace>: Examine the pod's details, including resource requests and limits, events, and container status. Look for warnings or errors that can accompany CPU starvation, such as container restarts or failed probes.
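As a rough aid for this step, the shell sketch below filters kubectl top pods output for pods at or above a CPU threshold. The function name high_cpu_pods and the 500m default are illustrative choices, not part of kubectl, and the sketch assumes the three-column layout (name, CPU, memory) printed with --no-headers.

```shell
# Hypothetical helper: reads "kubectl top pods --no-headers" output on stdin
# and prints pods whose CPU usage is at or above a threshold in millicores.
high_cpu_pods() {
  awk -v t="${1:-500}" '
    {
      cpu = $2
      if (cpu ~ /m$/) { sub(/m$/, "", cpu); mc = cpu + 0 }   # 750m -> 750
      else            { mc = cpu * 1000 }                    # 2 -> 2000
      if (mc >= t) print $1, mc "m"
    }'
}

# Usage (requires cluster access):
#   kubectl top pods -n <namespace> --no-headers | high_cpu_pods 500
```

Running the pipeline several times over a few minutes helps separate sustained load from a short spike.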

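  2. Investigate CPU Throttling:

    kubectl top pods -n <namespace>: Observe the CPU usage and compare it to the CPU limits defined for the pod's containers. If the usage stays at or close to the limit, throttling is likely occurring.

    kubectl describe pod <pod-name> -n <namespace>: Confirm the CPU limit of each container and look for secondary symptoms such as failed liveness or readiness probes and restarts. Kubernetes does not record an event when a container is CPU-throttled. Throttling is counted by the kernel in the container's cgroup cpu.stat file (nr_periods, nr_throttled, and the throttled time) and is exposed by cAdvisor as the container_cpu_cfs_throttled_periods_total metric.

    Check Container Logs: Run kubectl logs <pod-name> -n <namespace>. Application logs may show timeouts, slow requests, or other signs of performance degradation caused by resource constraints.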
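One way to put a number on throttling is to read the container's cgroup cpu.stat file and compute the share of CFS periods in which it was throttled. The sketch below is a minimal example; the function name throttle_pct is hypothetical, and the file path is an assumption that depends on whether the node uses cgroup v1 or v2.

```shell
# Hypothetical helper: reads the contents of a cgroup cpu.stat file on stdin
# and prints the percentage of CFS periods in which the cgroup was throttled.
throttle_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", 100 * throttled / periods
      else print "n/a"   # no CPU limit set, or counters not present
    }'
}

# Usage (requires cluster access; the path is an assumption):
#   cgroup v2: kubectl exec <pod-name> -n <namespace> -- cat /sys/fs/cgroup/cpu.stat | throttle_pct
#   cgroup v1: kubectl exec <pod-name> -n <namespace> -- cat /sys/fs/cgroup/cpu/cpu.stat | throttle_pct
```

The counters are cumulative since the container started, so comparing two readings taken a minute apart gives a better picture of current throttling than a single value.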

  3. Analyze High CPU Usage:

    kubectl exec -it <pod-name> -n <namespace> -- bash: Open a shell in the container (use sh if the image does not include bash) and use tools such as top, htop, or perf, where they are installed, to identify the processes consuming the most CPU.

    Profiling Tools: Use profiling tools to identify performance bottlenecks within the application code.

    Resource Limits: Ensure that each container in the pod has an appropriate CPU limit set. If not, set one to prevent runaway processes from consuming all available CPU on the node.
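When an interactive top is not available, a process listing can be ranked without entering the container. The sketch below sorts ps-style output by its CPU column; the function name top_cpu_procs is hypothetical, and the usage line assumes the image ships a ps that accepts -eo pid,pcpu,comm (procps does, minimal images may not).

```shell
# Hypothetical helper: reads "ps -eo pid,pcpu,comm" output (with header) on
# stdin and prints the N rows with the highest value in the second column.
top_cpu_procs() {
  tail -n +2 |                      # drop the header row
    LC_ALL=C sort -k2,2nr |         # numeric sort on the CPU column, descending
    head -n "${1:-5}"               # keep the top N rows (default 5)
}

# Usage (requires cluster access and ps inside the image):
#   kubectl exec <pod-name> -n <namespace> -- ps -eo pid,pcpu,comm | top_cpu_procs 5
```

Note that the pcpu value reported by ps is an average over the lifetime of each process, so it complements rather than replaces a live view from top.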

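  4. Verify CPU Requests and Limits:

    kubectl describe pod <pod-name> -n <namespace>: Review the resource requests and limits of each container in the pod specification.

    Adjust Requests and Limits: Requests determine scheduling and the container's share of CPU under contention; limits cap how much CPU the container may use. If the requests are too low, the pod may not get enough CPU to function correctly on a busy node. If the limits are too low, the container is throttled even when the node has idle CPU. If the limits are too high, the pod may consume resources that are needed by other pods. Adjust these values based on the application's measured needs.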
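To see why a low limit causes throttling, it can help to translate a CPU limit into the CFS quota the kernel enforces. The sketch below does that arithmetic; both function names are hypothetical, and it assumes the default 100 ms (100000 microsecond) CFS period, which a cluster may override.

```shell
# Hypothetical helper: converts a Kubernetes CPU quantity (500m, 2, 1.5)
# to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;
  esac
}

# Hypothetical helper: prints the CFS quota in microseconds per period for a
# given CPU limit. The second argument is the period (default 100000).
cfs_quota_us() {
  echo $(( $(cpu_to_millicores "$1") * ${2:-100000} / 1000 ))
}

# Example: cfs_quota_us 500m   prints 50000
```

With a 500m limit the container may run for 50 ms of CPU time in each 100 ms period; once that is used up, it is paused until the next period begins, which the application experiences as latency.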

  5. Check Node CPU Saturation:

    kubectl top nodes: This command shows CPU and memory usage for all nodes in the cluster. Identify nodes with consistently high CPU utilization.   

    Investigate Node Processes: If a node is saturated, investigate the processes running on the node to identify the source of the high CPU usage. This may involve logging into the node directly.

    Scale Up or Add Nodes: If the node is consistently overloaded, consider scaling up the existing nodes or adding more nodes to the cluster.
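The node check above can be narrowed to the nodes that matter with a small filter over kubectl top nodes output. This is a sketch only: the function name saturated_nodes and the 80 percent default are illustrative, and it assumes the CPU percentage is the third column of the --no-headers output.

```shell
# Hypothetical helper: reads "kubectl top nodes --no-headers" output on stdin
# and prints nodes whose CPU utilization is at or above a percentage threshold.
saturated_nodes() {
  awk -v t="${1:-80}" '
    {
      pct = $3
      sub(/%$/, "", pct)
      # Skip rows without a numeric value (for example <unknown>).
      if (pct ~ /^[0-9]+$/ && pct + 0 >= t) print $1, pct "%"
    }'
}

# Usage (requires cluster access):
#   kubectl top nodes --no-headers | saturated_nodes 80
```

For any node it reports, kubectl describe node <node-name> lists the pods scheduled there together with their CPU requests and limits, which shows whether the node is overcommitted.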

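  6. Check Resource Quotas:

    kubectl get resourcequota -n <namespace>: List the resource quotas defined in the namespace, if any.

    kubectl describe resourcequota <resource-quota-name> -n <namespace>: Compare the used and hard values for requests.cpu and limits.cpu to see whether the quota is limiting the CPU resources available to the namespace.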
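To judge how close a namespace is to its CPU quota, the used and hard values can be turned into a percentage. The sketch below is a rough example; the function name quota_cpu_usage is hypothetical, and it assumes the describe output contains rows of three columns (resource, used, hard).

```shell
# Hypothetical helper: reads "kubectl describe resourcequota" output on stdin
# and prints CPU quota consumption as a percentage of the hard limit.
quota_cpu_usage() {
  awk '
    # Convert a CPU quantity such as 1500m or 2 to millicores.
    function mc(q) {
      if (q ~ /m$/) { sub(/m$/, "", q); return q + 0 }
      return q * 1000
    }
    # Only rows like "limits.cpu 2 4" or "requests.cpu 1500m 2" are considered.
    NF == 3 && $1 ~ /cpu$/ {
      used = mc($2); hard = mc($3)
      if (hard > 0) printf "%s %d%%\n", $1, 100 * used / hard
    }'
}

# Usage (requires cluster access):
#   kubectl describe resourcequota <resource-quota-name> -n <namespace> | quota_cpu_usage
```

A value near 100% for requests.cpu or limits.cpu means new pods in the namespace may be rejected, or existing workloads cannot be given larger CPU settings, until the quota is raised.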