This article provides a guide to updating the CPU and memory values for control plane nodes in a legacy cluster.
This change may be required in an environment where control plane node CPU and memory utilization is too high and control plane node crashes are occurring.
TKGm
In this scenario there was a single crashing control plane node.
The control plane node was seen crashing with 100% CPU, 0% memory and 0% network utilization, and with no IP address assigned.
On the cluster's other control plane nodes it was noted that CPU utilization was consistently oscillating between low and high, and the etcd database was relatively large, in the 100-500 MB range.
The idea of the below procedure is to create a new VSphereMachineTemplate with the required CPU and memory values, and then to update the KCP to reference the newly created VSphereMachineTemplate with the amended CPU and memory requirements.
Note: this will recreate the control plane nodes of the legacy cluster you are making the change on.
Connect to the management cluster context and start by checking the KCP for the cluster's VSphereMachineTemplate name and namespace.
This can be done with the below command:
kubectl get kcp <cluster-name>-control-plane -oyaml | grep -A 2 -Ei "kind: VSphereMachineTemplate"
The output of above will look similar to below:
kind: VSphereMachineTemplate
name: <cluster-name>-control-plane
namespace: default
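If you are unsure of the exact KCP name or namespace, one option is to first list all KubeadmControlPlane objects across namespaces, for example:
kubectl get kcp -A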
Next check the VSphereMachineTemplate using the name from the above output with the below command:
kubectl get VSphereMachineTemplate <cluster-name>-control-plane -oyaml | less
From the output YAML you will see the name, CPU and memory values that need to be changed:
kind: VSphereMachineTemplate
metadata:
  annotations:
    vmTemplateMoid: vm-xxxx
  creationTimestamp: "2024-12-17T11:40:02Z"
  generation: 1
  name: <cluster-name>-control-plane
...
      memoryMiB: xxxx
...
      numCPUs: x
Note: in your environment memoryMiB may be 8192 (8 GiB) and numCPUs may be 2.
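memoryMiB is expressed in mebibytes, so the target values work out as: 8 GiB = 8 x 1024 = 8192 MiB, and 16 GiB = 16 x 1024 = 16384 MiB.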
To scale up (scale vertically), simply follow the below instructions.
Create a new VSphereMachineTemplate YAML with the desired CPU and memory changes by exporting the existing template:
kubectl get VSphereMachineTemplate <cluster-name>-control-plane -oyaml > <cluster-name>-control-plane-cpu8-mem16.yaml
Next, open this YAML file in vi and make the required changes:
vi <cluster-name>-control-plane-cpu8-mem16.yaml
In the YAML file, update the VSphereMachineTemplate name to <cluster-name>-control-plane-cpu8-mem16, numCPUs to 8, and memoryMiB to 16384.
Also remove the kubectl.kubernetes.io/last-applied-configuration: annotation and the line beneath it, as per the example below.
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"infrastructure.cluster.x-k8s.io/v1beta1","kind":"VSphereMachineTemplate","metadata":{"annotations":{"vmTemplateMoid":"vm-1268"},"name":"<cluster-name>-control-plane","namespace":"default"},"spec":{"template":{"spec":{"cloneMode":"fullClone","datacenter":"/Datacenter","datastore":"/Datacenter/datastore/vsanDatastore","diskGiB":60,"folder":"/Datacenter/vm/env11","memoryMiB":4096,"network":{"devices":[{"dhcp4":true,"networkName":"/Datacenter/network/vsanSwitch-xxxxxxx"}]},"numCPUs":4,"resourcePool":"/Datacenter/host/Cluster/Resources/xxxx","server":"xxx-xxxx-xxxx.xxxx.xxxxx.xxxxx","storagePolicyName":"","template":"/Datacenter/vm/tkg/photon-5-kube-v1-28-11+vmware-2-tkg-2-bc1be57677254736xxxxxxxxxxxxxxxx"}}}}
Next, save the VSphereMachineTemplate YAML by typing :wq! while in vi command mode; this will write the changes and close the file.
Now apply the created VSphereMachineTemplate YAML as below:
kubectl apply -f <cluster-name>-control-plane-cpu8-mem16.yaml
You will see output similar to below:
vspheremachinetemplate.infrastructure.cluster.x-k8s.io/<cluster-name>-control-plane-cpu8-mem16 created
Check that the VSphereMachineTemplate has been created using the below command:
kubectl get VSphereMachineTemplate
You will see the below output:
NAME                                      AGE
<cluster-name>-control-plane              32m
<cluster-name>-control-plane-cpu8-mem16   12s
<cluster-name>-worker                     32m
Confirm the new template contains the required changes with the below command:
kubectl get VSphereMachineTemplate <cluster-name>-control-plane-cpu8-mem16 -oyaml | less
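As a quicker alternative, you can grep for just the changed fields, for example:
kubectl get VSphereMachineTemplate <cluster-name>-control-plane-cpu8-mem16 -oyaml | grep -Ei "numCPUs|memoryMiB"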
Next, export the KCP to a YAML file, which you will then edit in vi:
kubectl get kcp <cluster-name>-control-plane -oyaml > kcp-<cluster-name>-control-plane-cpu8-mem16.yaml
Open the KCP YAML in vi to make the required changes:
vi kcp-<cluster-name>-control-plane-cpu8-mem16.yaml
Update the name of the VSphereMachineTemplate in the KCP to <cluster-name>-control-plane-cpu8-mem16.
Then save the changes by typing :wq! while in vi command mode; this will write the changes and close the file.
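For reference, the template reference in a KubeadmControlPlane object typically sits under spec.machineTemplate.infrastructureRef, so the edited section of the KCP YAML should look similar to the sketch below:
spec:
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: VSphereMachineTemplate
      name: <cluster-name>-control-plane-cpu8-mem16
      namespace: default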
Finally, apply this updated KCP to the management cluster:
kubectl apply -f kcp-<cluster-name>-control-plane-cpu8-mem16.yaml
The output will look similar to below:
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/<cluster-name>-control-plane configured
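The KCP controller will now roll the control plane node(s) onto the new template. From the management cluster context you can watch the rollout progress, for example:
kubectl get kcp <cluster-name>-control-plane
kubectl get machines -n default
The existing control plane Machine will be replaced by a new Machine created from the cpu8-mem16 template.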
Check that the KCP now has the correct VSphereMachineTemplate name:
kubectl get kcp <cluster-name>-control-plane -oyaml | grep -A 2 -Ei "kind: VSphereMachineTemplate"
The output will look as per below:
kind: VSphereMachineTemplate
name: <cluster-name>-control-plane-cpu8-mem16
namespace: default
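If you do not already have the workload cluster's kubeconfig locally, one way to retrieve and switch to it (assuming the Tanzu CLI and the default admin context naming) is:
tanzu cluster kubeconfig get <cluster-name> --admin
kubectl config use-context <cluster-name>-admin@<cluster-name>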
Switch to the workload cluster's context and confirm the control plane node has been recreated recently:
kubectl get nodes
You will see output similar to below, showing the control plane node with a recent age.
NAME                                 STATUS   ROLES           AGE   VERSION
<cluster-name>-control-plane-ghh2n   Ready    control-plane   15s   v1.28.11+vmware.2
<cluster-name>-md-0-rfmtz-lzmtd      Ready    <none>          46m   v1.28.11+vmware.2
Confirm the control plane node has been updated with the required resources by running the below command:
kubectl get nodes <cluster-name>-control-plane-ghh2n -oyaml | grep -A 15 -Ei "allocatable"
Where the output will have a section similar to below:
  allocatable:
    cpu: "8"
    ephemeral-storage: "56888314582"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 16269092Ki
    pods: "110"
  capacity:
    cpu: "8"
    ephemeral-storage: 61727772Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 16371492Ki
    pods: "110"
Note: cpu is now 8 and memory is now approximately 16 GiB (the reported 16269092Ki is slightly below the full 16 GiB because some memory is reserved by the operating system and kubelet).
The scale-up (vertical scaling) of the control plane node(s) in your legacy cluster is now complete.