Remove the 100 MHz CPU reservation request for the nodeconfig-daemon pods.
In Telco Cloud Automation (TCA) 2.1.1, to ensure that fundamental services have sufficient resources, TCA requests a 100 MHz CPU reservation for the nodeconfig-daemon pod, compared with 0 MHz in previous TCA releases.
In some environments, most of the vCPUs on the workload nodes are isolated for Network Function pods, which leaves too little CPU for reservations. This can cause some pods to get stuck in a Pending state with the reason “Insufficient CPU.”
Before a pod starts, Kubernetes validates the resources that the pod requests. If no node has enough available resources for the pod, a FailedScheduling event is reported and the pod is marked as Pending. Describing the pod points to insufficient CPU resources as the cause.
Example:
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---- ----               -------
  Warning  FailedScheduling  31s  default-scheduler  0/2 nodes are available: 1 Insufficient cpu, 1 node(s) didn't match pod's node affinity/selector.
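The symptom shown above can be checked from any host with kubectl access to the workload cluster. The following is a minimal sketch, not part of the official procedure: the function names are hypothetical, and it assumes the default column layout of `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for spotting pods that are Pending because of CPU.

# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/name" for every pod whose STATUS column is Pending.
pending_pods() {
  awk '$4 == "Pending" { print $1 "/" $2 }'
}

# Succeed if the text on stdin (for example `kubectl describe pod` output)
# contains the scheduler's "Insufficient cpu" message.
has_insufficient_cpu() {
  grep -qi 'Insufficient cpu'
}

# Describe each Pending pod and report the ones blocked by a CPU shortage.
# Requires kubectl and a working kubeconfig; nothing runs until it is called.
report_cpu_starved_pods() {
  kubectl get pods -A --no-headers | pending_pods |
    while IFS=/ read -r ns pod; do
      if kubectl describe pod "$pod" -n "$ns" | has_insufficient_cpu; then
        echo "$ns/$pod: Insufficient cpu"
      fi
    done
}
```

After sourcing the file, running `report_cpu_starved_pods` lists the affected pods. A nodeconfig-daemon pod in that list suggests that this article applies.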
This issue will be resolved in TCA 3.0.
Remove the CPU request for the nodeconfig-daemon pod on each worker node.
1. SSH into a control plane node of the workload cluster as the capv user.
2. Remove the CPU request for the nodeconfig-daemon pod by running the following command:
curl -kfsSL 'https://vmwaresaas.jfrog.io/artifactory/cnf-generic-local/kb/20230508/remove_nc_ds_cpu_request.sh' | bash
3. Review the script output, which indicates whether the nodeconfig-daemon pods have been refreshed.
Example:
Patch nodeconfig-operator addon successfully
Unpause pkgi/nodeconfig-operator successfully
Will check nodeconfig addon status after patch
Pkgi reconcile succeeded
All nodeconfig pods are running
Nodeconfig daemon Request CPU reduced successfully
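To double-check the result independently of the script output, the CPU requests of the DaemonSets can be read back with kubectl. This is a sketch under assumptions: the function names are hypothetical, it assumes the nodeconfig-daemon pods are managed by a DaemonSet whose name contains "nodeconfig", and it treats an empty value, "0", or "0m" as no request, since this article does not state which of these the script leaves behind.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for confirming that the CPU request is gone.

# Print one line per DaemonSet: "namespace/name<TAB>cpu requests".
# Requires kubectl and a working kubeconfig.
list_ds_cpu_requests() {
  kubectl get daemonset -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.template.spec.containers[*].resources.requests.cpu}{"\n"}{end}'
}

# Read that listing on stdin and print the nodeconfig DaemonSets that
# still carry a non-zero CPU request. No output means the request is gone.
nodeconfig_with_cpu_request() {
  awk -F'\t' '$1 ~ /nodeconfig/ && $2 != "" && $2 != "0" && $2 != "0m" { print $1 ": " $2 }'
}
```

After sourcing the file, run `list_ds_cpu_requests | nodeconfig_with_cpu_request` on the control plane node. Empty output is consistent with the "Nodeconfig daemon Request CPU reduced successfully" message above.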
Impacts TCA 2.1.1.