In a vSphere Supervisor environment, when attempting to create or scale up nodes in a workload cluster, the following error message is reported:
Insufficient configured resources to satisfy the desired vSphere HA failover level on the cluster
There are sufficient resources available in the environment. CPU and Memory Utilization are not close to the maximum capacity.
When viewing the vSphere cluster object hosting the workload clusters, the Resource Allocation shows that CPU and/or Memory is close to or at maximum Reservation:
When viewing the existing running nodes in the Supervisor cluster context, there are a large number of nodes on a guaranteed vmclass:
The vmclass below is an example and may vary based on your environment.
kubectl get vm -o wide -A
NAMESPACE     NAME       POWER-STATE   CLASS
<namespace>   <node-a>   poweredOn     guaranteed-2xlarge
<namespace>   <node-b>   poweredOn     guaranteed-2xlarge
<namespace>   <node-c>   poweredOn     guaranteed-2xlarge
<namespace>   <node-d>   poweredOn     guaranteed-2xlarge
<namespace>   <node-e>   poweredOn     guaranteed-2xlarge
<namespace>   <node-f>   poweredOn     guaranteed-2xlarge
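When many nodes are listed, it can help to tally them by vmclass. The following is a minimal sketch, not an official tool: `count_vms_by_class` is a hypothetical helper name, and it assumes the output has a `CLASS` header column and that no column is empty (an empty column would shift the whitespace-separated fields).

```shell
# Hypothetical helper: count VMs per vmclass from tabular
# "kubectl get vm -o wide -A" output supplied on stdin.
# Assumes the first line is a header that contains a CLASS column.
count_vms_by_class() {
  awk '
    # Find the position of the CLASS column in the header line.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "CLASS") col = i; next }
    # Count each data row under its class name.
    col && NF >= col { count[$col]++ }
    END { for (c in count) printf "%s %d\n", c, count[c] }
  ' | sort
}

# Usage, from the Supervisor cluster context:
#   kubectl get vm -o wide -A | count_vms_by_class
```

Each output line is a vmclass name followed by the number of VMs using it, which makes it easier to see how many nodes sit on a guaranteed class.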
The assigned guaranteed vmclass can also be reserving a large amount of resources:
The vmclass below is an example and may vary based on your environment.
kubectl get vmclass
NAME                 CPU   MEMORY
guaranteed-2xlarge   8     64Gi
vSphere Supervisor
This issue can occur whether or not the cluster is managed by Tanzu Mission Control (TMC).
Although there is enough CPU and Memory available at the vSphere cluster level, the reservation shown under Resource Allocation for CPU and/or Memory is close to or at the maximum.
A Reservation Details bar that is close to or at maximum indicates that the resources, even if not currently in use, have been reserved and cannot be allocated to new nodes.
Guaranteed vmclasses ensure that the given resources will always be available for the nodes. However, those resources will be set as reserved and unavailable for any new nodes.
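As a rough illustration of how quickly reservations add up, the sketch below multiplies node count by the per-node values of the example class listed earlier (8 CPU, 64Gi) for the six example nodes. `estimate_reserved` is a hypothetical helper, and the calculation assumes the guaranteed class reserves its full configured CPU and memory. vSphere reports CPU reservations in MHz, so the vCPU total is only a proxy.

```shell
# Hypothetical helper: estimate what a set of guaranteed-class nodes reserves.
# Arguments: node count, CPUs per node, memory per node in Gi.
# Assumption: the class reserves 100% of its configured CPU and memory.
estimate_reserved() {
  nodes=$1
  cpus=$2
  mem_gi=$3
  echo "$((nodes * cpus)) vCPU, $((nodes * mem_gi))Gi memory"
}

# Six nodes on the example 8 CPU / 64Gi class.
estimate_reserved 6 8 64
# prints: 48 vCPU, 384Gi memory
```

Comparing a figure like this against the cluster's total capacity, less whatever vSphere HA holds back for failover, gives a sense of why new nodes are rejected even when utilization is low.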
See the below vmclass documentation for details:
Ultimately, this is a resource reservation issue and must be resolved based on the needs of your environment.
See below for a few suggestions to reduce the amount of reserved resources:
Note: A change to a vmclass for a node-pool will trigger a rolling redeployment across all nodes in that node-pool. This includes changing the vmclass assigned to control plane nodes in a workload cluster.