When deploying VMware Cloud Foundation (VCF), the bring-up process may fail during the Configure the vSphere cluster stage. Specifically, the task Enable the default vSAN Storage Policies fails with an "Invalid virtual machine configuration" error.
VCF Installer displays errors such as Failed to apply default vSAN policy.
In /var/log/vmware/vcf/domainmanager/domainmanager.log, you see entries indicating vob.vsan.clomd.needMoreDisks2.
YYYY-MM-DDTHH:MM:SS+0000 ERROR [vcf_dm,697f932a56f3###########696315cf,f1dc] [c.v.e.s.c.c.v.vsphere.VcManagerBase,dm-exec-25] Task information for future track{"key":"task-###","task":{"_type":"Task","_value":"task-###","_serverGuid":"UUID_of_Virtual_Machine"},"description":{"key":"com.vmware.vim.vpxd.vpx.vmprov.ReconfigureVm","message":"Reconfiguring Virtual Machine on destination host"},"name":{"_wsdlName":"ReconfigVM_Task"},"descriptionId":"VirtualMachine.reconfigure","entity":{"_type":"VirtualMachine","_value":"vm-20","_serverGuid":"UUID_of_Virtual_Machine"},"entityName":"FQDN_OF_VCENTER","state":"error","cancelled":false,"cancelable":false,"error":{"property":"config.vmProfile","_msg":"Invalid virtual machine configuration.","_faultMsg":[{"key":"vob.vsan.clomd.needMoreDisks2","arg":[{"key":"1","value":"0"},{"key":"2","value":"1"},{"key":"3","value":"12"},{"key":"4","value":"0"},{"key":"5","value":"0"},{"key":"6","value":"0"},{"key":"7","value":"0"},{"key":"8","value":"12"},{"key":"9","value":"0"}],"message":"There are currently 0 usable disks for the operation. This operation requires 1 more usable disks. \nRemaining ## disks unusable because: \n 0 - Insufficient space for data/cache reservation. \n 0 - Maintenance mode or unhealthy disks. \n 0 - Disk-version or storage-type mismatch. \n 0 - Max component count reached. \n ## - In unusable fault-domains due to policy constraints.
The log states that the remaining disks are unusable because they are "In unusable fault-domains due to policy constraints". The same information appears in the vpxd log on vCenter Server (/var/log/vmware/vpxd/vpxd.log):
--> message = "There are currently 0 usable disks for the operation. This operation requires 1 more usable disks.
--> Remaining ## disks unusable because:
--> 0 - Insufficient space for data/cache reservation.
--> 0 - Maintenance mode or unhealthy disks.
--> 0 - Disk-version or storage-type mismatch.
--> 0 - Max component count reached.
--> ## - In unusable fault-domains due to policy constraints.
--> 0 - In witness node."
--> }
--> ],
--> property = "config.vmProfile",
--> msg = "Invalid virtual machine configuration."
--> }
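To check quickly whether a log contains this fault, a small helper like the following may be useful. It is a minimal sketch: the function name has_need_more_disks_fault is hypothetical, and it only looks for the two strings quoted in the excerpts above. A match indicates the symptom, not the root cause.

```shell
# Returns 0 if the given log file contains the vSAN needMoreDisks2 fault
# together with the fault-domain reason quoted in the excerpts above.
# The function name is hypothetical, chosen for illustration only.
has_need_more_disks_fault() {
    log_file="$1"
    grep -q 'vob.vsan.clomd.needMoreDisks2' "$log_file" &&
        grep -q 'In unusable fault-domains due to policy constraints' "$log_file"
}

# Example usage (log path taken from this article):
# has_need_more_disks_fault /var/log/vmware/vcf/domainmanager/domainmanager.log \
#     && echo "needMoreDisks2 fault found"
```

The same helper can be pointed at /var/log/vmware/vpxd/vpxd.log on vCenter Server.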
Hosts appear healthy individually but are unable to communicate over the vSAN network.
VMware Cloud Foundation 9.x
This issue is caused by a Network Partition in the vSAN cluster.
The ESXi hosts are unable to communicate over the vSAN VMkernel interfaces, typically due to physical switch configuration issues, incorrect VLAN tagging, or MTU mismatches.
Because the hosts cannot "see" each other, vSAN cannot satisfy the storage policy requirements that mandate data redundancy across multiple nodes.
To resolve this issue, fix the underlying network connectivity between the ESXi hosts:
localcli vsan cluster get
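When the cluster is partitioned, the member count reported by this command on a host is lower than the number of hosts in the cluster. The sketch below extracts that count and compares it with the expected number. It assumes the output contains a "Sub-Cluster Member Count" field, as it typically does. The function names and the EXPECTED_HOSTS variable are hypothetical.

```shell
# Reads `localcli vsan cluster get` output on stdin and prints the value of
# the "Sub-Cluster Member Count" field (field name assumed from typical output).
vsan_member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ {print $2; exit}'
}

# Prints a verdict given the expected host count and the observed member count.
report_membership() {
    expected="$1"
    observed="$2"
    if [ "$observed" -lt "$expected" ]; then
        echo "PARTITIONED: host sees $observed of $expected members"
    else
        echo "OK: host sees $observed of $expected members"
    fi
}

# Example usage on an ESXi host (EXPECTED_HOSTS is a value you set yourself):
# EXPECTED_HOSTS=4
# report_membership "$EXPECTED_HOSTS" "$(localcli vsan cluster get | vsan_member_count)"
```

Run the check on every host. Hosts that report different member counts, or a count of 1, are isolated from the rest of the vSAN network.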
vsanvmk=$(localcli vsan network list | grep -A 1 "vsan" -B 12 | grep -ie "vmknic name:" | tail -1 | awk '{print $3}' |tail +1); vmkmtu=$(localcli network ip interface list |grep "$vsanvmk" -A 13 |tail -1 |awk '{print $2}') ; mxmtu=$(expr $vmkmtu - 28); switchport=$(net-stats -l | grep "$vsanvmk" | awk '{print $1}') ; vswitch_name=$(net-stats -l |grep "$vsanvmk" | awk '{print $4}'); vmnic=$(vsish -e get /net/portsets/"$vswitch_name"/ports/"$switchport"/teamUplink); localcli vsan cluster unicastagent list |tail +3 | awk '{print $4}' |while read unicip; do echo The vSAN vmknic is $vsanvmk on $vswitch_name $vmnic with MTU configured as $vmkmtu vSAN neighbor IP is $unicip; vmkping -I $vsanvmk -s $mxmtu -d $unicip | tail -3 ;done
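The one-liner above pings each vSAN neighbor with a payload of the vmknic MTU minus 28 bytes and with fragmentation disabled (-d). The 28 bytes account for the 20-byte IPv4 header and the 8-byte ICMP header. The sketch below shows that arithmetic on its own. The function name max_icmp_payload is hypothetical, and the interface name and IP address in the usage comment are placeholders.

```shell
# Largest ICMP payload that fits in one unfragmented frame:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
max_icmp_payload() {
    mtu="$1"
    echo $((mtu - 28))
}

# Example usage on an ESXi host (vmk2 and 192.0.2.11 are placeholders):
# vmkping -I vmk2 -s "$(max_icmp_payload 9000)" -d 192.0.2.11
```

If the full-size ping fails but a ping with a smaller payload (for example 1472 bytes, the maximum for a 1500-byte MTU) succeeds, an MTU mismatch somewhere along the path is the likely cause. If both fail, check VLAN tagging and the physical switch configuration.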