vSAN -- Creating/Reverting Snapshots fails -- Fault Domain Error "Out of resources"
Article ID: 326856

Products

VMware vSAN

Issue/Introduction

When trying to create or revert a snapshot of a VM residing on a vSAN datastore, the operation fails with one of the following errors:

 

There are currently 1 usable fault domains. The operation requires 2 more usable fault domains.
An error occurred while taking a snapshot: Out of resources.

Or

No disks of required version is present in the cluster for this operation to succeed. An error occurred while taking a snapshot: Out of resources.


/var/log/clomd.log contains entries similar to:

 

2025-04-15T14:47:00.776Z No(29) clomd[2098465]: [Originator@6876 opID=1804296191] CLOMLogConfigurationPolicy: Object size 322122547200 bytes with policy: (("stripeWidth" i1) ("cacheReservation" i0) ("proportionalCapacity" (i0 i100)) ("hostFailuresToTolerate" i1) ("forceProvisioning" i0) ("spbmProfileId" "aa6d5a82-1c88-45da-85d3-############") ("spbmProfileGenerationNumber" l+0) ("objectVersion" i20) ("CSN" l4584) ("SCSN" l4581) ("spbmProfileName" "vSAN Default Storage Policy"))
2025-04-15T14:47:00.776Z No(29) clomd[2098465]: [Originator@6876 opID=1804296191] CLOMGetMinMaxObjVersion: minVersion: 20 maxVersion: 20
2025-04-15T14:47:00.777Z Cr(26) clomd[2098465]: [Originator@6876 opID=1804296191] CLOM_CheckClusterResourcesForPolicy: Not enough Upper FD's available. Available: 2, needed: 3
2025-04-15T14:47:00.777Z Cr(26) clomd[2098465]: [Originator@6876 opID=1804296191] CLOM_GenerateObjectConfig: Cluster doesn't have resources for the current iteration: objVersion: 20 replicas: 1, stripes: 1
2025-04-15T14:47:00.777Z Cr(26) clomd[2098465]: [Originator@6876 opID=1804296191] CLOMGenerateNewConfig: Failed to generate a configuration: Not found
2025-04-15T14:47:00.777Z Cr(26) clomd[2098465]: [Originator@6876 opID=1804296191] CLOM_Diagnose: No disks of required version is present in the cluster for this operation to succeed.
2025-04-15T14:47:00.777Z Cr(26) clomd[2098465]: [Originator@6876 opID=1804296191] CLOMProcessWorkItem: Failed to generate configuration: Underlying device has no free space
2025-04-15T14:47:00.777Z No(29) clomd[2098465]: [Originator@6876 opID=1804296191] CLOMProcessWorkItem: Op ends:1804296191

Environment

VMware vSAN (All Versions)

Cause

Not enough usable Fault Domains are currently present in the environment for component placement.
 
Example 1:
3-node Cluster with 3 Fault Domains (1 Host per FD).

One host has been placed into Maintenance Mode or has failed, and a snapshot creation is attempted:

"There are currently 2 usable fault domains. The operation requires 1 more usable fault domains."
 
Example 2:
4-node cluster with 4 Fault Domains (1 host per FD), running RAID-5.

One host has been placed into Maintenance Mode or has failed, and a snapshot creation is attempted:

"There are currently 3 usable fault domains. The operation requires 1 more usable fault domains."
 
Example 3:
24-node cluster with 3 Fault Domains (8 hosts per FD).

All hosts in one Fault Domain are shut down, and a snapshot creation is attempted:

"There are currently 2 usable fault domains. The operation requires 1 more usable fault domains."
 
Please refer to the vSAN Fault Domains documentation for further information on vSAN cluster sizing. For safety reasons, vSAN in its default configuration will not allow the provisioning of new objects if there are not enough resources to satisfy the applied storage policy.
 
This can be caused by a host still in vSAN decommission (Decom) state, or by a hardware failure reducing the number of usable fault domains.
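The sizing behind the examples above follows the standard vSAN placement rules: RAID-1 mirroring with Failures To Tolerate (FTT) of n needs 2n+1 usable fault domains (n+1 replicas plus n witnesses), RAID-5 needs 4, and RAID-6 needs 6. A minimal sketch (the helper function is illustrative, not a vSAN tool):

```shell
# Minimum usable fault domains required for common vSAN storage policies.
# RAID-1 with FTT=n: 2n+1 (n+1 replicas plus n witness components);
# RAID-5 (FTT=1): 4; RAID-6 (FTT=2): 6.
required_fds() {
  case "$1" in
    raid1) echo $(( 2 * $2 + 1 )) ;;  # $2 = failures to tolerate
    raid5) echo 4 ;;
    raid6) echo 6 ;;
  esac
}

echo "RAID-1 FTT=1 needs $(required_fds raid1 1) usable fault domains"
echo "RAID-5 FTT=1 needs $(required_fds raid5 1) usable fault domains"
```

This matches Example 1 (3-node RAID-1 cluster with one host down: 2 usable, 3 needed) and Example 2 (4-node RAID-5 cluster with one host down: 3 usable, 4 needed).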

Resolution


To resolve this, run the command below on an ESXi host in the cluster and make sure there are no hosts in a Decom state.

echo "hostname,decomState,decomJobType";for host in $(cmmds-tool find -t HOSTNAME -f json |grep -B2 Healthy|grep uuid|awk -F \" '{print $4}');do hostName=$(cmmds-tool find -t HOSTNAME -f json -u $host|grep content|awk -F \" '{print $6}');decomInfo=$(cmmds-tool find -t NODE_DECOM_STATE -f json -u $host |grep content|awk '{print $3 $5}'|sed 's/,$//');echo "$hostName,$decomInfo";done|sort

Sample output:

hostname,decomState,decomJobType
esxi-1.example.com,0,0
esxi-2.example.com,0,0
esxi-3.example.com,0,0

Anything other than 0 in the decomState column means there is a host in vSAN Decom state:

hostname,decomState,decomJobType
esxi-1.example.com,0,0
esxi-2.example.com,0,0
esxi-3.example.com,6,0 <---
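To spot affected hosts quickly, the command's CSV output can be piped through a small awk filter. This filter is a convenience sketch, not part of the KB command itself:

```shell
# Print only hosts whose decomState column is nonzero,
# i.e. hosts still in vSAN decommission state.
decom_filter() {
  awk -F, 'NR > 1 && $2 + 0 != 0 { print $1 " decomState=" $2 }'
}

# Example run against the sample output above:
printf 'hostname,decomState,decomJobType\nesxi-2.example.com,0,0\nesxi-3.example.com,6,0\n' \
  | decom_filter
# -> esxi-3.example.com decomState=6
```

An empty result means no host is stuck in Decom state.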

If you find a host in a Decom state, place the host into maintenance mode (with the 'No Action' option) and then remove it from maintenance mode, using the host UI or vCenter UI, to clear this state. See KB: vSAN Host Maintenance Mode is not in sync with vSAN Node Decommission State.

Workaround 

The workaround below is only a temporary measure while waiting for failed hardware to be replaced. The advanced setting /VSAN/ClomForceProvisionPlacements should not be left enabled permanently: while it is set, your data is not properly protected for redundancy, which could lead to data unavailability or even data loss.

For vSAN 6.7 onwards the following workaround is available: 

You can set the advanced config option on all Hosts to be able to create/revert snapshots until the Fault Domain issue has been resolved:
esxcfg-advcfg -s 1 /VSAN/ClomForceProvisionPlacements

Once the Fault Domain issue has been resolved, revert the setting back on all Hosts by:
esxcfg-advcfg -s 0 /VSAN/ClomForceProvisionPlacements

Please keep in mind that while this option is set, objects can be created even if they do not satisfy the requirements of the applied storage policy, so it is recommended to revert it as soon as the affected host is back online.
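Since the setting must be applied on every host in the cluster, it can help to generate the per-host commands in one place. The sketch below only prints the commands for review; the hostnames are hypothetical, and it assumes SSH access to the hosts:

```shell
# Hypothetical host list -- replace with your cluster's ESXi hosts.
HOSTS="esxi-1.example.com esxi-2.example.com esxi-3.example.com"

# Build the esxcfg-advcfg command: 1 enables force provisioning,
# 0 reverts it once the fault domain issue is resolved.
advcfg_cmd() {
  echo "esxcfg-advcfg -s $1 /VSAN/ClomForceProvisionPlacements"
}

# Print (not execute) the per-host SSH invocations for review:
for h in $HOSTS; do
  echo "ssh root@$h \"$(advcfg_cmd 1)\""
done
```

Rerun with `advcfg_cmd 0` to generate the commands that revert the setting on all hosts.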

Additional Information

See also: VMware vSAN Design Guide, "Designing for Capacity Maintenance and Availability" (p. 9).