VMs report as inaccessible after a single fault in a vSAN environment
Example
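A 6-node cluster with 5 data nodes and 1 compute-only node experiences a single disk failure.
The object was laid out as FTT=0 because of forced provisioning, even though a RAID-6/FTT=2 policy is applied. RAID-6 needs at least six hosts contributing storage, and this cluster has only five data nodes, so the policy could not be satisfied at placement time.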
Object UUID: ccda0267-ec69-75cd-8d14-############:
Version: 14
Health: inaccessible - Lost data availability.(APD)
Owner: HOST
Size: 0.00 GB
Used: 0.00 GB
Used 4K Blocks: 0.00 GB
Policy: stripeWidth: 1
proportionalCapacity: [0, 100]
hostFailuresToTolerate: 2
forceProvisioning: 1
spbmProfileId: aa6d5a82-1c88-45da-85d3-############
spbmProfileGenerationNumber: 4
replicaPreference: Capacity
CSN: 4036
SCSN: 3186
spbmProfileName: vSAN Default Storage Policy
Configuration:
Concatenation
Component: 7aec8b68-f466-2dbc-cd21-############
Component State: ABSENT, Address Space(B): 273804165120 (255.00GB), Disk UUID: 5262d69c-060f-d321-bc17-############, Disk Name: N/A
Votes: 2, Host UUID: None
Component: d7284a68-e4d7-4cc0-d5b4-############
Component State: ACTIVE, Address Space(B): 48318382080 (45.00GB), Disk UUID: 526dd29e-2f19-7ff4-080a-############, Disk Name: naa.################:2
Votes: 1, Capacity Used(B): 12582912 (0.01GB), Physical Capacity Used(B): 4194304 (0.00GB), Total 4K Blocks Used(B): 4349952 (0.00GB), Host Name: host1.doman.com
VMware vSAN (all versions)
Per the vSAN Design Guide, page 23:
The Force provisioning policy allows vSAN to violate the NumberOfFailuresToTolerate (FTT),
NumberOfDiskStripesPerObject (SW) and FlashReadCacheReservation (FRCR) policy settings during the initial deployment of
a virtual machine.
vSAN will attempt to find a placement that meets all requirements. If it cannot, it will attempt a much simpler placement with
requirements reduced to FTT=0, SW=1, FRCR=0. This means vSAN will attempt to create an object with just a single mirror.
Any ObjectSpaceReservation (OSR) policy setting is still honored.
vSAN does not gracefully try to find a placement for an object that simply reduces the requirements that cannot be met. For
example, if an object asks for FTT=2, if that cannot be met, vSAN will not try FTT=1, but instead immediately tries FTT=0.
Similarly, if the requirement was FTT=1, SW=10, but vSAN does not have enough capacity devices to accommodate SW=10,
then it will fall back to FTT=0, SW=1, even though a policy of FTT=1, SW=1 may have succeeded.
Caution: Another special consideration relates to entering Maintenance Mode in full data migration mode, as well as
disk/disk group removal with data migration. If an object is currently non-compliant due to force provisioning (either because
initial placement or policy reconfiguration could not satisfy the policy requirements), then "Full data evacuation" of such an
object will actually behave like "Ensure Accessibility", i.e. the evacuation will allow the object to have reduced availability,
exposing it to a higher risk. This is an important consideration when using force provisioning, and only applies to non-compliant
objects.
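The fallback behaviour quoted above can be illustrated with a small Python model. This is only a sketch of the described decision logic, not vSAN code: the `Policy` class, the `place` and `effective_evacuation_mode` functions, and the `can_place` callback are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical model of the policy settings named in the design guide."""
    ftt: int    # NumberOfFailuresToTolerate
    sw: int     # NumberOfDiskStripesPerObject
    frcr: int   # FlashReadCacheReservation
    osr: int    # ObjectSpaceReservation, always honored
    force_provisioning: bool = False


def place(policy: Policy, can_place: Callable[[Policy], bool]) -> Optional[Policy]:
    """Return the policy actually used for placement, or None if placement fails.

    can_place is a stand-in for the cluster's placement check (an assumption).
    """
    if can_place(policy):
        return policy
    if not policy.force_provisioning:
        return None
    # No gradual relaxation: the fallback goes straight to FTT=0, SW=1, FRCR=0.
    # OSR is left untouched because it is still honored.
    fallback = replace(policy, ftt=0, sw=1, frcr=0)
    return fallback if can_place(fallback) else None


def effective_evacuation_mode(requested: str, noncompliant_forced: bool) -> str:
    """Full data evacuation of a force-provisioned, non-compliant object
    behaves like Ensure Accessibility."""
    if requested == "full_data_evacuation" and noncompliant_forced:
        return "ensure_accessibility"
    return requested


if __name__ == "__main__":
    # Hypothetical cluster: FTT=1 is possible, but only up to 4 stripes.
    def limited(p: Policy) -> bool:
        return p.ftt <= 1 and p.sw <= 4

    asked = Policy(ftt=1, sw=10, frcr=0, osr=0, force_provisioning=True)
    print(place(asked, limited))
```

In this model a request for FTT=1, SW=10 ends up at FTT=0, SW=1 even though FTT=1, SW=1 would have passed the check, which mirrors the second example in the design guide text.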
Open a case with VMware Support to investigate any inaccessible objects.
To prevent a recurrence, disable forced provisioning and/or choose a policy that is compatible with your environment.
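As a rough aid for checking policy compatibility, the sketch below compares the number of hosts contributing storage against the commonly documented minimums: 2n+1 hosts for RAID-1 mirroring with FTT=n, 4 hosts for RAID-5 (FTT=1), and 6 hosts for RAID-6 (FTT=2). These figures are assumed to apply to the original storage architecture; confirm them against the documentation for your vSAN version. The function names are hypothetical.

```python
def min_hosts_required(ftt: int, erasure_coding: bool) -> int:
    """Minimum hosts contributing storage for a given FTT and layout.

    Assumed figures (verify for your vSAN version):
      RAID-1 mirroring: 2 * FTT + 1
      RAID-5 (FTT=1):   4
      RAID-6 (FTT=2):   6
    """
    if ftt < 0 or ftt > 3:
        raise ValueError("FTT must be between 0 and 3")
    if ftt == 0:
        return 1
    if erasure_coding:
        if ftt == 1:
            return 4
        if ftt == 2:
            return 6
        raise ValueError("erasure coding supports only FTT=1 or FTT=2")
    return 2 * ftt + 1


def policy_fits(data_hosts: int, ftt: int, erasure_coding: bool) -> bool:
    """True if the cluster has enough data hosts for the policy.

    Compute-only hosts contribute no storage and must not be counted.
    """
    return data_hosts >= min_hosts_required(ftt, erasure_coding)


if __name__ == "__main__":
    # The cluster from the example: 5 data nodes, RAID-6 / FTT=2 policy.
    print(policy_fits(data_hosts=5, ftt=2, erasure_coding=True))
```

For the cluster in the example, the check fails for RAID-6/FTT=2 because only five of the six nodes hold data, whereas RAID-5/FTT=1 or RAID-1/FTT=2 would fit on five data nodes under the same assumptions.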