Understanding vSAN Stretched Cluster Failure Scenarios

Article ID: 394978

Products

VMware vSAN

Issue/Introduction

When operating a vSAN stretched cluster environment, administrators need to understand how different failure scenarios impact virtual machine availability and data accessibility. This article provides detailed failure scenario tables showing the expected behavior for various failure types including host failures, site failures, witness failures, partition failures, and inter-site link (ISL) failures.

Administrators may observe:

  • Virtual machines becoming inaccessible during certain failure conditions
  • Different behaviors based on the Site Disaster Tolerance policy configuration
  • Varying impacts depending on whether Secondary Failures to Tolerate (FTT) is configured
  • Questions about data availability when multiple failures occur

Environment

VMware vSAN 8.X

VMware vSAN 7.X

Cause

The behavior during failure scenarios in vSAN stretched clusters is determined by the interaction between the Site Disaster Tolerance policy setting and the Secondary Failures to Tolerate (FTT) configuration. When failures occur, vSAN uses a voting mechanism to determine object availability based on component distribution across sites and the witness host. The specific combination of policy settings directly influences whether objects remain accessible and whether virtual machines can continue running or must be restarted during a given failure condition.
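
To illustrate the voting concept, the following is a minimal, simplified sketch (in Python) of majority-based quorum. It is not vSAN's actual implementation, and the component layout and vote counts shown are assumptions for demonstration only; vSAN assigns and manages votes internally.

```python
# Simplified illustration of majority-vote quorum (not vSAN's actual implementation).
# Vote counts below are assumed for demonstration; vSAN assigns and manages votes internally.

def object_accessible(votes, reachable):
    """An object remains accessible only while the reachable components
    hold a strict majority (more than 50%) of the total votes."""
    total = sum(votes.values())
    held = sum(v for location, v in votes.items() if reachable.get(location, False))
    return held * 2 > total

# Example layout: a site-mirrored object with one replica per data site plus a witness component.
votes = {"site_a": 1, "site_b": 1, "witness": 1}

# Full failure of Site A: Site B and the witness still hold 2 of 3 votes -> object stays accessible.
print(object_accessible(votes, {"site_a": False, "site_b": True, "witness": True}))   # True

# Full failure of Site A plus simultaneous witness failure: only 1 of 3 votes -> object inaccessible.
print(object_accessible(votes, {"site_a": False, "site_b": True, "witness": False}))  # False
```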

Resolution

Understanding the expected behavior for each failure scenario helps in planning disaster recovery strategies and setting appropriate storage policies. The following tables detail the behavior for each failure type based on policy configuration.

Host Failure Scenarios

| Site Disaster Tolerance | Secondary FTT | VM Location | Failure | vSAN Behavior | VM Behavior |
|---|---|---|---|---|---|
| None - Preferred | No data redundancy | Site A or B | Host failure in Site A | Objects are inaccessible if the failed host contains one or more components of an object | VM cannot be restarted as the object is inaccessible |
| None - Preferred | RAID-1/5/6 | Site A or B | Host failure in Site A | Objects are accessible as there is site-local resilience | VM does not need to be restarted unless the VM was running on the failed host |
| Site Mirroring | No data redundancy | Site A or B | Host failure in Site A or B | Components on the failed host are inaccessible; read and write IO goes across the ISL without local redundancy, and the rebuild occurs across the ISL | VM does not need to be restarted unless the VM was running on the failed host |
| Site Mirroring | RAID-1/5/6 | Site A or B | Host failure in Site A or B | Components on the failed host are inaccessible; read IO is served locally due to RAID, and the rebuild occurs locally | VM does not need to be restarted unless the VM was running on the failed host |

Partition Failure Scenarios

| Site Disaster Tolerance | Secondary FTT | VM Location | Failure | vSAN Behavior | VM Behavior |
|---|---|---|---|---|---|
| None - Preferred | No data redundancy | Site B | Partition of Site B | Objects are accessible in Site B | VM resides in Site B and does not need to be restarted |
| Site Mirroring | No data redundancy | Site A | Partition of Site A | Objects are inaccessible in Site A as the full site is partitioned and quorum is lost | VM restarted in Site B |
| Site Mirroring | No data redundancy | Site B | Partition of Site A | Objects are inaccessible in Site A as the full site is partitioned and quorum is lost | VM does not need to be restarted as it resides in Site B |

Site Failure Scenarios

| Site Disaster Tolerance | Secondary FTT | VM Location | Failure | vSAN Behavior | VM Behavior |
|---|---|---|---|---|---|
| None - Preferred | No data redundancy | Site A | Full failure of Site A | Objects are inaccessible as the full site failed | VM cannot be restarted in Site B, as all objects reside in Site A |
| None - Preferred | No data redundancy | Site B | Full failure of Site B | Objects are accessible, as only Site A contains objects | VM can be restarted in Site A, as that is where all objects reside |
| Site Mirroring | No data redundancy | Site A | Full failure of Site A | Objects are inaccessible in Site A as the full site failed | VM restarted in Site B |
| Site Mirroring | No data redundancy | Site B | Full failure of Site A | Objects are inaccessible in Site A as the full site failed | VM does not need to be restarted as it resides in Site B |
| Site Mirroring | No data redundancy | Site A | Full failure of Site A and simultaneous host failure in Site B | Objects are inaccessible in Site A. If components reside on the failed host, the object is also inaccessible in Site B | VM cannot be restarted |
| Site Mirroring | No data redundancy | Site A | Full failure of Site A and simultaneous host failure in Site B | Objects are inaccessible in Site A. If components do not reside on the failed host, the object is accessible in Site B | VM restarted in Site B |
| Site Mirroring | RAID-1/5/6 | Site A | Full failure of Site A and simultaneous host failure in Site B | Objects are inaccessible in Site A but accessible in Site B, as there is site-local resiliency | VM restarted in Site B |

Witness Failure Scenarios

| Site Disaster Tolerance | Secondary FTT | VM Location | Failure | vSAN Behavior | VM Behavior |
|---|---|---|---|---|---|
| None - Preferred | No data redundancy | Site A | Witness host failure | No impact; the witness host is not used as data is not replicated | No impact |
| None - Non-Preferred | No data redundancy | Site B | Witness host failure | No impact; the witness host is not used as data is not replicated | No impact |
| Site Mirroring | No data redundancy | Site A | Witness host failure | Witness object inaccessible; VM remains accessible | VM does not need to be restarted |
| Site Mirroring | No data redundancy | Site B | Witness host failure | Witness object inaccessible; VM remains accessible | VM does not need to be restarted |
| Site Mirroring | No data redundancy | Site A | Full failure of Site A and simultaneous witness host failure | Objects are inaccessible in Site A and Site B due to quorum being lost | VM cannot be restarted |
| Site Mirroring | No data redundancy | Site A | Full failure of Site A followed by witness host failure a few minutes later | Prior to vSAN 7.0 U3: Objects are inaccessible in Site A and Site B due to quorum being lost | VM cannot be restarted |
| Site Mirroring | No data redundancy | Site A | Full failure of Site A followed by witness host failure a few minutes later | vSAN 7.0 U3 and later: Objects are inaccessible in Site A but accessible in Site B, as votes have been recounted | VM restarted in Site B |
| Site Mirroring | No data redundancy | Site B | Full failure of Site B followed by witness host failure a few minutes later | vSAN 7.0 U3 and later: Objects are inaccessible in Site B but accessible in Site A, as votes have been recounted | VM restarted in Site A |

Inter-Site Link (ISL) Failure Scenarios

| Site Disaster Tolerance | Secondary FTT | VM Location | Failure | vSAN Behavior | VM Behavior |
|---|---|---|---|---|---|
| Site Mirroring | No data redundancy | Site A | Network failure between Site A and Site B (ISL down) | Site A binds with the witness, and objects in Site B become inaccessible | VM does not need to be restarted |
| Site Mirroring | No data redundancy | Site B | Network failure between Site A and Site B (ISL down) | Site A binds with the witness, and objects in Site B become inaccessible | VM restarted in Site A |

Adaptive Quorum Control

vSAN 7.0 U3 introduced Adaptive Quorum Control (AQC) to improve data availability during specific failure conditions. This feature maintains the availability of objects when a site failure (or planned maintenance) is followed by subsequent unavailability of the witness host.

In a fully operational stretched cluster, quorum is determined through a voting mechanism that accounts for object components in both sites and the witness host appliance. When a data site experiences a planned or unplanned outage, vSAN adjusts the votes to favor the active site that still has quorum. This adjustment allows sufficient votes to maintain quorum and keeps data available during a planned or unplanned outage of the witness host appliance.

The vote adjustment process may take a few seconds to a few minutes depending on cluster size. As each object completes adjustment, that object can tolerate witness host failure while maintaining availability. This capability does not protect against simultaneous failure of a data site and witness.
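
As a rough illustration of the recount, the sketch below extends the same simplified voting model: after a data-site outage, re-weighting the votes toward the surviving site lets it keep quorum even if the witness host subsequently fails. The adjusted vote values are assumptions for demonstration, not the values vSAN actually assigns.

```python
# Simplified illustration of an AQC-style vote recount (not vSAN internals).
# Vote values are assumptions for demonstration only.

def accessible(votes, reachable):
    total = sum(votes.values())
    return sum(v for loc, v in votes.items() if reachable[loc]) * 2 > total

initial = {"site_a": 1, "site_b": 1, "witness": 1}

# Step 1: Site A fails. Site B plus the witness hold 2 of 3 votes -> object still accessible.
after_site_failure = {"site_a": False, "site_b": True, "witness": True}
print(accessible(initial, after_site_failure))                           # True

# Step 2 without a recount: the witness also fails -> 1 of 3 votes, quorum lost.
print(accessible(initial, {**after_site_failure, "witness": False}))     # False

# Step 2 after a recount that favours the surviving data site:
adjusted = {"site_a": 1, "site_b": 3, "witness": 1}
print(accessible(adjusted, {**after_site_failure, "witness": False}))    # True: Site B alone holds quorum
```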

Recovery from Complex Failures

In a double-site failure, where one data site fails at the same time as the witness site, data and VMs become unavailable because quorum cannot be achieved. This protection mechanism prevents the same data from being updated independently in two different locations.

There may be a chance to recover the data on the single remaining site when it is known that the other data site and the witness site are not coming back. For all versions up to and including vSAN 8 U3 (VCF 5.2), this involves contacting Global Support (GS) to determine the viability of a potential recovery. Note that this is a best-effort process and does not guarantee the integrity of data inside guest VMs when recovering from stale components.

Additional Information

For vSAN stretched clusters, avoid using a storage policy with locality=none. With locality=none, the components of the same replica can be spread across both data sites in the cluster. This can result in:

  • Undesired issues during object reconfiguration tasks, such as storage policy changes
  • Issues when placing a host into maintenance mode with the Ensure Accessibility option
  • Objects potentially becoming inaccessible during planned maintenance
  • Read locality not being guaranteed, as reads may traverse the Inter-Site Link (ISL) between data sites and incur additional latency

When a storage policy has Site disaster tolerance set to one of the options below and Failures to tolerate set to RAID-1/5/6, writes are limited to the site to which the locality is set:

  • Dual site mirroring (stretched cluster)
  • None - keep data on Preferred (stretched cluster)
  • None - keep data on Secondary (stretched cluster)

The issue is specific to stretched cluster storage policies where Site disaster tolerance is set to either "None - standard cluster" or "None - stretched cluster" and Failures to tolerate is set to RAID-1/5/6.
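
As a quick way to reason about this guidance, the following is a hypothetical sketch that checks a policy description against the affected combination described above. The field names and values are illustrative assumptions only; they do not reflect an actual vSAN or SPBM API, and policies should be verified in vCenter.

```python
# Hypothetical helper that flags the policy combination described above.
# Field names and values are illustrative only; they do not reflect an actual vSAN/SPBM API.

AFFECTED_TOLERANCE = {"None - standard cluster", "None - stretched cluster"}
RAID_LEVELS = {"RAID-1", "RAID-5", "RAID-6"}

def flag_policy(policy):
    """Return a warning string if the policy matches the affected combination."""
    if (policy["site_disaster_tolerance"] in AFFECTED_TOLERANCE
            and policy["failures_to_tolerate"] in RAID_LEVELS):
        return (f"Policy '{policy['name']}': components of the same replica may spread "
                f"across both data sites (read locality is not guaranteed).")
    return None

policies = [
    {"name": "stretched-mirrored", "site_disaster_tolerance": "Dual site mirroring (stretched cluster)", "failures_to_tolerate": "RAID-1"},
    {"name": "no-site-tolerance",  "site_disaster_tolerance": "None - stretched cluster",                "failures_to_tolerate": "RAID-5"},
]

for policy in policies:
    warning = flag_policy(policy)
    if warning:
        print(warning)   # flags only 'no-site-tolerance'
```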

For more details, see: