This article outlines considerations for a planned (non-accidental) disconnect of a site within a vSAN Stretched Cluster.
Example use cases include replacement (or upgrade) of network equipment, planned downtime for power delivery systems, etc.
Assumptions: only a single data site will be isolated from both the second data site and the Witness Appliance. The other data site will remain connected to the Witness Appliance.
vSAN Stretched Cluster
The approach to scheduled site downtime varies depending on requirements:
a) No downtime for Virtual Machines residing within the site in question:
- Make sure objects are not pinned to a single site, which is the effect of the 'Primary Failures to Tolerate = 0' and 'Keep data on Preferred/Secondary (stretched cluster)' availability rules in the Storage Policy.
- Check object health via Skyline Health before the activity and make sure that no objects report reduced availability.
- vMotion Virtual Machines to the site of the cluster that will stay online during the work.
- Consider powering off non-critical Virtual Machines if moving the entire workload could constrain the other site from a compute/memory perspective.
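The capacity trade-off in the last step can be estimated before any vMotion is started. The following is a minimal Python sketch, not a VMware tool: the `Vm` record, the `plan_evacuation` function, the 80% headroom default and all figures are hypothetical and only illustrate the reasoning. It assumes that per-VM CPU and memory demand and the free capacity of the surviving site have already been collected by other means.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    name: str
    cpu_mhz: int           # CPU demand or reservation, in MHz (assumed input)
    mem_mb: int            # memory demand or reservation, in MB (assumed input)
    critical: bool = True  # non-critical VMs may be powered off instead of moved


def plan_evacuation(vms_to_move, free_cpu_mhz, free_mem_mb, headroom=0.8):
    """Split VMs into (move, power_off) lists for the surviving site.

    Only `headroom` (a fraction) of the free capacity is treated as usable.
    Critical VMs are placed first; a non-critical VM that does not fit is
    listed for power-off. ValueError is raised if a critical VM does not fit.
    """
    cpu_budget = free_cpu_mhz * headroom
    mem_budget = free_mem_mb * headroom
    used_cpu = used_mem = 0
    move, power_off = [], []

    # sorted() is stable, so critical VMs come first in their original order.
    for vm in sorted(vms_to_move, key=lambda v: not v.critical):
        fits = (used_cpu + vm.cpu_mhz <= cpu_budget
                and used_mem + vm.mem_mb <= mem_budget)
        if fits:
            move.append(vm)
            used_cpu += vm.cpu_mhz
            used_mem += vm.mem_mb
        elif vm.critical:
            raise ValueError(f"critical VM {vm.name} does not fit on the surviving site")
        else:
            power_off.append(vm)
    return move, power_off


if __name__ == "__main__":
    # Hypothetical workload of the site that will be disconnected.
    workload = [
        Vm("db", 4000, 16384),
        Vm("web", 2000, 8192),
        Vm("test", 3000, 16384, critical=False),
    ]
    move, power_off = plan_evacuation(workload, free_cpu_mhz=10000, free_mem_mb=40960)
    print("vMotion:", [vm.name for vm in move])
    print("Power off:", [vm.name for vm in power_off])
```

With these example figures the sketch keeps `db` and `web` running on the surviving site and suggests powering off `test`. Real sizing should also account for HA admission control and any vSAN resynchronization overhead, which this sketch ignores.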
b) Planned downtime for Virtual Machines within a site:
- Check object health via Skyline Health before the activity and make sure that no objects report reduced availability.
- Power off Virtual Machines within the site.
NOTE: There is no need to place hosts within the site undergoing infrastructure downtime into Maintenance Mode, as long as:
- ESXi hosts within the site will remain connected to each other.
- Inter-site connectivity loss and reconvergence are expected to be 'clean', i.e. with no flapping during the transition.
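The object health pre-check listed under both approaches can also be done from the command line of an ESXi host with `esxcli vsan debug object health summary get`, as an alternative to Skyline Health. The Python sketch below is only an illustration of how such output could be screened. It assumes a two-column table of health state and object count; the sample text is made up for the example and was not captured from a real cluster, and the function names are hypothetical.

```python
# Illustrative sample only; real output may list more states or differ by version.
SAMPLE = """\
Health Status                                     Number Of Objects
------------------------------------------------  -----------------
reduced-availability-with-no-rebuild-delay-timer                  2
inaccessible                                                      0
healthy                                                         118
"""


def parse_health_summary(text):
    """Return {health_state: object_count} from the summary table text."""
    counts = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated field
        # Header and separator rows are skipped: their last field is not a number.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0].strip()] = int(parts[1])
    return counts


def objects_needing_attention(counts):
    """Return every state other than 'healthy' that has at least one object."""
    return {state: n for state, n in counts.items() if state != "healthy" and n > 0}


if __name__ == "__main__":
    problems = objects_needing_attention(parse_health_summary(SAMPLE))
    if problems:
        print("Do not proceed, objects need attention:", problems)
    else:
        print("All objects healthy.")
```

Treating every non-healthy state as a blocker is a deliberately conservative assumption of this sketch; some states may be acceptable depending on the Storage Policy in use.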