Implementing VMware vSphere Metro Storage Cluster (vMSC) using Dell EMC metro node

Article ID: 339707


Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

This article provides information about deploying a vSphere Metro Storage Cluster across two data centers using Dell EMC metro node 7.0.1 and later. With vSphere 7.x, a storage virtualization device can be supported in a Metro Storage Cluster configuration.

 


Environment

VMware vSphere ESXi 7.0.x
VMware vCenter Server 7.0.x

Resolution

What is metro node?

The Dell EMC metro node platform enables automated business continuity with zero RPO and RTO. True active-active synchronous replication over metro distances with multi-site dual access gives organizations full confidence that their data is always available and accessible, with no recovery time.
Metro node provides greater flexibility through multi-platform support, workload granularity, and replication to any array. There is zero performance overhead, no duplicate capacity on the array, and no additional software required on the host. The VM witness technology can automatically initiate an instant site failover. Metro node supports local configurations for continuous application availability and data mobility to non-disruptively relocate workloads, and it enables storage technology refreshes without application downtime.
 

What is vMSC?

vSphere Metro Storage Cluster (vMSC) is a new configuration. A storage device configured in an MSC configuration is supported after vMSC certification equivalency approval from VMware. All supported storage devices are listed on the VMware Storage Compatibility Guide.
Metro node Witness
The metro node Witness enables the metro node solution to improve overall environment availability by arbitrating between a pure communication failure between the two primary sites and an actual site failure in a multi-site architecture.
For metro node 7.0.1 and later, the systems can rely on this component, known as the metro node Witness. The Witness is an optional component designed for customer environments where the regular preference rule sets are insufficient to provide seamless zero or near-zero RTO storage availability in the presence of site disasters and metro node cluster or inter-cluster failures.
 

Configuration Requirements

These requirements must be satisfied to support this configuration:

  • The maximum round-trip latency on both the IP network and the inter-cluster network between the two metro node clusters must not exceed 5 milliseconds for a non-uniform host access configuration and must not exceed 1 millisecond for a uniform host access configuration. The IP network supports the VMware ESXi hosts and the metro node Management Console; the interface between the two metro node clusters is IP. For metro node 7.0.1 and later with ESXi 7.0 and later, a non-uniform host access configuration is supported up to 10 milliseconds round-trip with NMP and PowerPath. For detailed supported configurations, see the latest metro node Dell EMC Simple Support Matrix (ESSM) on eLab Navigator. A quick latency spot-check from an ESXi host is sketched after this list.
  • For management and vMotion traffic, the ESXi hosts in both data centers must have a private network on the same IP subnet and broadcast domain. Preferably, management and vMotion traffic are on separate networks.
  • Any IP subnet used by a virtual machine must be accessible from the ESXi hosts in both data centers. This requirement is important so that clients accessing virtual machines running on ESXi hosts on either side continue to function smoothly after any VMware HA triggered virtual machine restart event.
  • The data storage locations, including the boot device used by the virtual machines, must be active and accessible from ESXi hosts in both data centers.
  • vCenter Server must be able to connect to ESXi hosts in both data centers.
  • The VMware datastores for the virtual machines running in the ESXi cluster must be provisioned on Distributed Virtual Volumes.
  • The maximum number of hosts in the HA cluster must not exceed 64 hosts for ESXi 7.0 and 96 hosts for ESXi 7.0U1.
  • The configuration option auto-resume must be set to true for metro node cross-connect consistency groups (a CLI sketch follows the notes below).
  • Fault Tolerance (FT) enabled on the VMs is supported (but not on the Cluster Witness Server).
  • This configuration is supported on metro node hardware for metro node OS 7.0.1 and later releases.
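
As a quick spot-check of the latency requirement above, the round-trip time between sites can be measured from an ESXi shell with vmkping. This is a minimal sketch; the vmkernel interface names and remote addresses are placeholders for your environment:

  # Measure round-trip time from this host's management vmkernel port
  # to a host at the other site (vmk1 and the address are placeholders).
  vmkping -I vmk1 -c 10 192.168.10.20

  # For vMotion traffic on a dedicated TCP/IP stack, ping through that stack.
  vmkping -I vmk2 -S vmotion -c 10 192.168.20.20

The reported average should stay within the 1 ms, 5 ms, or 10 ms budget that applies to your configuration.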


Notes:
The ESXi hosts forming the VMware HA cluster can be distributed across the two sites. The HA cluster can restart a virtual machine on a surviving ESXi host, and each ESXi host accesses the Distributed Virtual Volume through the storage paths at its own site.
Metro node 7.0.1 and later and ESXi 7.0 were tested in this configuration with the metro node Witness.
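
As a hedged illustration of the auto-resume requirement listed above, the attribute can be set from the metro node management CLI. This sketch assumes the metro node CLI retains the VPLEX-style context tree and the auto-resume-at-loser attribute; the cluster and consistency-group names are placeholders:

  # Navigate to the consistency group's advanced context
  # (cluster-1 and cg_vmsc are placeholder names).
  cd /clusters/cluster-1/consistency-groups/cg_vmsc/advanced
  ls                              # list the current advanced attributes
  set auto-resume-at-loser true   # resume I/O at the losing site automatically after recovery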

Solution Overview

A VMware HA/DRS cluster is created across the two sites using ESXi 7.0 hosts and managed by vCenter Server 7.0. The vSphere management, vMotion, and virtual machine networks are connected over redundant networks between the two sites. It is assumed that the vCenter Server managing the HA/DRS cluster can connect to the ESXi hosts at both sites.

This diagram provides an overview:

[Diagram: vMSC solution overview across the two sites]

 
Based on the host SAN connections to the metro node storage cluster, two different types of deployment are possible:

Non-uniform Host Access – In this type of deployment, the hosts at each site see the storage volumes only through the storage cluster at their own site.

This diagram provides an example:

[Diagram: non-uniform host access deployment]


Uniform Host Access (Cross-Connect) – This deployment involves establishing a front-end SAN across the two sites, so that the hosts at one site can see both the storage cluster at their own site and the storage cluster at the other site.

These best practices must be performed for this type of deployment:

  • The front-end zoning should be done in such a manner that each HBA port is zoned to either the local or the remote metro node cluster, but not both.
  • For Local and Metro (non-cross-connect): Dell EMC recommends PowerPath/VE, or the NMP VMW_PSP_RR (round-robin) policy with the I/O operation limit set to 1.
  • For Metro in cross-connect: Dell EMC strongly recommends PowerPath/VE, or the NMP VMW_PSP_RR (round-robin) policy with the I/O operation limit set to 1 (see the sketch after this list).
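
As a minimal sketch of the NMP recommendation above, the round-robin policy and the I/O operation limit can be applied per device with esxcli; the naa. device identifier below is a placeholder for the metro node distributed volume:

  # List NMP devices to identify the metro node distributed volume.
  esxcli storage nmp device list

  # Set the round-robin path selection policy on the device.
  esxcli storage nmp device set --device naa.600xxxxxxxxxxxxxxxxxxxxxxxxxxxxx --psp VMW_PSP_RR

  # Switch paths after every I/O (I/O operation limit of 1).
  esxcli storage nmp psp roundrobin deviceconfig set --device naa.600xxxxxxxxxxxxxxxxxxxxxxxxxxxxx --type iops --iops 1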


This diagram provides an example:

[Diagram: uniform host access (cross-connect) deployment]

A metro node Metro solution federated across the two data centers provides the distributed storage to the ESXi hosts. It is assumed that the ESXi boot disk is located on internal drives specific to each host and not on a Distributed Virtual Volume.

Ideally, each virtual machine runs at the preferred site of its Distributed Virtual Volume.

The tested scenarios, the corresponding metro node behavior, and the impact/observed VMware HA behavior are outlined below:

Scenario: Single metro node back-end (BE) path failure
Metro node behavior: Metro node continues to operate using an alternate path to the same BE array. Distributed Virtual Volumes exposed to the ESXi hosts are not affected.
Impact/observed VMware HA behavior: None.

Scenario: Single front-end (FE) path failure
Metro node behavior: The ESXi server is expected to use alternate paths to the Distributed Virtual Volumes.
Impact/observed VMware HA behavior: None.

Scenario: BE array failure at site-A
Metro node behavior: Metro node continues to operate using the array at site-B. When the array recovers from the failure, the storage volume at site-A is resynchronized from site-B automatically.
Impact/observed VMware HA behavior: None.

Scenario: BE array failure at site-B
Metro node behavior: Metro node continues to operate using the array at site-A. When the array recovers from the failure, the storage volume at site-B is resynchronized from site-A automatically.
Impact/observed VMware HA behavior: None.

Scenario: Metro node director failure
Metro node behavior: Metro node continues to provide access to the Distributed Virtual Volume through the other directors on the same metro node cluster.
Impact/observed VMware HA behavior: None.

Scenario: Complete site-A failure (the failure includes all ESXi hosts and the metro node cluster at site-A)
Metro node behavior: Metro node continues to serve I/O on the surviving site (site-B). When the metro node at the failed site (site-A) is restored, the Distributed Virtual Volumes are synchronized automatically from the active site (site-B).
Impact/observed VMware HA behavior: Virtual machines running at the failed site fail. VMware HA automatically restarts them on the surviving site. There is no downtime if FT is configured on the VMs.

Scenario: Complete site-B failure (the failure includes all ESXi hosts and the metro node cluster at site-B)
Metro node behavior: Metro node continues to serve I/O on the surviving site (site-A). When the metro node at site-B is restored, the Distributed Virtual Volumes are synchronized automatically from the active site (site-A).
Impact/observed VMware HA behavior: Virtual machines running at the failed site fail. VMware HA automatically restarts them on the surviving site. There is no downtime if FT is configured on the VMs.

Scenario: Multiple ESXi host failures – power off
Metro node behavior: None.
Impact/observed VMware HA behavior: VMware HA restarts the virtual machines on any of the surviving ESXi hosts within the VMware HA cluster.

Scenario: Multiple ESXi host failures – network disconnect
Metro node behavior: None.
Impact/observed VMware HA behavior: HA continues to exchange cluster heartbeats through the shared datastore. No virtual machine failovers occur.

Scenario: ESXi host experiences APD (All Paths Down), encountered when the ESXi host loses access to its storage volumes (in this case, metro node volumes)
Metro node behavior: None.
Impact/observed VMware HA behavior: In an APD scenario, the ESXi host must be rebooted to recover. When the ESXi server is restarted, VMware HA restarts the failed virtual machines on the other surviving ESXi servers within the VMware HA cluster.

Scenario: Metro node inter-site link failure; vSphere cluster management network intact
Metro node behavior: Metro node transitions Distributed Virtual Volumes on the non-preferred site to the I/O failure state. On the preferred site, the Distributed Virtual Volumes continue to provide access.
Impact/observed VMware HA behavior: Virtual machines running at the preferred site are not impacted. Virtual machines running at the non-preferred site experience I/O failure and show a PDL error; HA fails these virtual machines over to the other site. In a uniform host access configuration, the virtual machines run without any impact since the ESXi hosts can still access the distributed volume through the preferred site.

Scenario: Metro node cluster failure (the metro node at either site-A or site-B has failed, but ESXi and the other LAN/WAN/SAN components are intact)
Metro node behavior: I/O continues to be served on all the volumes on the surviving site.
Impact/observed VMware HA behavior: The ESXi hosts located at the failed site experience an APD condition and must be rebooted to recover from the failure. In a uniform host access configuration, the virtual machines run without any impact since the ESXi hosts can still access the distributed volume through the surviving metro node cluster.

Scenario: Complete dual-site failure
Metro node behavior: Upon restoration of the two sites, the metro node continues to serve I/O. The best practice is to bring up the BE storage arrays first, followed by the metro node.
Impact/observed VMware HA behavior: All virtual machines fail since both sites are down. The ESXi hosts should be brought up only after the metro node is fully recovered and the Distributed Virtual Volumes are synchronized. On powering on the ESXi hosts at each site, the virtual machines are restarted and resume normal operation. The same impact occurs in a uniform host access configuration since both sites are down.

Scenario: Director failure at one site (the preferred site for a given Distributed Virtual Volume) and BE array failure at the other site (the secondary site for that volume)
Metro node behavior: The surviving metro node directors within the metro node cluster with the failed director continue to provide access to the Distributed Virtual Volumes. Metro node continues to provide access to the Distributed Virtual Volumes using the preferred-site BE array.
Impact/observed VMware HA behavior: None.

Scenario: Metro node inter-site link intact; vSphere cluster management network failure
Metro node behavior: None.
Impact/observed VMware HA behavior: Virtual machines at each site continue running on their respective hosts since the HA cluster heartbeats are exchanged through the shared datastore.

Scenario: Metro node inter-site link failure; vSphere cluster management network failure
Metro node behavior: Metro node fails I/O on the non-preferred site for a given Distributed Virtual Volume. The Distributed Virtual Volume continues to be accessible on its preferred site.
Impact/observed VMware HA behavior: Virtual machines running at the preferred site continue to run. This is an HA split-brain situation: the non-preferred site assumes that the hosts of the preferred site are dead and tries to restart the powered-on virtual machines of the preferred site. Virtual machines running at the non-preferred site see their I/O fail, and the virtual machines fail; they can be registered and restarted on the preferred site. In a uniform host access configuration, the virtual machines run without any impact since the ESXi hosts can still access the distributed volume through the preferred site, and the HA heartbeats are exchanged through the datastore.

Scenario: Metro node storage volume is unavailable (for example, it is accidentally removed from the storage view, or the ESXi initiators are accidentally removed from the storage view)
Metro node behavior: Metro node continues to serve I/O on the other site where the volume is available.
Impact/observed VMware HA behavior: If I/O is running on the lost device, ESXi detects a PDL (Permanent Device Loss) condition. The virtual machine is killed by the virtual machine monitor and restarted by HA on the other site.

Scenario: Metro node inter-site WAN link failure and simultaneous Cluster Witness to site-B link failure
Metro node behavior: Metro node fails I/O on the Distributed Virtual Volumes at site-B and continues to serve I/O at site-A.
Impact/observed VMware HA behavior: It has been observed that the virtual machines at site-B fail; they can be restarted at site-A. In a uniform host access configuration, the virtual machines run without any impact since the ESXi hosts at site-B can still access the distributed volume through site-A.

Scenario: Metro node inter-site WAN link failure and simultaneous Cluster Witness to site-A link failure
Metro node behavior: Metro node fails I/O on the Distributed Virtual Volumes at site-A and continues to serve I/O at site-B.
Impact/observed VMware HA behavior: It has been observed that the virtual machines at site-A fail; they can be restarted at site-B. In a uniform host access configuration, the virtual machines run without any impact since the ESXi hosts at site-A can still access the distributed volume through site-B.

Scenario: Metro node Cluster Witness failure
Metro node behavior: Metro node continues to serve I/O at both sites.
Impact/observed VMware HA behavior: None.

Scenario: Metro node Management Server failure
Metro node behavior: None.
Impact/observed VMware HA behavior: None.

Scenario: vCenter Server failure
Metro node behavior: None.
Impact/observed VMware HA behavior: No impact to the running virtual machines or HA. However, the DRS rules and virtual machine placement are not in effect.
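
For the path and device failure scenarios above, the live path states and the active path selection policy for a distributed volume can be checked from the ESXi shell. A minimal sketch, assuming a placeholder naa. device identifier:

  # Show all paths to the device and their states (active, dead, standby).
  esxcli storage core path list --device naa.600xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  # Show the SATP/PSP configuration and the working paths for the same device.
  esxcli storage nmp device list --device naa.600xxxxxxxxxxxxxxxxxxxxxxxxxxxxx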

 

Update History

07/30/2021 – metro node 7.0.1 and ESXi 7.0 support