Deploying a Metro Storage Cluster across two sites using HUAWEI OceanStor Dorado All-Flash Storage HyperMetro and VMware vSphere

Article ID: 312174

Products

VMware vSphere ESXi

Issue/Introduction

This article describes the uniform and non-uniform topology for using Huawei HyperMetro of OceanStor Dorado All-Flash Storage with the VMware vSphere HA cluster, and provides information about deploying a Metro Storage Cluster across two sites using Huawei HyperMetro and VMware vSphere.



Environment

VMware vSphere ESXi 5.0
VMware ESXi 4.0.x Embedded
VMware vSphere ESXi 6.5
VMware vSphere ESXi 6.0
VMware ESXi 4.1.x Embedded
VMware vSphere ESXi 6.7
VMware ESXi 3.5.x Installable
VMware vSphere ESXi 5.1
VMware ESXi 3.5.x Embedded
VMware ESXi 4.1.x Installable
VMware ESXi 4.0.x Installable
VMware vSphere ESXi 5.5
VMware vSphere ESXi 7.0.0
VMware vSphere ESXi 7.0.x

Resolution

What is HUAWEI OceanStor Dorado All-Flash Storage?

 

Huawei OceanStor Dorado All-Flash Storage is designed to carry the mission-critical services of enterprises. The SmartMatrix full-mesh architecture ensures hardware redundancy and fast service switchover without interruption in the event of a fault.

The fully symmetric active-active architecture balances service loads on the entire storage system, simplifying service planning and scaling.

FlashLink®, designed for all-flash storage, guarantees consistent low latency.

The gateway-free HyperMetro feature provides an end-to-end active-active data center solution, which can be smoothly upgraded to a geo-redundant disaster recovery (DR) solution to achieve 99.9999% solution-level reliability (99.99999% for high-end products).

Variable-length deduplication and compression maximize the available capacity and reduce the operating expense (OPEX).

Huawei OceanStor Dorado All-Flash Storage meets the requirements of enterprise applications such as databases, virtual desktop infrastructure (VDI), virtual server infrastructure (VSI), and file sharing, helping the financial, manufacturing, and carrier industries transition smoothly to all-flash storage.

HyperMetro delivers the active-active service capabilities based on two storage arrays. Data on the active-active LUNs at both ends is synchronized in real time, and both ends process read and write I/Os from application servers to provide the servers with non-differentiated parallel active-active access.

What is HyperMetro?

HyperMetro is Huawei's active-active storage solution. It enables two storage systems to process services simultaneously and to act as mutual backups. If one storage system malfunctions, the other automatically takes over services without data loss or interruption. Because service switchover between the storage systems is automatic, HyperMetro improves reliability, service continuity, and storage resource utilization. HyperMetro supports both single-data center (DC) and cross-DC deployment modes.

Dual-write ensures data redundancy on the storage systems. In the event of a storage system or DC failure, services are switched over to the other DC at zero RTO and RPO, ensuring service continuity without any data loss.

Minimum Requirements:

These are the minimum system requirements for a vMSC solution with HyperMetro:
  • Huawei OceanStor Dorado All-Flash Storage 6.0 or newer, Huawei OceanStor Dorado All-Flash Storage V700 or newer, or Huawei OceanStor Dorado V3 V300R001 or newer.
  • Huawei OceanStor Dorado connected to ESXi via FC, FC-NVMe, NVMe over RoCE, iSCSI, or NFS.
  • ESXi 5.1 or later.
  • Huawei UltraPath, VMware HPP, or VMware NMP.
  • FC-NVMe and NVMe over RoCE are supported on vSphere 7.0 or later.

Solution Overview:

The vMSC solution supports both uniform access (cross-connected networking) and non-uniform access (parallel networking), depending on how the hosts access the storage systems. This section describes the differences between the two access methods to help customers choose the one that best suits their network.

Uniform Access (recommended)

Uniform access means that ESXi hosts at each site are connected to both the local and remote storage systems. Any ESXi host has the paths to access both the local and remote storage systems at the same time.

Figure 1 Uniform access network

In the preceding figure, host A at site 1 and host B at site 2 have links to both Dorado A and Dorado B, and there are replication links between Dorado A and Dorado B. If site 1 and site 2 are deployed in the same equipment room, the hosts at both sites reach the storage systems at a similar latency. If site 1 and site 2 are in different equipment rooms in the same metropolitan area, the latency for a host to access its local storage system is shorter than that for the host to the remote storage system because the links introduce extra latency. To achieve the optimal performance of active-active storage systems, avoid using the paths to the remote storage system unless absolutely necessary.

If VM 1A sends a write request for volume A to Dorado B, the write latency increases by twice the round-trip time (RTT) of the replication link between the storage systems: the request must first travel from host A to Dorado B, and Dorado B must then synchronize the write to Dorado A. If the RTT of the replication link is 3 ms, this adds 6 ms to the write latency, which is unacceptable for mission-critical enterprise applications.
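The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb stated in the paragraph; the function name and the default of two round trips are assumptions taken from that description, and the sketch treats the host-to-remote-array hop as costing the same RTT as the replication link.

```python
def remote_write_penalty_ms(replication_rtt_ms: float, round_trips: int = 2) -> float:
    """Extra write latency (ms) when a host writes through the remote array.

    Assumption: one round trip for the host-to-remote-array request and one
    for the array-to-array synchronization, each costing the replication RTT.
    """
    if replication_rtt_ms < 0:
        raise ValueError("RTT must not be negative")
    return round_trips * replication_rtt_ms


# The example from the text: a 3 ms RTT adds 6 ms to each remote write.
print(remote_write_penalty_ms(3.0))
```

Writes sent over the local paths avoid the first of the two round trips, which is why the multipathing policy should keep I/O on the local storage system.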

OceanStor Dorado HyperMetro supports ALUA. When a host delivers a command to query paths, the storage system reports the Active/Optimized (AO) and Active/Non-Optimized (AN) paths. For HyperMetro in the same equipment room, Huawei UltraPath can set a load balancing policy, which configures all paths from the host to both storage systems as AO paths. I/Os are delivered to all of the paths in a round-robin way. For HyperMetro across two equipment rooms over distance, UltraPath sets the paths to the local storage system as AO paths and those to the remote storage system as AN paths. The host preferentially uses the AO paths for reads and writes, and uses the AN paths only when the local storage system is faulty. This is also the case when the host's Native Multipathing Plug-In (NMP) is used. OceanStor Dorado provides configuration items on DeviceManager to set the load balancing or local preference policy. By using the multipathing settings, the extra latency introduced by the replication link is avoided.
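The local-preference behavior described above can be modeled as follows. This is a simplified sketch, not the UltraPath or NMP implementation: the `Path` class, the `dispatch` function, and the path names are hypothetical, and real multipathing software tracks far more state than a single `alive` flag.

```python
from dataclasses import dataclass, replace
from itertools import cycle
from typing import List


@dataclass(frozen=True)
class Path:
    name: str          # hypothetical path label
    array: str         # storage system the path leads to
    state: str         # ALUA state reported by the array: "AO" or "AN"
    alive: bool = True


def usable_paths(paths: List[Path]) -> List[Path]:
    """Prefer live AO paths; fall back to live AN paths only if no AO path is left."""
    live = [p for p in paths if p.alive]
    optimized = [p for p in live if p.state == "AO"]
    return optimized or [p for p in live if p.state == "AN"]


def dispatch(paths: List[Path], io_count: int) -> List[str]:
    """Return the path chosen for each of io_count I/Os, round-robin over usable paths."""
    candidates = usable_paths(paths)
    if not candidates:
        raise RuntimeError("no live path to either storage system")
    rotation = cycle(candidates)
    return [next(rotation).name for _ in range(io_count)]


# Host A at site 1: local array paths are AO, remote array paths are AN.
host_a_paths = [
    Path("path-1", "Dorado A", "AO"),
    Path("path-2", "Dorado A", "AO"),
    Path("path-3", "Dorado B", "AN"),
    Path("path-4", "Dorado B", "AN"),
]
print(dispatch(host_a_paths, 4))

# Local array fails: I/O moves to the AN paths of the remote array.
after_failure = [replace(p, alive=False) if p.array == "Dorado A" else p for p in host_a_paths]
print(dispatch(after_failure, 3))
```

Marking all four paths as AO reproduces the same-equipment-room load balancing case, in which I/O rotates across both storage systems.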

Non-Uniform Access

Non-uniform access means that ESXi hosts at each site are connected only to the local storage system. Any ESXi host can access only its local storage system.

Figure 2 Non-Uniform access network

In the preceding figure, host A at site 1 only has links to Dorado A, and host B at site 2 only has links to Dorado B. Therefore, VMs on host A can only access Dorado A, and VMs on host B can only access Dorado B. The multipathing software can only identify paths to the local storage system, and all of the paths are Active/Optimized (AO) paths.


In HyperMetro configurations:
  • The two devices on the OceanStor Dorado arrays are both read/write accessible to hosts.
  • Hosts can read from and write to both devices in a HyperMetro pair.
  • The two devices in a HyperMetro pair share the same external device identity (geometry, device WWN).

This shared identity causes the two devices of the pair (LUN_1 and LUN_2) to appear to the host(s) as a single virtual device across the two arrays.

Arbitration server:

HyperMetro supports arbitration by pair or by consistency group. In the event of a link failure or other failure, HyperMetro provides two arbitration modes:
  • Static priority mode: This mode is mainly used where no third-party arbitration server is deployed. For each active-active pair or consistency group, you set one end as the preferred site and the other end as the non-preferred site.
    If the link between the storage arrays fails or the non-preferred site fails, LUNs at the preferred site remain accessible and those at the non-preferred site become inaccessible.
    If the preferred site fails, the non-preferred site is not accessible to hosts either.
  • Arbitration server mode: An independent physical or virtual machine is used as the arbitration device. It determines the type of failure and uses that information to choose which side of the device pair remains read/write accessible to the host. Arbitration server mode is the default option.

Note: The arbitration server can be built on a physical or virtual machine. For the operating system support matrix with Huawei storage, visit Huawei Support.
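The two arbitration modes can be summarized as a small decision function. This is a sketch under stated assumptions, not Huawei's arbitration logic: the enum and function names are hypothetical, and the arbitration-server branch (the surviving site keeps serving I/O, and the preferred site wins when only the replication link fails) is inferred from the tested scenarios in this article rather than from a specification.

```python
from enum import Enum
from typing import Set


class Fault(Enum):
    REPLICATION_LINK = "replication link between the arrays is down"
    PREFERRED_SITE = "preferred site is down"
    NON_PREFERRED_SITE = "non-preferred site is down"


class Mode(Enum):
    STATIC_PRIORITY = "static priority"
    ARBITRATION_SERVER = "arbitration server"


def accessible_sites(fault: Fault, mode: Mode) -> Set[str]:
    """Return which sites keep their LUNs read/write accessible after a fault."""
    if fault is Fault.PREFERRED_SITE:
        if mode is Mode.STATIC_PRIORITY:
            # Per the text: the non-preferred site does not become accessible.
            return set()
        # Assumption: the arbiter lets the surviving site continue.
        return {"non-preferred"}
    # Link fault or non-preferred-site fault: the preferred site keeps serving I/O.
    return {"preferred"}


for mode in Mode:
    for fault in Fault:
        print(f"{mode.value:18} | {fault.value:45} | {sorted(accessible_sites(fault, mode))}")
```

The only case in which the two modes differ in this model is a failure of the preferred site, which is the main reason to deploy an arbitration server.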

Recommendations and Limitations:

For LUNs that are shared among multiple hosts, make sure that the host LUN IDs are consistent across all hosts (see Setting LUN Allocations).

When using VMware NMP with ESXi 6.0, upgrading to ESXi 6.0 U2 is recommended. For details, see the VMware KB article Storage PDL responses may not trigger path failover in vSphere 6.0 (2144657).

It is also recommended to map the second device of the pair (LUN_2) to the ESXi host only after the HyperMetro pair has been configured successfully. You can use the rescan command to detect the new paths. UltraPath and VMware NMP also detect paths automatically, although there might be some delay. For instructions on updating this tunable, see the VMware KB article Changing the polling time for datastore paths (1004378).

For more in-depth information about HyperMetro, see the technical notes on Huawei Support.

A certified configuration of OceanStor Dorado All-Flash Storage is available and is listed in the VMware Compatibility Guide.
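The first recommendation above, consistent host LUN IDs across all hosts that share a LUN, lends itself to a simple check. The following is a minimal sketch that assumes the mappings have already been collected into a dictionary by some other means; the function name, host names, and WWN strings are all hypothetical placeholders.

```python
from typing import Dict


def inconsistent_lun_ids(mappings: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, int]]:
    """Find shared LUNs whose host LUN ID differs between hosts.

    mappings: host name -> {LUN WWN: host LUN ID}
    returns:  LUN WWN -> {host name: host LUN ID} for every inconsistent LUN
    """
    ids_by_lun: Dict[str, Dict[str, int]] = {}
    for host, luns in mappings.items():
        for wwn, lun_id in luns.items():
            ids_by_lun.setdefault(wwn, {})[host] = lun_id
    return {
        wwn: ids
        for wwn, ids in ids_by_lun.items()
        if len(set(ids.values())) > 1
    }


# Hypothetical inventory: "wwn-bbb" is mapped with different host LUN IDs.
inventory = {
    "esxi-a1": {"wwn-aaa": 1, "wwn-bbb": 2},
    "esxi-b1": {"wwn-aaa": 1, "wwn-bbb": 5},
}
print(inconsistent_lun_ids(inventory))
```

An empty result means every shared LUN is presented with the same host LUN ID on all hosts in the inventory.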

     

Tested Scenarios:

This table outlines the tested and supported failure scenarios when using a Huawei OceanStor Dorado All-Flash Storage cluster with VMware vSphere. In the scenarios that involve a loss of the links between sites, site A is the preferred site.

| Scenario | Operation | Observed VMware behavior (Uniform) | Observed VMware behavior (Non-uniform) |
|---|---|---|---|
| Cross-data-center VM migration | Migrate a VM from site A to site B | No impact | No impact |
| Physical server breakdown | Unplug the power supply for a host in site A | VMware High Availability fails over virtual machines to other available hosts | VMware High Availability fails over virtual machines to other available hosts |
| Single-link failure of physical server | Unplug the physical link that connects a host in site A to an FC switch | No impact | No impact |
| Storage failure in site A | Unplug the power supply for the storage system in site A | No impact | VMware High Availability fails over virtual machines to available site B hosts |
| All-link failure of storage in site A | Unplug all service links that connect site A's storage array to an FC switch | No impact | VMware High Availability fails over virtual machines to available site B hosts |
| All-link failure of all hosts in site A | Unplug all physical links that connect all hosts in site A to an FC switch | VMware High Availability fails over virtual machines to available site B hosts | VMware High Availability fails over virtual machines to available site B hosts |
| Failure of storage replication links | Unplug replication links between sites | No impact | Virtual machines on site B hosts are automatically powered off and then powered on on available site A hosts |
| Failure of storage management network | Unplug network cable from network port of host in site A | No impact | No impact |
| All-link failure between sites | Disconnect the DWDM links between sites | Virtual machines on site B hosts are automatically powered off and then powered on on available site A hosts | Virtual machines on site B hosts are automatically powered off and then powered on on available site A hosts |
| Failure of site A | Power off all devices in site A | VMware High Availability fails over virtual machines to available site B hosts | VMware High Availability fails over virtual machines to available site B hosts |
| Failure of site B | Power off all devices in site B | VMware High Availability fails over virtual machines to available site A hosts | VMware High Availability fails over virtual machines to available site A hosts |


Additional Information

Changing the polling time for datastore paths
LUNs are missing after upgrading the hosts to ESXi 6.5
Simplified Chinese version: Deploying a Metro Storage Cluster across two sites using Huawei OceanStor Dorado V3 HyperMetro and VMware vSphere