This article describes the uniform and non-uniform topology for using Huawei HyperMetro of OceanStor Dorado All-Flash Storage with the VMware vSphere HA cluster, and provides information about deploying a Metro Storage Cluster across two sites using Huawei HyperMetro and VMware vSphere.
Huawei OceanStor Dorado All-Flash Storage is designed to carry enterprises' mission-critical services. The SmartMatrix full-mesh architecture ensures hardware redundancy and fast service switchover without interruption in the event of a fault.
The fully symmetric active-active architecture balances service loads on the entire storage system, simplifying service planning and scaling.
FlashLink®, designed for all-flash storage, guarantees consistent low latency.
The gateway-free HyperMetro feature provides an end-to-end active-active data center solution, which can be smoothly upgraded to a geo-redundant disaster recovery (DR) solution to achieve 99.9999% solution-level reliability (99.99999% for high-end products).
Variable-length deduplication and compression maximize the available capacity and reduce the operating expense (OPEX).
Huawei OceanStor Dorado All-Flash Storage meets the requirements of enterprise applications such as databases, virtual desktop infrastructure (VDI), virtual server infrastructure (VSI), and file sharing, helping the financial, manufacturing, and carrier industries smoothly transition to all-flash storage.
HyperMetro is Huawei's active-active storage solution that enables two storage systems to process services simultaneously, establishing a mutual backup relationship between them. If one storage system malfunctions, the other automatically takes over services without data loss or interruption. With HyperMetro deployed, services switch over between the storage systems automatically, which improves reliability, service continuity, and storage resource utilization. HyperMetro supports both single-data center (DC) and cross-DC deployment modes.
Dual-write ensures data redundancy on the storage systems. In the event of a storage system or DC failure, services are switched over to the other DC at zero RTO and RPO, ensuring service continuity without any data loss.
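The dual-write behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not HyperMetro's implementation: the `MirroredVolume` class, its method names, and the site labels "A" and "B" are hypothetical, and resynchronizing a recovered storage system is deliberately left out.

```python
class MirroredVolume:
    """Toy model of one volume mirrored across two storage systems.

    Hypothetical illustration only: names and behavior are assumptions,
    not Huawei's implementation.
    """

    def __init__(self, sites=("A", "B")):
        # One block map per storage system.
        self.replicas = {site: {} for site in sites}
        self.online = set(sites)

    def fail(self, site: str) -> None:
        """Simulate the failure of one storage system or data center."""
        self.online.discard(site)

    def write(self, block: int, data: bytes) -> None:
        """Return only after every online replica holds the data."""
        if not self.online:
            raise OSError("no storage system available")
        for site in self.online:
            self.replicas[site][block] = data

    def read(self, block: int, preferred: str) -> bytes:
        """Read from the preferred site, or from a surviving site."""
        if not self.online:
            raise OSError("no storage system available")
        site = preferred if preferred in self.online else sorted(self.online)[0]
        return self.replicas[site][block]


if __name__ == "__main__":
    vol = MirroredVolume()
    vol.write(0, b"payload")   # written to both A and B
    vol.fail("A")              # site A goes down
    # The same data is still readable from site B.
    print(vol.read(0, preferred="A"))
```

Because each write reaches both replicas before it returns, the surviving replica already holds every acknowledged write when one site fails, which is the intuition behind the zero-RPO claim.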
The VMSC solution supports both uniform access (cross-connected networking) and non-uniform access (parallel networking), depending on how the host accesses the storage system. This chapter describes the differences between the two access methods to help customers choose one that best suits their networks.
Uniform Access
Uniform access means that ESXi hosts at each site are connected to both the local and remote storage systems. Any ESXi host has the paths to access both the local and remote storage systems at the same time.
Figure 1 Uniform access network
In the preceding figure, host A at site 1 and host B at site 2 have links to both Dorado A and Dorado B, and there are replication links between Dorado A and Dorado B. If site 1 and site 2 are deployed in the same equipment room, the hosts at both sites reach the storage systems at a similar latency. If site 1 and site 2 are in different equipment rooms in the same metropolitan area, a host reaches its local storage system at a lower latency than the remote storage system, because the inter-site links introduce extra latency. To achieve the optimal performance of active-active storage systems, avoid using the paths to the remote storage system unless absolutely necessary.
If VM 1A on host A sends a write request for volume A through a path to Dorado B, extra latency equal to twice the round-trip time (RTT) of the replication link between the storage systems is added to the write latency. The write request must first be sent from host A to Dorado B, and Dorado B must then synchronize the write request to Dorado A. If the RTT of the replication link is 3 ms, this process increases the write latency by 6 ms, which is unacceptable for enterprises' mission-critical applications.
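The arithmetic in the preceding paragraph can be written as a short helper. This is a minimal sketch: the function name and the `round_trips` parameter are hypothetical, and the factor of 2 is taken from the statement above rather than from a measurement.

```python
def remote_write_penalty_ms(replication_rtt_ms: float, round_trips: int = 2) -> float:
    """Extra write latency when a host writes through the remote storage system.

    round_trips=2 follows the article's statement that the penalty equals
    twice the RTT of the replication link; it is an assumption, not a
    measured value.
    """
    if replication_rtt_ms < 0:
        raise ValueError("RTT must be non-negative")
    return round_trips * replication_rtt_ms


if __name__ == "__main__":
    # The article's example: a 3 ms replication RTT adds 6 ms.
    print(remote_write_penalty_ms(3.0))  # prints 6.0
```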
OceanStor Dorado HyperMetro supports ALUA. When a host delivers a command to query paths, the storage system reports the Active/Optimized (AO) and Active/Non-Optimized (AN) paths. For HyperMetro in the same equipment room, Huawei UltraPath can set a load balancing policy, which configures all paths from the host to both storage systems as AO paths. I/Os are delivered to all of the paths in a round-robin way. For HyperMetro across two equipment rooms over distance, UltraPath sets the paths to the local storage system as AO paths and those to the remote storage system as AN paths. The host preferentially uses the AO paths for reads and writes, and uses the AN paths only when the local storage system is faulty. This is also the case when the host's Native Multipathing Plug-In (NMP) is used. OceanStor Dorado provides configuration items on DeviceManager to set the load balancing or local preference policy. By using the multipathing settings, the extra latency introduced by the replication link is avoided.
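The path selection policy described above can be sketched as follows. This is an illustrative model only, not UltraPath or NMP code: the `Path` class, the `classify` and `usable_paths` functions, and the `same_room` flag are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum


class PathState(Enum):
    AO = "Active/Optimized"
    AN = "Active/Non-Optimized"


@dataclass(frozen=True)
class Path:
    name: str
    array_site: str       # site of the storage system this path leads to
    healthy: bool = True


def classify(path: Path, host_site: str, same_room: bool) -> PathState:
    """Load balancing (same room): every path is AO.
    Local preference (across rooms): only local paths are AO."""
    if same_room or path.array_site == host_site:
        return PathState.AO
    return PathState.AN


def usable_paths(paths, host_site: str, same_room: bool):
    """Return healthy AO paths; fall back to healthy AN paths if none remain."""
    healthy = [p for p in paths if p.healthy]
    optimized = [p for p in healthy
                 if classify(p, host_site, same_room) is PathState.AO]
    return optimized if optimized else healthy


if __name__ == "__main__":
    paths = [Path("hostA->DoradoA", "site1"), Path("hostA->DoradoB", "site2")]
    # Across rooms, only the local path is selected for I/O.
    print([p.name for p in usable_paths(paths, "site1", same_room=False)])
```

I/O would then be distributed across the returned paths, for example in round-robin order; the remote paths are returned only when no healthy local path remains.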
Non-Uniform Access
Non-uniform access means that ESXi hosts at each site are connected only to the local storage system. Any ESXi host can access only its local storage system.
Figure 2 Non-Uniform access network
In the preceding figure, host A at site 1 only has links to Dorado A, and host B at site 2 only has links to Dorado B. Therefore, VMs on host A can only access Dorado A, and VMs on host B can only access Dorado B. The multipathing software can only identify paths to the local storage system, and all of the paths are Active/Optimized (AO) paths.
This shared identity causes the LUN_1 and LUN_2 devices to appear to the host(s) as a single virtual device across the two arrays.
Arbitration server:
HyperMetro supports arbitration by pair or by consistency group. In the event of link or other failures, HyperMetro provides two arbitration modes:
Note: The arbitration server can be built on a physical or virtual machine.
Recommendations and Limitations:
For LUNs that are shared among multiple hosts, make sure that host LUN IDs are consistent across all hosts (Setting LUN Allocations). It is also recommended that you map the LUN_2 device to the ESXi hosts only after the HyperMetro pair has been configured successfully. You can run the rescan command to detect the new paths; otherwise, the paths are detected automatically by UltraPath or VMware NMP, although there might be some delay in automatic detection. For instructions on updating the polling time, refer to the VMware KB article Changing the polling time for datastore paths (1004378).
For more in-depth information about HyperMetro, see the technical notes on HUAWEI Support.
A certified configuration of OceanStor Dorado All-Flash Storage is available, and is listed in the VMware Compatibility Guide.
Tested Scenarios:
This table outlines the tested and supported failure scenarios when using a Huawei OceanStor Dorado All-Flash Storage Cluster for VMware vSphere:
Scenario | Operation | Observed VMware behavior (Uniform) | Observed VMware behavior (Non-uniform) |
Cross-data-center VM migration | Migrate a VM from site A to site B | No Impact | No Impact |
Physical server breakdown | Unplug the power supply for a host in site A | VMware High Availability fails over virtual machines to other available hosts | VMware High Availability fails over virtual machines to other available hosts |
Single-link failure of physical server | Unplug the physical link that connects a host in site A to an FC switch | No Impact | No Impact |
Storage failure in site A | Unplug the power supply for the storage system in site A | No Impact | VMware High Availability fails over virtual machines to available site B hosts |
All-link failure of storage in site A | Unplug all service links that connect site A's storage array to an FC switch | No Impact | VMware High Availability fails over virtual machines to available site B hosts |
All-link failure of all hosts in site A | Unplug all physical links that connect all hosts in site A to an FC switch | VMware High Availability fails over virtual machines to available site B hosts | VMware High Availability fails over virtual machines to available site B hosts |
Failure of storage replication links | Unplug replication links between sites | No Impact | Virtual machines on site B hosts are automatically powered off and then powered on on available site A hosts (Note: the preferred site is site A) |
Failure of storage management network | Unplug network cable from network port of host in site A | No Impact | No Impact |
All-link failure between sites | Disconnect the DWDM links between sites | Virtual machines on site B hosts are automatically powered off and then powered on on available site A hosts (Note: the preferred site is site A) | Virtual machines on site B hosts are automatically powered off and then powered on on available site A hosts (Note: the preferred site is site A) |
Failure of site A | Power off all devices in site A | VMware High Availability fails over virtual machines to available site B hosts | VMware High Availability fails over virtual machines to available site B hosts |
Failure of site B | Power off all devices in site B | VMware High Availability fails over virtual machines to available site A hosts | VMware High Availability fails over virtual machines to available site A hosts |