HPE Serviceguard for Linux Clustering (SGLX) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations (Partner Verified and Supported)



Article ID: 313453


Products

VMware vSphere ESXi

Issue/Introduction

This article provides information about the Partner Verified and Supported Products (PVSP) solution for Serviceguard for Linux Clustering (SGLX) with shared disks, provided and supported by HPE, on VMware vSphere 6.x.

Note: The PVSP policy implies that the solution is not directly supported by VMware. For issues with this configuration, contact HPE directly. See the Support Workflow on how partners can engage with VMware. It is the partner's responsibility to verify that the configuration functions with future vSphere major and minor releases, as VMware does not guarantee that compatibility with future releases is maintained.

Disclaimer: The partner products referenced in this article are software developed and supported by a partner. Use of these products is also governed by the end user license agreements of the partner. You must obtain the application, support, and licensing for using these products from the partner.

Please refer to the attached document in this KB for complete information regarding HPE Serviceguard support on vSphere 6.x.



    Environment

    VMware vSphere ESXi 6.7
    VMware vSphere ESXi 6.0
    VMware vSphere ESXi 6.5

    Resolution

    VMware vSphere 6.x provides flexibility and choice in architecting high-availability solutions using Linux as the guest Operating System (OS). VMware vSphere 6.x supports HPE Serviceguard for Linux (SGLX) with shared (clustered) disk resources by transparently passing SCSI-3 Persistent Reservation (SCSI3-PR) commands to the underlying storage, which a Serviceguard (SGLX) node (a VM participating in a Serviceguard cluster, further referenced as a VM node) requires to arbitrate access to a shared disk.

    Information in this article is applicable to configurations when VMs hosting nodes of a SGLX cluster (VM nodes) are located on different ESXi hosts – known as “Cluster-across-box (CAB)”. CAB provides High Availability (HA) both from the in-guest and vSphere environment perspective.

    Storage Array to Array Replication with RDM configuration is now supported.

    This KB assumes that all components underneath the SGLX cluster-based VMs are architected to provide high availability and conform to Hewlett Packard Enterprise (HPE) supportability as it relates to the in-guest configuration of this article. Refer to the section "Understanding Hardware configurations for Serviceguard for Linux" in the document Managing HPE Serviceguard for Linux, which is available at https://www.hpe.com/info/linux-serviceguard-docs


    HPE Serviceguard for Linux (SGLX) Clustering with guest storage and high availability support (Partner Verified and Supported)

    Product Overview

    HPE Serviceguard for Linux® is certified for virtual machines created on VMware ESX/ESXi server running on industry-standard x86_64 servers. This section discusses the various ways a VMware VM can be deployed in a Serviceguard for Linux cluster. The description below gives an overview of the cluster configuration steps involving VMs from multiple hosts, as well as a combination of VMs and physical machines, to provide high availability (HA) for your applications. Reasonable expertise in the installation and configuration of HPE Serviceguard for Linux and ESX/ESXi Server, as well as familiarity with their capabilities and limitations, is assumed.
    Serviceguard for Linux provides several types of storage configurations: shared disk, storage array-based replication, and application or database native replication. The information in this article applies to configurations using shared disk.
    For more information about the database native replication and array-based replication configuration refer to the HPE Serviceguard for Linux documentations at https://www.hpe.com/info/linux-serviceguard-docs.

    Deployment Models for Serviceguard

    Serviceguard clusters can be configured with shared storage in the following deployment models:
    Clustering virtual machines across ESXi hosts: In this deployment model, a cluster can be formed with multiple guests hosted on multiple ESXi hosts, where only one guest from each ESXi host is used as a node in a cluster. In other words, one ESXi host can have multiple guests, all of which can be part of different clusters, but no two guests from the same ESXi host can belong to the same cluster. This model is also called Cluster-Across-Box (CAB).

    Figure 1. Cluster with one VM each from multiple hosts (Cluster-Across-Box)

    Clustering physical machines with virtual machine: In this deployment model, a combination of VMware guests and physical machines can be used as nodes in a Serviceguard cluster. Serviceguard is installed on the VMware guests and physical machines, and a cluster is formed among them.

    Figure 2.  Mix of physical machine and one VM from a host as cluster nodes (Hybrid Cluster)

    Storage array-based replication cluster or Metrocluster deployment model: VMware guests can also be used as cluster nodes in a storage array-based replication cluster or Metrocluster where the VMs are spanning across two different sites.

    Figure 3. Cluster with one VM each from multiple hosts with storage array based replication

    vSphere Version Support for Serviceguard

    For the complete list of supported operating systems, vSphere versions, certified configurations, ESX/ESXi Server, and storage with the listed version of HPE Serviceguard for Linux release, please refer to the latest version of “HPE Serviceguard for Linux Certification Matrix” document at https://www.hpe.com/info/linux-serviceguard-docs.
    VMware neither imposes any limitations nor requires a certification for applications using SGLX on a supported Linux OS platform. Therefore, any application running on a supported combination of vSphere and Linux OS is supported with no additional considerations. For the supported guest OS with VMware vSphere, refer to the VMware Compatibility Guide.

    VMware vSphere features support for Serviceguard

    The following VMware vSphere features are supported with SGLX cluster:
    • VMware HA.

    • DRS affinity rules must be applied to VMs participating in an SGLX cluster. When creating a DRS affinity rule, select Separate Virtual Machines.

    • For storage array-based replication configurations, DRS affinity rules must be configured so that VMware HA does not restart failed VMs across sites.

    • VMware DRS. See the support requirements for vMotion of VM nodes below.
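    The "Separate Virtual Machines" rule described above can also be scripted. The sketch below uses the open-source govc CLI; the cluster name SGLX-Cluster and the VM names sglx-node1/sglx-node2 are placeholder assumptions, and govc connection environment variables (GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD) are assumed to be set:

```shell
# Sketch: create a DRS anti-affinity ("Separate Virtual Machines") rule so
# that the two SGLX VM nodes never run on the same ESXi host.
# Names below are illustrative placeholders.
govc cluster.rule.create \
  -cluster SGLX-Cluster \
  -name sglx-anti-affinity \
  -enable \
  -anti-affinity sglx-node1 sglx-node2
```

The same rule can of course be created interactively in the vSphere Client; the CLI form is convenient when many clusters must be configured identically.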

    The following VMware vSphere features are NOT supported with HPE Serviceguard for Linux cluster:

    • Live Storage vMotion.

    • Fault Tolerance (FT).

    • N-Port ID Virtualization (NPIV).

    • Mixed versions of ESXi hosts in a vSphere cluster for production use.

    Support Requirements for vMotion of a Serviceguard VM

    Live vMotion (whether user- or DRS-initiated) of VM nodes is supported in vSphere 6.x under the following requirements:

    • The VM virtual hardware version (“VM compatibility”) must be version 11 (vSphere 6.0) or later.
    • During Live vMotion, the virtual machine can stall for a few seconds. In a Serviceguard environment, if the stall time exceeds the MEMBER_TIMEOUT interval, the guest cluster considers the node to be down, which in turn can lead to unnecessary cluster reformation and failover of applications. In most configurations the default value of 14 seconds should be sufficient to make cluster operation resilient to a temporary stall induced by vMotion. However, host or network resource constraints in the vSphere cluster can extend the VM stall time beyond the value specified for MEMBER_TIMEOUT. In that case MEMBER_TIMEOUT must be increased to a value sufficient to allow the migration to succeed without an SGLX application failover. For more information on how to change the MEMBER_TIMEOUT value, refer to the recent version of the Managing HPE Serviceguard for Linux A.12.xx.xx document available at https://www.hpe.com/info/linux-serviceguard-docs.

    • Cluster nodes must be configured with a DRS affinity rule to prevent hosting more than one node of a Serviceguard cluster on any given ESXi host. This means that you must configure N+1 ESXi hosts for successful vMotion operation, where N is the number of Serviceguard nodes.

    • Additionally, for storage array-based replication configurations, DRS affinity rules must be configured to keep the VMs of a Serviceguard cluster on different hosts within the sites, that is, to disallow vMotion of cluster nodes across sites.

    • The vMotion network must use a physical network with a transmission speed of 10GE (Ten Gigabit Ethernet) or more. vMotion over a 1GE (One Gigabit Ethernet) network is not supported.

    • Shared disk resources must be accessible by the destination ESXi host.

    • A supported version of the Linux OS must be used (refer to the latest version of the “HPE Serviceguard for Linux Certification Matrix” for the list of supported Linux OS versions at https://www.hpe.com/info/linux-serviceguard-docs).
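    The MEMBER_TIMEOUT adjustment mentioned in the requirements above can be sketched as follows. On a real cluster node the ASCII file would be exported with cmgetconf and applied with cmapplyconf; here an illustrative two-line fragment stands in for the exported file, and the 30-second target value is an example only (in the ASCII file the value is specified in microseconds):

```shell
# Sketch: adjusting MEMBER_TIMEOUT in an exported Serviceguard cluster
# ASCII file. On a cluster node the file would come from:
#   cmgetconf -c <cluster_name> cluster.ascii
# The fragment below is an illustrative stand-in for that export.
cat > cluster.ascii <<'EOF'
CLUSTER_NAME    sglx_cluster
MEMBER_TIMEOUT  14000000
EOF

# Raise MEMBER_TIMEOUT from the 14 s default to 30 s (microseconds):
sed -i 's/^MEMBER_TIMEOUT.*/MEMBER_TIMEOUT  30000000/' cluster.ascii
grep MEMBER_TIMEOUT cluster.ascii

# On a cluster node, validate and apply the edited configuration:
#   cmcheckconf -C cluster.ascii && cmapplyconf -C cluster.ascii
```

Consult the Managing HPE Serviceguard for Linux document for the authoritative procedure and value ranges before changing this parameter.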

    Supported VMware vSphere features are shown in Table 1.

    Table 1. Supported VMware features with various storage configurations with SGLX

    Storage Configuration type                              | VMware HA | vMotion | VMware DRS
    --------------------------------------------------------|-----------|---------|-----------
    Raw Device Mapping                                      | Yes       | Yes     | Yes
    Raw Device Mapping with storage array based replication | Yes       | Yes     | Yes
    VMware vSphere® Virtual Volumes (vVols)                 | Yes       | Yes     | Yes


    Configuration of VMware virtual machine for HPE Serviceguard

    For detailed steps and instructions on how to configure, manage, and administer a virtual machine using VMware ESX/ESXi Server, please refer to the VMware document “vSphere Virtual Machine Administration”. The resources allocated to the VMs depend on the requirements of the applications deployed on the VMs, as well as the resources available to the host. For configuration limitations, rules and restrictions, sizing, and capacity planning, please refer to “Configuration Maximums for VMware vSphere® 6.7” or later from VMware.
    For all provisioning guidelines, please refer to the VMware documentation. For resource planning, please follow the recommendation specified by the OS or application. 

    Multipathing configuration: Path Selection Policy (PSP)

    RDM physical mode, vVols: Round Robin PSP is fully supported. Fixed and MRU PSPs can be used as well, but Round Robin PSP might provide better performance by utilizing all available paths to the storage array.

    Note:

    • While choosing a PSP, consult your storage array vendor for the recommended/supported PSP. For more information, see the Storage/SAN Compatibility Guide.
    • RDM solutions are only supported with vSphere native multipathing plugin (NMP).

    Network configurations

    To avoid a single point of failure, HPE Serviceguard for Linux recommends that you deploy a highly available network configuration via bonding or NIC teaming of network interface cards and/or redundant heartbeat and data networks.
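    As one possible in-guest bonding setup, the sketch below creates an active-backup bond with NetworkManager. The interface names (ens192, ens224), IP address, and connection names are illustrative assumptions and will differ per environment:

```shell
# Sketch: active-backup NIC bond inside the guest using nmcli.
# Interface names, IP addresses, and connection names are placeholders.
nmcli con add type bond con-name bond0 ifname bond0 \
      bond.options "mode=active-backup,miimon=100"
nmcli con add type ethernet con-name bond0-ens192 ifname ens192 master bond0
nmcli con add type ethernet con-name bond0-ens224 ifname ens224 master bond0
nmcli con mod bond0 ipv4.addresses 192.168.10.11/24 ipv4.method manual
nmcli con up bond0
```

Equivalent redundancy can also be provided at the vSphere layer with NIC teaming on the virtual switch; follow the Serviceguard documentation for which layer to use for heartbeat networks.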

    Storage Configurations

    If you choose to deploy the application in shared storage mode, the data must be accessible from all cluster nodes. When using VMware guests as cluster nodes, iSCSI, Fibre Channel (FC), and FCoE devices can be used as shared storage.

    Storage Configurations for cluster of virtual machines across ESXi hosts

    You can create a Serviceguard cluster that consists of two or more virtual machines on two or more ESXi hosts. Use this method for production deployments. Supported shared storage configurations are Raw Device Mapping (RDM) in physical compatibility mode, Raw Device Mapping with storage array based replication in physical compatibility mode, and vVols.

    The following sections describe the various storage protocols and configurations for presenting shared storage to a VM.

    Storage Protocols

    With a shared storage configuration, Serviceguard supports FC, FCoE, and native iSCSI as storage protocols. Refer to Table 2 for information about the protocols supported with different shared storage configurations.

    Lock LUN Configuration

    In split-brain scenarios, a cluster arbitrator is used to prevent multiple instances of the cluster from running different incarnations of an application against the same disks. This tie-breaker is known as a cluster lock. The cluster lock is implemented either by means of a lock LUN or a quorum server.

    The cluster lock LUN is a special piece of storage (known as a partition) that is shareable by all nodes in the cluster. When a node obtains the cluster lock, this partition is marked so that other nodes will recognize the lock as “taken.” In a VMware environment, Serviceguard supports the use of either a partitioned disk or a whole LUN as a lock LUN.

    Cluster lock LUN configuration is not supported for storage array based replication configurations; use a quorum server instead.


    For more information on how to configure and use the lock LUN as a cluster lock, refer to the recent version of the Managing HPE Serviceguard for Linux A.12.xx.xx document available at https://www.hpe.com/info/linux-serviceguard-docs.
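    For illustration, the lock LUN is declared per node in the cluster ASCII configuration file. The sketch below writes a hypothetical fragment; the node names and the /dev/sdc1 device path are placeholders, and the same shared device must resolve identically on every node (consult the Managing HPE Serviceguard for Linux document for the exact syntax):

```shell
# Sketch: per-node CLUSTER_LOCK_LUN entries as they might appear in a
# Serviceguard cluster ASCII file. Node names and device path are
# illustrative placeholders only.
cat > lock-lun-fragment.ascii <<'EOF'
NODE_NAME          sglx-node1
  CLUSTER_LOCK_LUN /dev/sdc1
NODE_NAME          sglx-node2
  CLUSTER_LOCK_LUN /dev/sdc1
EOF
cat lock-lun-fragment.ascii
```

On a real cluster these entries live in the full cluster ASCII file, which is then validated with cmcheckconf -C and applied with cmapplyconf -C.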

    Perennial Reservations

    VMware recommends implementing perennial reservations for all ESXi hosts hosting VM nodes with pRDMs. See https://kb.vmware.com/s/article/1016106 for more details. The content in that KB article is applicable to HPE Serviceguard for Linux in addition to the other clustering solutions listed.
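    The perennial reservation setting referenced above is applied per device, per host, with esxcli. The naa identifier below is a placeholder; substitute the actual device ID of each clustered RDM and repeat on every ESXi host in the cluster:

```shell
# Sketch: mark a clustered RDM device as perennially reserved on an ESXi
# host so that host boot/rescan does not stall probing the reserved LUN.
# The naa ID is a placeholder.
esxcli storage core device setconfig \
      -d naa.60000000000000000000000000000001 \
      --perennially-reserved=true

# Verify the flag took effect ("Is Perennially Reserved: true"):
esxcli storage core device list \
      -d naa.60000000000000000000000000000001 | grep -i "Perennially Reserved"
```

The setting is persistent across reboots and must be applied on each host that can see the shared device.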

    Virtual SCSI Controllers

    1. Mixing non-shared and shared disks.
      Mixing non-shared and shared disks on a single virtual SCSI adapter is not supported. For example, if the operating system disk (for example, /dev/sda) is attached to SCSI0:0, the first clustered disk must be attached to SCSI1:0. A VM node of a Serviceguard cluster has the same virtual SCSI controller maximum as an ordinary VM - up to four (4) virtual SCSI controllers.

    2. Virtual disk SCSI IDs must be consistent across all VMs hosting nodes of the same Serviceguard cluster. For example, if Hard disk 1 is added with SCSI ID 1:1 as a shared disk on node1, then this disk must be added with the same SCSI ID (1:1) on all the cluster nodes.

    3. The vNVMe controller is not supported for clustered and non-clustered disks (for example, a boot disk must NOT be placed on a vNVMe controller).

    4. Multi-writer flag (MWF) must NOT be used.

    5. Use VMware Paravirtual SCSI controller for the best performance.

    For the best performance, distribute disks evenly across as many SCSI controllers as possible and use the VMware Paravirtual (PVSCSI) controller, which provides better performance with lower CPU usage and is the preferred way to attach clustered disk resources.
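    To illustrate points 1, 2, and 5 above, a hypothetical .vmx excerpt for one VM node might place a shared RDM on a dedicated PVSCSI controller (SCSI1) with physical bus sharing; the same scsi1:1 placement would then be repeated on every VM node. The file name is a placeholder and the excerpt is illustrative, not prescriptive:

```
scsi1.present      = "TRUE"
scsi1.virtualDev   = "pvscsi"
scsi1.sharedBus    = "physical"
scsi1:1.present    = "TRUE"
scsi1:1.fileName   = "shared-disk1-rdmp.vmdk"
scsi1:1.deviceType = "scsi-hardDisk"
```

In practice these settings are made through the vSphere Client when adding the controller and the mapped raw LUN, rather than by editing the .vmx file directly.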

    vSphere options for presenting shared storage to a VM node are shown in Table 2.
     

    Table 2. Supported Storage Configuration Options

    Storage Configuration type                              | vSphere version       | Shared disk options | SCSI bus sharing | vSCSI Controller type                      | Storage Protocol / Technology | Multipathing Path Selection Policy Supported
    --------------------------------------------------------|-----------------------|---------------------|------------------|--------------------------------------------|-------------------------------|---------------------------------------------
    Raw Device Mapping                                      | vSphere 6.0, 6.5, 6.7 | RDM physical mode   | physical         | VMware Paravirtual (PVSCSI), LSI Logic SAS | FC, FCoE, iSCSI               | Round Robin, Fixed, MRU
    Raw Device Mapping with storage array based replication | vSphere 6.0, 6.5, 6.7 | RDM physical mode   | physical         | VMware Paravirtual (PVSCSI), LSI Logic SAS | FC, FCoE, iSCSI               | Round Robin, Fixed, MRU
    VMware vSphere® Virtual Volumes (vVols)                 | vSphere 6.7           | vVols               | physical         | VMware Paravirtual (PVSCSI), LSI Logic SAS | FC, FCoE, iSCSI               | Round Robin, Fixed, MRU

    Creating and Configuring VM for Serviceguard

    You can create a Serviceguard cluster that consists of two or more virtual machines on two or more ESXi hosts. Follow the steps mentioned below to create a virtual machine and configure shared storage to a VM.

    Creating Virtual Machine

    Follow the step-by-step procedure as described in the section “Creating Virtual Machine for Serviceguard Deployment” in the attached document titled "HPE Serviceguard for Linux Clustering (SGLX) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations" in this KB article.

    Configuring Storage to Virtual Machine

    Depending on the storage requirement, choose one of the following methods of shared storage configuration in vSphere.
    Raw Device Mapping - An RDM is a special mapping file in a VMFS volume that manages metadata for its mapped device. For a detailed description and configuration with RDM, refer to the section “Raw Device Mapping Configuration” in the attached document titled "HPE Serviceguard for Linux Clustering (SGLX) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations" in this KB article.

    Raw Device Mapping with storage array based replication - In this configuration, two different storage arrays are configured, one at each of the primary and secondary data centers. Data is replicated between the storage arrays by means of storage array-based replication. For a detailed description and configuration, refer to the section “RDM Configuration for storage array-based replication deployments” in the attached document titled "HPE Serviceguard for Linux Clustering (SGLX) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations" in this KB article.

    VMware vSphere® Virtual Volumes (vVols) - vVols is an integration and management framework that virtualizes SAN/NAS arrays. For detailed description and configuration with vVols refer to the section “VMware vSphere® Virtual Volumes (vVols)” in the attached document titled "HPE Serviceguard for Linux Clustering(SGLX) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations" in this KB article.

    In-guest shared storage configuration

    Maintaining in-guest storage options (such as iSCSI) is up to those implementing the solution and is not visible to ESXi. VMware fully supports a Serviceguard cluster configuration using in-guest iSCSI (software-initiated) initiators, provided that all other aspects of the configuration meet the documented and supported HPE Serviceguard cluster configuration. Using this configuration in VMware virtual machines is similar to using it in physical environments.
    Note: Only iSCSI devices exposed using the iSCSI software initiator are supported with Serviceguard for Linux clustering.
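    An in-guest software iSCSI attach with open-iscsi typically follows the discovery/login sequence sketched below. The target portal IP and IQN are placeholders; multipathing and CHAP authentication, if required, are additional steps not shown:

```shell
# Sketch: attach an in-guest software iSCSI LUN using open-iscsi.
# Portal address and target IQN are illustrative placeholders.
iscsiadm -m discovery -t sendtargets -p 192.168.20.50:3260   # discover targets
iscsiadm -m node -T iqn.2002-03.com.example:shared-lun \
         -p 192.168.20.50:3260 --login                       # log in to target
iscsiadm -m session                                          # confirm session
```

Once logged in, the LUN appears as a normal SCSI block device in the guest and can be configured as Serviceguard shared storage like any other disk.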

    HPE support policies for a virtualized deployment of Serviceguard for Linux

    For more information about HPE support for deployments of SGLX software, refer to the HPE Serviceguard for Linux Home Page.

    Note:

    HPE Serviceguard, when run with the DLS method (Dynamic Linked Storage) and no shared disks, can run into issues where HPE Serviceguard-initiated VM failovers clash with vSphere HA-triggered VM failovers. Specifically, this issue manifests as a failure reported by HPE Serviceguard when it tries to attach the RDM disk to the secondary VM.

    To prevent this issue, it is recommended to disable vSphere HA for the HPE Serviceguard protected VMs.

    Steps to Disable vSphere HA in HPE Serviceguard protected VMs: 

    1. In the vSphere Client, browse to the vSphere HA cluster.

    2. Click the Configure tab.

    3. Under Configuration, select VM Overrides and click Add.

    4. Use the + button to select virtual machines to which to apply the overrides.

    5. Click Next.

    6. Under vSphere HA, put a check mark against Override for VM Restart Policy.

    7. In the drop-down menu, select Disabled.

    8. Click Finish.

    9. Repeat these steps for all VMs (primary and secondary) that are protected by Serviceguard.

    Details available in the below document:

    https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.avail.doc/GUID-CFD74742-26EA-4BED-A4FC-4E8F50A46C83.html

    Along with the above steps, there could still be errors if vCenter re-registers these restart-disabled VMs on a different host when the original host goes down. To prevent this from occurring, set the HA advanced option "das.reregisterRestartDisabledVMs" to "false".

    You can follow the steps from this page to set the HA advanced option: https://kb.vmware.com/s/article/2033250


    Additional Information

    • For more information about HPE Serviceguard for Linux, refer to the recent version of the Managing HPE Serviceguard for Linux A.12.xx.xx document available at https://www.hpe.com/info/linux-serviceguard-docs.
    • For the complete list of supported operating systems, vSphere versions, certified configurations, ESX/ESXi Server, and storage with the listed version of HPE Serviceguard for Linux release, please refer to the latest version of “HPE Serviceguard for Linux Certification Matrix” document at https://www.hpe.com/info/linux-serviceguard-docs.
    Disclaimer: VMware is not responsible for the reliability of any data, opinions, advice, or statements made on third-party websites. Inclusion of such links does not imply that VMware endorses, recommends, or accepts any responsibility for the content of such sites.

    Attachments

    HPE Serviceguard for Linux Clustering (SGLX) with shared disks on VMware vSphere 6.x