
ESXi host non-responsive due to RDM reservation conflicts


Article ID: 382177


Products

VMware vSphere ESXi

Issue/Introduction

The ESXi host has become non-responsive in vCenter, and the logs show "0x18" (reservation conflict) SCSI status codes for the RDM LUNs.
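A quick way to confirm the symptom from the ESXi shell (a minimal sketch, assuming SSH or console access; the D:0x18 device-status field in vmkernel.log denotes a reservation conflict):

    # Count reservation-conflict completions recorded in the vmkernel log
    grep -c "D:0x18" /var/log/vmkernel.log

    # Inspect the most recent conflicts to identify the affected devices
    grep "D:0x18" /var/log/vmkernel.log | tail -20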


Environment

ESXi 7.x

ESXi 8.x

 

Cause

The issue occurs when virtual machines participating in a clustering solution, such as WSFC or Red Hat High Availability Cluster, use shared RDMs with SCSI reservations across hosts, and a virtual machine on another host is the active cluster node holding the SCSI reservation. When this happens, hostd can become exhausted, causing the ESXi host to go into a non-responsive state in vCenter.
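When the host is already unmanageable from vCenter, a hedged first check from the ESXi shell is whether the hostd management daemon is still responding (a sketch assuming SSH or console access; the agents may stall again until the reserved LUNs are marked perennially reserved as described below):

    # Check whether hostd is still running
    /etc/init.d/hostd status

    # Restart the management agents if hostd is hung; they can hang again
    # while rescanning reserved RDM LUNs until the resolution below is applied
    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart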


Resolution


Mark the LUNs as perennially reserved:

  1. Determine which RDM LUNs are part of the clustering solution (WSFC, Red Hat High Availability Cluster, etc.). From the vSphere Client, select a virtual machine that has a mapping to the clustered RDM devices.
  2. Edit your virtual machine settings and navigate to your Mapped RAW LUNs. In this example, Hard disk 2:


     
  3. The Physical disk field identifies the device in use as the RDM by its VML ID.

    Take note of the VML ID, which is a globally unique identifier for your shared device.
     
  4. Identify the naa.id for this VML ID using this command:  esxcli storage core device list

    For example:

    esxcli storage core device list

    naa.6589cfc000000a17ac02aae02067e747
       Display Name: FreeNAS iSCSI Disk (naa.6589cfc000000a17ac02aae02067e747)
       Has Settable Display Name: true
       Size: 40960
       Device Type: Direct-Access
       Multipath Plugin: NMP
       Devfs Path: /vmfs/devices/disks/naa.6589cfc000000a17ac02aae02067e747
       Vendor: FreeNAS
       Model: iSCSI Disk
       Revision: 0123
       SCSI Level: 6
       Is Pseudo: false
       Status: degraded
       Is RDM Capable: true
       Is Local: false
       Is Removable: false
       Is SSD: false
       Is VVOL PE: false
       Is Offline: false
       Is Perennially Reserved: false
       Queue Full Sample Size: 0
       Queue Full Threshold: 0
       Thin Provisioning Status: unknown
       Attached Filters:
       VAAI Status: supported
       Other UIDs: vml.0100010000303035303536################################
       Is Shared Clusterwide: true
       Is SAS: false
       Is USB: false
       Is Boot Device: false
       Device Max Queue Depth: 128
       No of outstanding IOs with competing worlds: 32
       Drive Type: unknown
       RAID Level: unknown
       Number of Physical Drives: unknown
       Protection Enabled: false
       PI Activated: false
       PI Type: 0
       PI Protection Mask: NO PROTECTION
       Supported Guard Types: NO GUARD SUPPORT
       DIX Enabled: false
       DIX Guard Type: NO GUARD SUPPORT
       Emulated DIX/DIF Enabled: false

  5. Use the esxcli command to mark the device as perennially reserved:

    esxcli storage core device setconfig -d naa.id --perennially-reserved=true

    For example:

    esxcli storage core device setconfig -d naa.6589cfc000000a17ac02aae02067e747 --perennially-reserved=true

    Note: For vSphere 7.x, see the Change Perennial Reservation Settings section of the vSphere Storage Guide.
     
  6. To verify that the device is perennially reserved, run this command:

    esxcli storage core device list -d naa.id

    In the output of the esxcli command, search for the entry Is Perennially Reserved: true. This shows that the device is marked as perennially reserved.

    For example:

    esxcli storage core device list -d naa.6589cfc000000a17ac02aae02067e747

    naa.6589cfc000000a17ac02aae02067e747
       Display Name: FreeNAS iSCSI Disk (naa.6589cfc000000a17ac02aae02067e747)
       Has Settable Display Name: true
       Size: 40960
       Device Type: Direct-Access
       Multipath Plugin: NMP
       Devfs Path: /vmfs/devices/disks/naa.6589cfc000000a17ac02aae02067e747
       Vendor: FreeNAS
       Model: iSCSI Disk
       Revision: 0123
       SCSI Level: 6
       Is Pseudo: false
       Status: degraded
       Is RDM Capable: true
       Is Local: false
       Is Removable: false
       Is SSD: false
       Is VVOL PE: false
       Is Offline: false
       Is Perennially Reserved: true
       Queue Full Sample Size: 0
       Queue Full Threshold: 0
       Thin Provisioning Status: unknown
       Attached Filters:
       VAAI Status: supported
       Other UIDs: vml.0100010000303035303536################################
       Is Shared Clusterwide: true
       Is SAS: false
       Is USB: false
       Is Boot Device: false
       Device Max Queue Depth: 128
       No of outstanding IOs with competing worlds: 32
       Drive Type: unknown
       RAID Level: unknown
       Number of Physical Drives: unknown
       Protection Enabled: false
       PI Activated: false
       PI Type: 0
       PI Protection Mask: NO PROTECTION
       Supported Guard Types: NO GUARD SUPPORT
       DIX Enabled: false
       DIX Guard Type: NO GUARD SUPPORT
       Emulated DIX/DIF Enabled: false

     
  7. Repeat the procedure for each Mapped RAW LUN that is participating in the clustering solution (WSFC, Red Hat High Availability Cluster, etc.); a scripted sketch for handling several LUNs in one pass follows this list.

    Note: The configuration is permanently stored with the ESXi host and persists across restarts. To remove the perennially reserved flag, run this command:

    esxcli storage core device setconfig -d naa.id --perennially-reserved=false
  8. If the host does not become responsive again after the LUNs are marked perennially reserved, reboot the host to bring it back into vCenter.
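For hosts with many clustered RDMs, the per-device commands above can be scripted. A minimal sketch for the ESXi shell; the naa IDs below are placeholders and must be replaced with the devices identified in step 4:

    # Hypothetical device list: substitute the naa IDs of your clustered RDM LUNs
    for DEVICE in naa.6589cfc000000a17ac02aae02067e747 naa.000000000000000000000000000000a1; do
        # Mark the LUN as perennially reserved
        esxcli storage core device setconfig -d "$DEVICE" --perennially-reserved=true
        # Confirm the flag took effect
        esxcli storage core device list -d "$DEVICE" | grep "Is Perennially Reserved"
    done

Because the configuration is stored with each ESXi host, repeat the marking on every host that has visibility to the clustered LUNs.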

Additional Information

RDMs that are not perennially reserved can also cause long boot times. 
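To audit which devices on a host already carry the flag, the same esxcli listing can be filtered (a hedged one-liner, assuming devices use naa./eui./t10. identifiers as in the output above):

    # Show each device identifier together with its perennially-reserved state
    esxcli storage core device list | grep -E "^(naa|eui|t10)\.|Is Perennially Reserved"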

Related article: ESXi host takes a long time to start during rescan of RDM LUNs