Resolving SCSI reservation conflicts

Article ID: 323126

Products

VMware vSphere ESXi

Issue/Introduction

If an ESXi host shows any of the following symptoms, SCSI reservation conflicts may be preventing access to storage devices:

  • Unable to access one or more datastores that were previously accessible
  • A host unexpectedly loses connection to a datastore
  • Unable to extend or increase the size of a datastore after it has been expanded at the storage level
  • Commands like vdf never complete or fail with timeout errors
  • Virtual machines on the affected datastores become inaccessible
  • Error messages in logs containing RESERVATION CONFLICT or Error status H:0x0 D:0x18 P:0x0
  • The following error appears when attempting to access a LUN:

    Failure reading partition table from device [naa.xxxxx]: I/O error

These issues can occur even when the LUN is properly zoned, presented, and configured.
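
To check for the log symptom listed above without reading a whole log file, a filter along the following lines may help. This is a minimal sketch: the function names are hypothetical, and the patterns are limited to the two signatures quoted in this article (the literal text RESERVATION CONFLICT and a device status of D:0x18).

```shell
#!/bin/sh
# Hypothetical helpers: both read log text on stdin, so they can be run
# against a live log or a saved copy.

# Print only the lines that carry a reservation-conflict signature.
find_reservation_conflicts() {
    grep -E 'RESERVATION CONFLICT|D:0x18([^0-9a-fA-F]|$)'
}

# Print the number of matching lines (prints 0 when nothing matches).
count_reservation_conflicts() {
    grep -cE 'RESERVATION CONFLICT|D:0x18([^0-9a-fA-F]|$)'
}

# Example usage on a host (log path as used in the Resolution steps):
#   find_reservation_conflicts < /var/log/vmkernel.log
```

A non-zero count only confirms the symptom; it does not identify which host holds the reservation.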

Environment

vSphere ESXi 6.x
vSphere ESXi 7.x
vSphere ESXi 8.x

Cause

SCSI reservation conflicts occur when one host in a cluster has placed a SCSI reservation on a LUN and has not released it. This can happen due to:

  • A SAN switch reboot during a reservation operation
  • Path failures during storage operations
  • Networking issues interrupting storage communications
  • Hardware failures

Resolution

Follow these steps to identify and resolve SCSI reservation conflicts:

Verify that the LUN is detected by the ESXi host by running the command:

esxcfg-scsidevs -c


You should see output similar to:

Device UID            Device Type    Console Device   Size      Plugin  Display Name
eui.################  Direct-Access  /dev/sda         70007MB   NMP     Local FUJITSU Disk (eui.################)
mpx.vmhba#:C#:T#:L#   CD-ROM         /dev/sr0         0MB       NMP     Local HL-DT-ST CD-ROM (mpx.vmhba#:C#:T#:L#)
mpx.vmhba#:C#:T#:L#   Direct-Access  /dev/cciss/c0d0  34727MB   NMP     Local VMware Disk (mpx.vmhba#:C#:T#:L#)


If the LUN is not listed, rescan the storage adapters.
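
The check above can also be scripted. The following is a minimal sketch, assuming the device identifier is the first whitespace-separated column of the listing, as in the sample output; the function name lun_is_listed and the DEVICE_ID variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the given device ID appears in the first
# column of `esxcfg-scsidevs -c` style output read from stdin, 1 otherwise.
lun_is_listed() {
    # $1: device identifier, for example an naa. or eui. ID
    awk -v id="$1" '$1 == id { found = 1 } END { exit (found ? 0 : 1) }'
}

# Example usage on a host, with DEVICE_ID as a placeholder:
#   if esxcfg-scsidevs -c | lun_is_listed "$DEVICE_ID"; then
#       echo "device is visible to this host"
#   else
#       echo "device not listed; rescan the storage adapters"
#   fi
```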

  1. Check system logs for SCSI Reservation Conflict errors by examining the vmkernel log:

    cat /var/log/vmkernel.log

    Look for entries similar to:

    scsi0 (0,0,52) : RESERVATION CONFLICT
    NMP: nmp_ResetDeviceLogThrottling:3778: Error status H:0x0 D:0x18 P:0x0 Sense Data: 0x0 0x0 0x0 from dev "naa.############################"

    The Device Status code 0x18 indicates a RESERVATION CONFLICT.

  2. Identify which host has a reservation on the LUN by running the following command on each host in the cluster:

    esxcfg-info | egrep -B5 "s Reserved|Pending"

    Look for output showing a non-zero "Pending Reservations" value:

    |----Console Device....................../dev/sda
    |----DevfsPath........................../vmfs/devices/disks/vml.######################################################
    |----SCSI Level..........................6
    |----Queue Depth.........................128
    |----Is Pseudo...........................false
    |----Is Reserved.........................false
    |----Pending Reservations................ 1

    The host showing "Pending Reservations" with a value greater than 0 is holding the lock.

  3. Clear the reservation by performing a LUN reset on the affected device:

    vmkfstools --lock lunreset /vmfs/devices/disks/vml.######################################################

    Replace the device path with the actual path to your device. For NAA IDs, use:

    vmkfstools --lock lunreset /vmfs/devices/disks/naa.############################

  4. Verify that the lock was cleared by running the command from step 2 again:

    esxcfg-info | egrep -B5 "s Reserved|Pending"

    Ensure that "Pending Reservations" is now 0 and "Is Reserved" is false.

  5. Rescan storage on all hosts in the cluster by either:

    • Using the vSphere Client: Navigate to each host > Configure tab > Storage adapters > Click "Rescan All"
    • Or using the command line: esxcli storage core adapter rescan --all

  6. Verify that datastore access is restored and that operations such as extending datastores can now be completed.

  7. Optionally, use esxtop to monitor reservation conflicts:

    • Run esxtop
    • Press u to display the device list
    • Press f to add fields
    • Press H for Reserve State and press Enter
    • Look at RESV/s (reservations per second) and CONS/s (conflicts per second)
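
When step 2 has to be repeated on many hosts, it can help to reduce the esxcfg-info output to just the devices with a pending reservation. The following is a minimal sketch, assuming each device block prints its DevfsPath line before its Pending Reservations line, as in the sample output in step 2; the function name pending_reservations is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `esxcfg-info` style output on stdin and prints
# "<device path> <count>" for every device whose Pending Reservations
# value is greater than 0.
pending_reservations() {
    awk '
        /DevfsPath/ {
            path = $0
            # Drop the label and its dot leader, keeping only the path.
            sub(/^.*DevfsPath\.+/, "", path)
        }
        /Pending Reservations/ {
            n = $0
            # Drop the label, dot leader, and any padding before the number.
            sub(/^.*Pending Reservations\.+[ \t]*/, "", n)
            if (n + 0 > 0) print path, n
        }
    '
}

# Example usage on a host:
#   esxcfg-info | pending_reservations
```

The printed path is the one to pass to the LUN reset command in step 3. Empty output means this host reports no pending reservations.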

If the error persists after following these steps, contact Broadcom Support for further assistance.

 

Additional Information

In relation to step 2:

For an RDM LUN added as a disk to a VM cluster, we expect the host running the active node to hold a SCSI reservation against the LUN, as follows:

esxcfg-info -a | egrep "Is Reserved|Pending"
...
|----Is Reserved.........................true
|----Pending Reservations................ 0

On hosts running passive VM nodes, and on all remaining hosts, we expect to see:
esxcfg-info -a | egrep "Is Reserved|Pending"
...
|----Is Reserved.........................false
|----Pending Reservations................ 0

As such, for an RDM LUN with a stale SCSI reservation (which leads to SCSI reservation conflicts), the host that has not released the reservation is expected to show the first pattern ("Is Reserved" = true).
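
The patterns described in this article can be summarized as a small decision function. This is an illustrative sketch only: the function name and the labels pending, reserved, and clear are hypothetical, and whether a reserved result is expected or stale still depends on whether that host runs the active node.

```shell
#!/bin/sh
# Hypothetical helper: classify one device from its two esxcfg-info values.
#   $1: "Is Reserved" value (true or false)
#   $2: "Pending Reservations" count
classify_reservation() {
    if [ "$2" -gt 0 ]; then
        # Non-zero pending count: per step 2, this host is holding the lock.
        echo "pending"
    elif [ "$1" = "true" ]; then
        # Expected on the host running the active node of an RDM-backed
        # VM cluster; otherwise a candidate stale reservation.
        echo "reserved"
    else
        # Pattern expected on passive-node hosts and remaining hosts.
        echo "clear"
    fi
}

# Example usage:
#   classify_reservation true 0
```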

For more information, see https://bugzilla-vcf.lvn.broadcom.net/show_bug.cgi?id=3336065#c4 

--- 

Troubleshooting LUN connectivity issues on ESXi hosts

Performing a rescan of the storage on an ESXi host