Analyzing SCSI Reservation conflicts on VMware Infrastructure 3.x, vSphere 4.x, vSphere 5.x and vSphere 6.0

Article ID: 328557

Products

  • VMware vSphere ESXi
  • VMware vSphere ESX 4.x - View
  • VMware vSphere ESX 5.x
  • VMware vSphere ESX 6.x
  • VMware vSphere ESXi 5.0
  • VMware vSphere ESXi 5.5

Issue/Introduction

  • ESX 3.x or ESX 4.x VMkernel logs contain these messages:

    SCSI: vm 1043: 5522: Sync CR at 64
    SCSI: vm 1043: 5522: Sync CR at 48
    SCSI: vm 1043: 5522: Sync CR at 32
    SCSI: vm 1043: 5522: Sync CR at 16
    SCSI: vm 1043: 5522: Sync CR at 0
    WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts
    WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba1:0:7. residual R 919, CR 0, ER 3
    WARNING: J3: 1970: Error committing txn to slot 0: SCSI reservation conflict

  • ESXi 4.x, ESXi 5.x, and ESXi 6.0 VMkernel logs contain these messages:

    Vendor: EMC Model: SYMMETRIX Rev: 5771
    Type: Direct-Access ANSI SCSI revision: 03
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    sdh : READ CAPACITY failed.
    status = c, message = 00, host = 0, driver = 00
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    scsi0 (0,0,52) : RESERVATION CONFLICT
    VMWARE: Device that would have been attached as scsi disk sdh at scsi0,channel 0, id 0, lun 52 Has not been attached because this path could not complete a READ command eventhough a TUR worked.result = 0x18 key = 0x0, asc = 0x0, ascq = 0x0
    VMWARE: Device that would have been attached as scsi disk sdh at scsi0,channel 0, id 0, lun 52 Has not been attached because it is a duplicate path or on a passive path
    scan_scsis starting finish
    scan_scsis done with finish
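
To gauge how often the conflicts occur and which devices they affect, messages like the samples above can be tallied per device. The following is a minimal sketch, not a VMware tool: the function name count_reservation_conflicts and both regular expressions are assumptions derived only from the sample lines shown above, and real logs may use different prefixes or formats.

```python
import re
from collections import Counter

# Both patterns are assumptions based solely on the sample messages above.
# re.search is used so that leading timestamps or prefixes are tolerated.
VMHBA_CONFLICT = re.compile(
    r"SCSI reservation conflict, rstatus \S+ for (vmhba\d+:\d+:\d+)"
)
LEGACY_CONFLICT = re.compile(
    r"(scsi\d+) \((\d+),(\d+),(\d+)\) : RESERVATION CONFLICT"
)


def count_reservation_conflicts(lines):
    """Return a Counter mapping a device label to its number of conflict messages."""
    counts = Counter()
    for line in lines:
        match = VMHBA_CONFLICT.search(line)
        if match:
            counts[match.group(1)] += 1
            continue
        match = LEGACY_CONFLICT.search(line)
        if match:
            host, channel, target, lun = match.groups()
            counts[f"{host} channel {channel} id {target} lun {lun}"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 "
        "for vmhba1:0:7. residual R 919, CR 0, ER 3",
        "scsi0 (0,0,52) : RESERVATION CONFLICT",
        "scsi0 (0,0,52) : RESERVATION CONFLICT",
        "sdh : READ CAPACITY failed.",
    ]
    for device, hits in count_reservation_conflicts(sample).most_common():
        print(device, hits)
```

A device that dominates the tally is a reasonable first candidate when reviewing which LUNs are shared by too many hosts or virtual machines.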

Resolution

Note: On storage arrays compatible with VMware vSphere Storage APIs for Array Integration (VAAI), Virtual Machine File System (VMFS) datastores use the Atomic Test and Set (ATS) primitive for locking. ATS locks only the on-disk metadata being updated instead of reserving the entire LUN, which makes it far superior to the SCSI reservation locking technique. For more information on VAAI, see Frequently Asked Questions for vStorage APIs for Array Integration (1021976).

VMFS uses SCSI reservations for two main categories of operation:

  • Category 1: VMFS datastore-level operations. These include opening, creating, resignaturing, and expanding or extending a VMFS datastore.
  • Category 2: Lock acquisition. These are locks on VMFS-specific metadata (called cluster locks) and locks on files (including directories). Operations in the second category occur much more frequently than operations in the first category.

These are some examples of VMFS operations that require locking metadata:

  • Creating a VMFS datastore
  • Expanding a VMFS datastore onto additional extents
  • Powering on a virtual machine
  • Acquiring a lock on a file
  • Creating or deleting a file
  • Creating a template
  • Deploying a virtual machine from a template
  • Creating a new virtual machine
  • Migrating a virtual machine with vMotion
  • Growing a file, for example, a snapshot file or a thin provisioned virtual disk
  • Zeroing the blocks of a zeroed thick virtual disk (for this disk type, the reservation is required only when zeroing the blocks)

If the VMware VMkernel log contains the messages described in the Issue/Introduction section, perform these procedures:
  1. If the VMware ESX version is:
    • 3.0.1, install Patch ESX-1002960: Fix for SCSI Reservation Conflict Issue. 
    • 3.0.2, install Patch ESX-1002974: Fix for SCSI Reservation Conflicts.
  2. Follow these steps to resolve potential sources of the reservation conflicts:
    • Try to serialize operations on the shared LUNs. If possible, limit the number of operations on different hosts that require a SCSI reservation at the same time.
    • Increase the number of LUNs and try to limit the number of ESX hosts accessing the same LUN.
    • Reduce the number of snapshots, as they cause a large number of SCSI reservations.
    • Do not schedule backups (VCB or console based) in parallel from the same LUN.
    • Schedule antivirus or operating system updates outside normal business hours so that they do not interfere with daily operations.
    • Try to reduce the number of virtual machines per LUN.
    • Determine which targets are being used to access the LUNs.
    • Check that all ESX hosts have the latest HBA firmware.
    • Ensure that each ESX host is running the latest BIOS (to avoid conflicts with HBA drivers).
    • Contact your SAN vendor for information on SP timeout values, performance settings, and storage array firmware.
    • Turn off third-party agents (storage agents) and RPMs not certified for ESX.
    • Disable any encryption re-keying on the Fibre Channel switches, as this can cause loss of access for more than 60 seconds.
    • Check for MSCS RDMs (the active node holds a permanent reservation). For more information, see ESX servers hosting passive MSCS nodes report reservation conflicts during storage operations (1009287).
    • Ensure that you have selected the correct Host Mode setting on the SAN array.
    • Rescan. LUNs removed from the system without rescanning can appear as locked.
    • Look for failed SPs. When an SP fails to release the reservation, either the release request did not come through (hardware, firmware, or pathing problems) or third-party applications running on the service console did not send the release. Busy virtual machine operations may also still be holding the lock.
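
The advice to serialize operations and to avoid parallel backups from the same LUN can be treated as a simple scheduling problem. The following is a minimal sketch under stated assumptions, not a VMware utility: the function name plan_backup_waves, the (virtual machine, LUN) job tuples, and the names in the example are all hypothetical. It groups jobs into waves so that no wave contains two jobs on the same LUN.

```python
from collections import defaultdict


def plan_backup_waves(jobs):
    """Group (vm, lun) jobs into waves so each wave touches any LUN at most once.

    Waves are meant to run one after another; jobs inside a wave may run in
    parallel because they all target different LUNs.
    """
    per_lun = defaultdict(list)
    for vm, lun in jobs:
        per_lun[lun].append(vm)

    waves = []
    for lun, vms in per_lun.items():
        # The i-th job on a LUN goes into wave i, so jobs that share a LUN
        # never land in the same wave.
        for index, vm in enumerate(vms):
            if index == len(waves):
                waves.append([])
            waves[index].append((vm, lun))
    return waves


if __name__ == "__main__":
    # Hypothetical inventory used only for illustration.
    jobs = [("vm-a", "lun1"), ("vm-b", "lun1"), ("vm-c", "lun2")]
    for number, wave in enumerate(plan_backup_waves(jobs), start=1):
        print("wave", number, wave)
```

The number of waves equals the largest number of jobs on any single LUN, which is also one way to see why reducing the number of virtual machines per LUN shortens the overall backup window.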

Note: SATA disks are not recommended in high I/O configurations, or in any configuration where SATA disks are in use and the above changes do not resolve the problem.

If your array is not listed above and none of the above points eliminate the log messages, open a case with Broadcom Support and note this KB Article ID (328557) in the problem description.