Oracle Linux VM disk reordering after reboot

Article ID: 437785

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • When the multi-writer flag is enabled on an Oracle Linux VM, the guest operating system experiences non-deterministic device naming upon reboot.

  • The screenshot below highlights two specific disk mappings, which can be validated with the lsblk command:

    1. sdc is currently mapped to a 110G disk.

    2. sdi is currently mapped to a 10G disk.

  • Because the multi-writer flag causes the OS to probe these disks in parallel during boot, the kernel assigns the /dev/sd* paths non-deterministically, so the mappings above can change after a reboot.
  • Configuration of the Linux VM with the multi-writer flag is as below:
  • Inside the guest OS, these disks are configured for Oracle Automatic Storage Management (ASM). In a clustered Oracle environment (such as Oracle RAC), multiple virtual machines must have concurrent read/write access to the same shared disks, and the hypervisor's multi-writer flag allows this sharing.
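
Because the /dev/sd* names can move between boots, it can help to look at each disk's persistent identifier next to its current kernel name. The following is a minimal sketch, assuming the guest populates /dev/disk/by-id (on VMware guests this typically requires the VM advanced setting disk.EnableUUID to be TRUE; treat that as an assumption to verify). The function name list_disk_ids is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print "persistent-id -> kernel-name" for every
# symlink in a by-id style directory (defaults to /dev/disk/by-id).
list_disk_ids() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (also covers an empty directory).
        [ -L "$link" ] || continue
        target=$(readlink "$link")          # e.g. ../../sdc
        printf '%s -> %s\n' "$(basename "$link")" "$(basename "$target")"
    done
}
```

Running list_disk_ids before and after a reboot should show the same identifiers on the left even when the sd* names on the right have changed. The command lsblk -o NAME,SIZE,SERIAL gives a similar view with disk sizes included.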

Environment

  • VMware vSphere ESXi 8.x

Cause

  • The observed behavior stems from the standard design of the Linux kernel's SCSI device discovery process. In environments that use multi-writer shared disks on Linux VMs, device node shifting is a common characteristic rather than an isolated failure.

  • The primary driver is the asynchronous nature of the guest operating system's device scanning. During a reboot, the kernel initiates hardware probes in parallel to optimize boot times. Because these discovery tasks do not follow a serialized, deterministic order, the kernel may not initialize SCSI targets in the same sequence every time. Consequently, a disk assigned to /dev/sdb during one boot cycle may be assigned to /dev/sdc in the next, as the device nodes are allocated on a "first-come, first-served" basis.

  • When the kernel initializes, it executes the SCSI scan code by broadcasting "probes" across the SCSI bus. Crucially, the kernel does not wait for a specific disk (e.g., "Disk A") to finalize its initialization before attempting to discover subsequent storage units (e.g., "Disk B"). 

  • In clustered environments such as Oracle RAC or other high-availability configurations, this phenomenon is commonly referred to as a device name persistence failure.
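
The "first-come, first-served" allocation described above can be illustrated with a toy model. This is a simplified sketch, not kernel code: the function assign_names and the disk labels are hypothetical, and the model only covers the first 26 names (sda through sdz).

```shell
# Toy model: hand out /dev/sdX names in the order the disks finish probing.
# Arguments are disk labels, listed in completion order.
assign_names() {
    index=0
    for disk in "$@"; do
        # Convert 0,1,2,... to a,b,c,... via an octal escape (97 is ASCII 'a').
        letter=$(printf "\\$(printf '%03o' $((97 + index)))")
        printf '/dev/sd%s = %s\n' "$letter" "$disk"
        index=$((index + 1))
    done
}

# Boot 1: disk-A answers first.
assign_names boot-disk disk-A disk-B
# Boot 2: disk-B answers first, so the two shared disks swap names.
assign_names boot-disk disk-B disk-A
```

In the first call disk-A receives /dev/sdb and disk-B receives /dev/sdc; in the second call the assignment is reversed, although nothing about the disks themselves has changed.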

Resolution

  • To ensure stability in clustered environments, engage the guest OS vendor (Oracle for Oracle Linux, Red Hat for RHEL) to configure udev rules for the oracleasm disks. Map these rules using the unique ID_SERIAL attribute of each individual Oracle disk so that the same device is consistently mapped to the correct disk string regardless of the kernel's boot sequence.
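
As an illustration of what such a rule might look like, the sketch below prints one udev rule line keyed on ID_SERIAL. Everything specific in it is an assumption for illustration: the helper name make_asm_rule, the serial value, the symlink name under /dev/oracleasm/, and the grid/asmadmin ownership. The actual values should come from the guest OS vendor and the Oracle installation.

```shell
# Hypothetical helper: emit one udev rule that matches a whole disk by its
# ID_SERIAL and creates a stable symlink /dev/oracleasm/<name>.
# $1 = ID_SERIAL value of the disk, $2 = symlink name to create.
make_asm_rule() {
    serial="$1"
    name="$2"
    printf 'KERNEL=="sd*", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}=="%s", SYMLINK+="oracleasm/%s", OWNER="grid", GROUP="asmadmin", MODE="0660"\n' "$serial" "$name"
}

# Example with a placeholder serial (not a real device):
make_asm_rule "EXAMPLE-SERIAL-0001" "data01"
```

The ID_SERIAL of a disk can be read with udevadm info --query=property --name=/dev/sdc. A rule of this form would typically be saved in a file under /etc/udev/rules.d/ (for example a file named 99-oracle-asmdevices.rules, an assumed name) and activated with udevadm control --reload-rules followed by udevadm trigger. The ASM disk string can then point at the /dev/oracleasm/ symlinks instead of /dev/sd* paths.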