Understanding how paths to a storage/LUN device are marked as Dead

Article ID: 316417

Products

VMware vSphere ESXi

Issue/Introduction

This article explains the conditions that can lead to a path to a storage/LUN device being marked as Dead.

Environment

vSphere ESXi (All versions)

Resolution

A path to a storage/LUN device can be marked as Dead in these situations:
 
  • The ESXi/ESX storage stack determines that a path is Dead because the TEST_UNIT_READY command fails during probing
  • The ESXi/ESX storage stack receives a Host Status of 0x1 from an HBA driver, which indicates:
     
    • Remote array port has timed out
    • Remote array port has dropped from the fabric (RSCN)
    • Remote array port has closed IP connection
  • The ESXi/ESX storage stack marks paths as Dead after a permanent device loss (PDL) check condition is returned by the storage array
 

ESXi storage stack & SCSI command TEST_UNIT_READY

When a SCSI command fails to complete with a Host Status error (for example, H:0x5), the system sends SCSI command 0x0 (TEST_UNIT_READY) down the path on which the command failed. If the TEST_UNIT_READY command also fails, the path is marked as Dead.

Also, the Disk.PathEval routine issues a TEST_UNIT_READY command down every path every 300 seconds (the default interval). If this command fails to complete, the path is likewise marked as Dead. TEST_UNIT_READY is also issued down Dead paths on the same 300-second interval in case a path becomes available again, at which point the path is marked as On instead of Dead.
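
The probe interval corresponds to the Disk.PathEvalTime advanced setting. As a quick check (a minimal example from the ESXi shell; verify the option path on your build), the current value can be displayed with:

esxcli system settings advanced list -o /Disk/PathEvalTime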

In this logging example, TEST_UNIT_READY is initiated immediately in response to a failed command:
vmkernel: 116:03:44:19.039 cpu4:4100)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "sym.029025256531353837" state in doubt; requested fast path state update...
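
These probe events can be located by searching the vmkernel log for that message (a minimal example, assuming the default log location):

grep "state in doubt" /var/log/vmkernel.log
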
Causes:
  • Bad/dropped frames in the fabric
  • Noisy device(s) in the fabric throwing a large number of errors
  • Overloaded storage array controllers
  • Switch issues (buffer-to-buffer (B2B) credit exhaustion, memory leaks, bugs)
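
The resulting state of each path can be reviewed from the ESXi shell; for example (output abbreviated; values are illustrative):

esxcli storage core path list
   Runtime Name: vmhba3:C0:T0:L0
   Device: <device_id>
   State: dead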
 

ESXi storage stack receives a Host Status of 0x1

  • Remote array port has timed out

    If the connected array port has been unresponsive for longer than the HBA driver's device timeout, the HBA driver marks that device as missing. This causes the HBA driver to send a host status of NO_CONNECT (H:0x1) back to the ESXi/ESX storage stack layer, which in turn triggers an immediate failover to another good path (for the MRU and Fixed PSPs) or the removal of that path from the working-path list available to the Round Robin PSP. This affects paths to all LUNs presented through the array controller. (A vmkernel log search example for this host status appears after this list.)

    Logging example from an Emulex driver:

    <3> rport-13:0-3: blocked FC remote port time out: saving binding
    <3>lpfc820 0000:05:00.1: 1:(0):0203 Devloss timeout on WWPN <WWPN> NPort x270024 Data: x0 x7 x0
    <3> rport-13:0-2: blocked FC remote port time out: saving binding
    <3>lpfc820 0000:05:00.1: 1:(0):0203 Devloss timeout on WWPN <WWPN> NPort x270023 Data: x0 x7 x0


    Causes:
    • Array controller crashed or hung
    • Load on the array controllers is very high, which prevents timely responses to SCSI commands
    • Switch issues (B2B credit exhaustion, memory leaks, bugs)
  • Remote array port has dropped from the fabric

    If the array port drops gracefully from the fabric, it sends Registered State Change Notifications (RSCNs) or an N_Port Logout (LOGO) to connected devices or to devices in the same fabric zone. If an array port does not drop gracefully from the fabric (for example, it becomes unresponsive), connected devices must wait for their device-loss timeout (as described above in Remote array port has timed out). This affects paths to all LUNs presented through the array controller.

    Causes:
    • Array controller rebooted due to firmware upgrade
    • Array controller rebooted to clear an erroneous state
    • Array controller rebooted for other maintenance reasons
  • Remote array port has closed IP connection

    Just as an array controller can drop from a Fibre Channel fabric, the equivalent can occur on the iSCSI side, where it is seen as the closure of the iSCSI session's TCP/IP connection. This affects paths to all LUNs presented through the array controller.
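
In each of these cases, the affected I/O completes with the NO_CONNECT host status, so searching the vmkernel log for H:0x1 (a minimal example, assuming the default log location) is a quick way to confirm the condition:

grep "H:0x1" /var/log/vmkernel.log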
 

ESXi storage stack and permanent device loss (PDL)

ESXi 5.x introduced permanent device loss (PDL), which was designed to augment the all-paths-down (APD) condition. When an array returns specific check conditions for I/O issued to LUNs that are no longer presented to that initiator, the storage stack interprets the loss as permanent rather than transient. For more information, see Permanent Device Loss (PDL) and All-Paths-Down (APD) in vSphere ESXi (broadcom.com).

Logging example:
VMW_SATP_ALUA: satp_alua_issueCommandOnPath:661: Path "vmhba3:C0:T0:L0" (PERM LOSS) command 0xa3 failed with status Device is permanently unavailable. H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x25 0x0.
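
Whether a device has entered PDL can be confirmed from the ESXi shell; for example (output abbreviated; values are illustrative), a device in PDL typically reports a dead status:

esxcli storage core device list -d <device_id>
   Display Name: <display_name>
   Status: dead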


Additional Information