Understanding the storage path failover sequence in VMware ESXi native multipathing
Article ID: 321364
Products
VMware vSphere ESXi
Issue/Introduction
This article describes the failover sequence of VMware ESXi native storage multipathing as it is logged in /var/log/vmkernel.log and /var/log/messages on the ESXi host.
Note: This document pertains specifically to storage path failover as implemented in the VMware multipathing module, the Native Multipathing Plug-in (NMP). For information about third-party multipathing modules, refer to your vendor's documentation.
Environment
VMware vSphere ESXi 7.0
VMware vSphere ESXi 8.0
Resolution
Note: The example scenario in this article uses a software (S/W) iSCSI initiator and a LUN with the identifier naa.60060160d5c12200ccd66fd74a81de11.
The VMware ESXi storage multipathing failover sequence is:
The connection along a given path is detected as down or offline. For example:
vmkernel: 188:04:24:16.970 cpu8:4288)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba33:CH:0 T:1 CN:0: iSCSI connection is being marked "OFFLINE"
The ESXi host stops its iSCSI session, and the NMP sees commands on that path fail with a host status of 0x1 (H:0x1). For example:
vmkernel: 188:04:24:16.970 cpu1:4286)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x41000716a200) to NMP device "naa.60060160d5c12200ccd66fd74a81de11" failed on physical path "vmhba33:C0:T1:L7" H:0x1 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x1.
Once the NMP receives this host status, it sends a TEST_UNIT_READY (TUR) command down that path to confirm that the path is down before initiating a failover. For example:
vmkernel: 188:04:24:16.970 cpu1:4286)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160d5c12200ccd66fd74a81de11": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
If this command also fails, the ESXi host's Path Selection Policy (PSP) activates the next path for the device (LUN). For example:
vmkernel: 188:04:24:16.989 cpu1:4131)vmw_psp_mru: psp_mruSelectPathToActivateInt: Changing active path from vmhba33:C0:T1:L7 to vmhba33:C0:T0:L7 for device "naa.60060160d5c12200ccd66fd74a81de11".
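The path-change message above can be picked apart programmatically when reviewing many log lines. The following is a minimal Python sketch, not a VMware tool: the function names parse_path and parse_path_change are hypothetical, and the regular expressions assume the exact message wording and the adapter:C<channel>:T<target>:L<LUN> path naming shown in this article's example.

```python
import re

# Runtime path names in the example follow adapter:C<channel>:T<target>:L<lun>.
PATH_RE = re.compile(
    r"(?P<adapter>vmhba\d+):C(?P<channel>\d+):T(?P<target>\d+):L(?P<lun>\d+)"
)

# Wording assumed from the psp_mruSelectPathToActivateInt example in this article.
CHANGE_RE = re.compile(
    r"Changing active path from (?P<old>\S+) to (?P<new>\S+) "
    r'for device "(?P<device>[^"]+)"'
)


def parse_path(name):
    """Split a runtime path name into adapter, channel, target and LUN."""
    m = PATH_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"unrecognized path name: {name!r}")
    return {
        "adapter": m["adapter"],
        "channel": int(m["channel"]),
        "target": int(m["target"]),
        "lun": int(m["lun"]),
    }


def parse_path_change(line):
    """Return device, old path and new path for a path-change line, else None."""
    m = CHANGE_RE.search(line)
    if m is None:
        return None
    return {
        "device": m["device"],
        "old": parse_path(m["old"]),
        "new": parse_path(m["new"]),
    }
```

Applied to the example line, this reports that the active path moved from target 1 to target 0 on the same adapter, channel, and LUN.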
This line indicates that the path change was successful. The NMP then retries the queued commands down the new path so that they complete successfully despite the failover.
The initial commands may not complete immediately after failover (for example, if the LUN still has pending reservations). The ESXi host sends a LUN reset if there is a pending SCSI reservation against the device or LUN. This ensures that the SCSI-2-based reservation from the previous initiator is broken, so that the ESXi host can resume I/O after failover. For example:
vmkernel: 188:04:24:17.974 cpu12:4108)WARNING: NMP: nmp_CompleteRetryForPath: Retry command 0x28 (0x41000716a200) to NMP device "naa.60060160d5c12200ccd66fd74a81de11" failed on physical path "vmhba33:C0:T0:L7" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0
This translates to:
Host Status = 0x0 = OK
Device Status = 0x2 = Check Condition
Plugin Status = 0x0 = OK
Sense Key = 0x6 = UNIT ATTENTION
Additional Sense Code/ASC Qualifier = 0x29/0x0 = POWER ON OR RESET OCCURRED
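The same translation can be automated for the H:/D:/P: fields and sense bytes that appear in these log lines. The sketch below is illustrative only: decode_status is a hypothetical helper, the lookup tables are deliberately partial (they cover the values seen in this article plus a few common SCSI codes), and the regular expression assumes the log wording shown in the examples.

```python
import re

# Matches e.g. 'H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0'
STATUS_RE = re.compile(
    r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+) "
    r"(Valid|Possible) sense data: "
    r"(0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)

# Partial tables; unknown values are reported as "UNKNOWN".
HOST_STATUS = {0x0: "OK", 0x1: "NO_CONNECT"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION",
                 0x8: "BUSY", 0x18: "RESERVATION CONFLICT"}
SENSE_KEY = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}


def decode_status(line):
    """Decode host/device/plugin status and sense bytes from one log line."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    host, device, plugin = (int(m.group(i), 16) for i in (1, 2, 3))
    key, asc, ascq = (int(m.group(i), 16) for i in (5, 6, 7))
    return {
        "host": (host, HOST_STATUS.get(host, "UNKNOWN")),
        "device": (device, DEVICE_STATUS.get(device, "UNKNOWN")),
        "plugin": plugin,
        # The log marks the sense bytes as "Valid" or "Possible".
        "sense_valid": m.group(4) == "Valid",
        "sense_key": (key, SENSE_KEY.get(key, "UNKNOWN")),
        "asc": asc,
        "ascq": ascq,
    }
```

Note that the sense bytes are only labeled "Valid" when the device returns a Check Condition; in the earlier H:0x1 example they are logged as "Possible" and should not be relied on.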
At this stage, the ESXi host can retry the next command in the queue:
Sep 10 13:11:18 laesx01 vmkernel: 188:04:24:17.974 cpu12:4108)WARNING: NMP: nmp_CompleteRetryForPath: Retry world on with device "naa.60060160d5c12200ccd66fd74a81de11" - retry the next command in retry queue
Sep 10 13:11:18 laesx01 vmkernel: 188:04:24:17.974 cpu12:4108)ScsiDeviceIO: 747: Command 0x28 to device "naa.60060160d5c12200ccd66fd74a81de11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0.
Sep 10 13:11:18 laesx01 vmkernel: 188:04:24:17.974 cpu11:4247)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world failover device "naa.60060160d5c12200ccd66fd74a81de11" - issuing command 0x41000706fa00
A successful path failover, with commands completing on the new path, is logged similar to:
vmkernel: 188:04:24:17.975 cpu12:4108)NMP: nmp_CompleteRetryForPath: Retry world recovered device "naa.60060160d5c12200ccd66fd74a81de11"
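To follow the sequence described so far across a longer log, each line can be tagged by the function name or phrase it contains. This is a rough Python sketch under stated assumptions: the marker strings are taken only from the examples in this article, the stage labels are informal descriptions rather than VMware terminology, and classify and summarize are hypothetical helper names.

```python
# (marker substring, informal stage label); first match wins, so the more
# specific "Retry world recovered" marker is listed before the generic
# nmp_CompleteRetryForPath marker that also appears on that line.
STAGES = [
    ("iscsivmk_StopConnection", "connection marked offline"),
    ("nmp_CompleteCommandForPath", "command failed on path"),
    ("nmp_DeviceRetryCommand", "I/O blocked, awaiting path state update"),
    ("psp_mruSelectPathToActivateInt", "active path changed"),
    ("Retry world recovered", "device recovered"),
    ("nmp_DeviceAttemptFailover", "failover retry issued"),
    ("nmp_CompleteRetryForPath", "retry activity on new path"),
]


def classify(line):
    """Return the informal stage label for one log line, or None."""
    for marker, label in STAGES:
        if marker in line:
            return label
    return None


def summarize(lines):
    """Return the stage labels found in the given lines, in log order."""
    labels = (classify(line) for line in lines)
    return [label for label in labels if label is not None]
```

Running summarize over the example lines in this article yields the stages in the order they are described, ending with "device recovered".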
Finally, because this is a software iSCSI-based example, you also see the session marked "ONLINE" again: