I/Os might fail on some paths of a device, or datastore creation might fail, if there is a flaky path
Article ID: 345238
Updated On:
Products
VMware vSphere ESXi
Issue/Introduction
This article helps resolve I/O errors and datastore creation failures caused by a flaky path.
Symptoms:
I/Os might fail on some paths of a device because of errors from a faulty switch.
Datastore creation on a device that has one flaky path fails with the following error:
The "Create VMFS datastore" operation failed for the entity with the following error message. An error occurred during host configuration. Operation failed, diagnostics report: Unable to create Filesystem, please see VMkernel log for more details: Failed to create VMFS on device

The vmkernel.log will report messages as below:

2018-08-27T06:32:09.964Z cpu40:1001390104)NMP: nmp_ThrottleLogForDevice:3781: H:0x7 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL. cmdId.initiator=0x43064aecc6c0 CmdSN 0x9a
2018-08-27T06:32:09.964Z cpu40:1001390104)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.600601603b303c009271dacce40de811" state in doubt; requested fast path state update...
2018-08-27T06:32:09.964Z cpu40:1001390104)ScsiDeviceIO: SCSICompleteDeviceCommand:3294: Cmd(0x45a2ddcbd080) 0x2a, CmdSN 0x9a from world 1001393007 to dev "naa.600601603b303c009271dacce40de811" failed H:0x7 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0.
2018-08-27T06:32:09.976Z cpu40:1001390104)vmw_psp_rr: psp_rrCommandComplete:1972: Setting path vmhba4:C0:T1:L15 as flaky, cmd=0x45a2ddcbd080
2018-08-27T06:32:09.976Z cpu40:1001390104)vmw_psp_rr: psp_rrCommandComplete:1990: Setting eval time for path vmhba4:C0:T1:L15 as 10sec, cmd=0x45a2ddcbd080
2018-08-27T06:32:09.989Z cpu40:1001393041 opID=313e2b0b)ScsiHandle: SCSIOpenNamedDevice:752: handle=0x0x43064aecc6c0 (naa.600601603b303c009271dacce40de811 part 1) is read-only
2018-08-27T06:32:09.989Z cpu40:1001393041 opID=313e2b0b)LVM: ProbeDeviceInt:9585: Failed to detect if device <naa.600601603b303c009271dacce40de811:1> is a snapshot: Device does not contain a logical volume
2018-08-27T06:32:09.989Z cpu40:1001393041 opID=313e2b0b)LVM: InitDevice:10462: LVMProbeDevice failed on (3444138432, naa.600601603b303c009271dacce40de811:1): Device does not contain a logical volume
2018-08-27T06:32:09.989Z cpu40:1001393041 opID=313e2b0b)FSS: Create:2311: Failed to format LVM/VMFS on device 'naa.600601603b303c009271dacce40de811:1': Storage initiator error
2018-08-27T06:32:09.989Z cpu40:1001393041 opID=313e2b0b)FSS: Create:2373: Failed to create FS on dev [naa.600601603b303c009271dacce40de811:1]
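
When checking whether a host is affected, it can help to pull the flaky path names out of vmkernel.log automatically. The following is a minimal sketch, not a VMware tool: the function name `find_flaky_paths` and the regular expression are assumptions derived only from the `vmw_psp_rr` message format shown in the excerpt above, and other ESXi builds may word the message differently.

```python
import re

# Assumed pattern, modeled on the "Setting path ... as flaky" message
# in the log excerpt above; adjust it if your build logs differently.
FLAKY_RE = re.compile(
    r"vmw_psp_rr: psp_rrCommandComplete:\d+: Setting path (\S+) as flaky"
)


def find_flaky_paths(log_lines):
    """Return paths reported as flaky, in first-seen order, without duplicates."""
    paths = []
    for line in log_lines:
        match = FLAKY_RE.search(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


# Two lines copied from the excerpt above; only the first marks a path as flaky.
sample_lines = [
    "2018-08-27T06:32:09.976Z cpu40:1001390104)vmw_psp_rr: "
    "psp_rrCommandComplete:1972: Setting path vmhba4:C0:T1:L15 as flaky, "
    "cmd=0x45a2ddcbd080",
    "2018-08-27T06:32:09.976Z cpu40:1001390104)vmw_psp_rr: "
    "psp_rrCommandComplete:1990: Setting eval time for path vmhba4:C0:T1:L15 "
    "as 10sec, cmd=0x45a2ddcbd080",
]

flaky_paths = find_flaky_paths(sample_lines)
print(flaky_paths)
```

To scan a real log, pass an open file object (for example `open("vmkernel.log")`) instead of `sample_lines`; the function only iterates over lines.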
Guest operating systems in VMs remount their filesystems in read-only mode or become unresponsive.
In this scenario, the vmkernel.log reports FCPIO_DATA_CNT_MISMATCH, with messages similar to the following:

2017-05-28T23:47:40.089Z cpu27:32809)<3>fnic : 2 :: hdr status = FCPIO_DATA_CNT_MISMATCH
2017-05-28T23:47:40.089Z cpu19:35295)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x28 (0x43a600ef10c0, 33236) to dev "naa.60060e80101e1970058be1670000002a" on path "vmhba2:C0:T1:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
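
To correlate failed commands with a specific device and path, the NMP failure line can be split into its fields. The sketch below is an illustration under assumptions: `parse_nmp_failure`, `has_data_count_mismatch`, and the regular expression are hypothetical helpers written against the single message layout shown in this article, not an official log schema.

```python
import re

# Assumed layout, taken from the nmp_ThrottleLogForDevice line in this article:
#   Cmd <opcode> (...) to dev "<device>" on path "<path>" Failed: H:<h> D:<d> P:<p>
NMP_FAIL_RE = re.compile(
    r'Cmd (0x[0-9a-fA-F]+) \([^)]*\) to dev "([^"]+)" on path "([^"]+)" '
    r"Failed: H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)"
)


def parse_nmp_failure(line):
    """Return the fields of an NMP failure line as a dict, or None if no match."""
    match = NMP_FAIL_RE.search(line)
    if match is None:
        return None
    opcode, device, path, host, dev_status, plugin = match.groups()
    return {
        "opcode": opcode,
        "device": device,
        "path": path,
        "host_status": host,
        "device_status": dev_status,
        "plugin_status": plugin,
    }


def has_data_count_mismatch(line):
    """Return True if the line carries the fnic FCPIO_DATA_CNT_MISMATCH status."""
    return "FCPIO_DATA_CNT_MISMATCH" in line


# Lines copied from the excerpt in this article.
fnic_line = (
    "2017-05-28T23:47:40.089Z cpu27:32809)<3>fnic : 2 :: "
    "hdr status = FCPIO_DATA_CNT_MISMATCH"
)
nmp_line = (
    "2017-05-28T23:47:40.089Z cpu19:35295)NMP: nmp_ThrottleLogForDevice:3298: "
    'Cmd 0x28 (0x43a600ef10c0, 33236) to dev "naa.60060e80101e1970058be1670000002a" '
    'on path "vmhba2:C0:T1:L0" Failed: H:0x7 D:0x0 P:0x0 '
    "Possible sense data: 0x0 0x0 0x0. Act:EVAL"
)

failure = parse_nmp_failure(nmp_line)
print(has_data_count_mismatch(fnic_line), failure)
```

Grouping the parsed results by `path` shows whether the failures cluster on one path, which is the pattern this article describes.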
If the device is claimed by the PSP_RR policy with its default settings, path switching happens every 1000 I/Os. Therefore, if the path currently in use is marked as flaky, the policy does not switch to another working path until 1000 I/Os succeed on that path.
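
The effect of counting only successful I/Os can be illustrated with a toy model. This is a sketch of the behavior as described above, not the actual vmw_psp_rr implementation: the function `ios_until_switch`, its parameters, and the assumed failure pattern (every other I/O fails on the flaky path) are all hypothetical.

```python
from itertools import cycle


def ios_until_switch(io_results, iops_limit=1000):
    """Count I/Os issued on the current path before the toy policy switches.

    io_results is an iterable of booleans, one per I/O on the current path
    (True means the I/O succeeded). Only successful I/Os advance the counter,
    mirroring the behavior described above.
    Returns (issued, failed), or None if the limit is never reached.
    """
    succeeded = 0
    failed = 0
    for ok in io_results:
        if ok:
            succeeded += 1
        else:
            failed += 1
        if succeeded == iops_limit:
            return succeeded + failed, failed
    return None


# Healthy path: every I/O succeeds, so the switch happens after 1000 I/Os.
healthy = ios_until_switch(cycle([True]))

# Flaky path (assumed pattern): every other I/O fails, so the policy stays
# on the bad path twice as long and 1000 I/Os fail before it switches.
flaky = ios_until_switch(cycle([False, True]))

print(healthy)  # (1000, 0)
print(flaky)    # (2000, 1000)
```

In this model, the worse the path behaves, the longer the policy stays on it, which matches the symptom of sustained I/O failures on a single flaky path.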
Resolution
This issue is resolved in the 6.7 U1 release. Refer to the release notes.