In a metro cluster storage array configuration, the active paths from the ESXi hosts to the LUNs are switched from one storage array to the other to facilitate a storage array upgrade.
VMs stall due to I/O failures.
VMware vSphere (all versions).
The switch of the active paths from one array to the other does not complete correctly.
The hosts then attempt to write to storage ports that are blocked or down.
This is evidenced in /var/log/vmkernel.log by I/O failures due to:
1) storage level aborts:
e.g. vmkernel: cpu6:2098436)ScsiDeviceIO: 4633: Cmd(0x45baa002fb40) 0x8a, CmdSN 0x3b4 from world 4956517 to dev "naa.######################" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xb 0x4b 0xd1
2) device write protected:
e.g. vmkernel: cpu6:2098436)ScsiDeviceIO: 4686: Cmd(0x45ba9ff35640) 0x2a, CmdSN 0x37f from world 4243126 to dev "naa.######################" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x7 0x27
I/O may also initially fail with device busy (D:0x8) returned by the array:
e.g. vmkernel: cpu1:2098431)ScsiDeviceIO: 4633: Cmd(0x45baa00e5940) 0x8a, CmdSN 0x361 from world 4956517 to dev "naa.######################" failed H:0x0 D:0x8 P:0x0
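As a rough triage aid, the three failure signatures above can be told apart by the device status (D:) and the first sense bytes. The following is a minimal Python sketch, not a VMware tool: the function name classify and the regular expression are illustrative assumptions derived only from the sample lines in this article, and the exact log format may differ between ESXi versions.

```python
import re

# Matches the ScsiDeviceIO failure lines quoted in this article.
# Assumption: the field layout is as in those samples.
LINE_RE = re.compile(
    r'ScsiDeviceIO: \d+: Cmd\(0x[0-9a-f]+\) (?P<opcode>0x[0-9a-f]+), .*? '
    r'to dev "(?P<dev>[^"]+)"+ failed '
    r'H:(?P<host>0x[0-9a-f]+) D:(?P<device>0x[0-9a-f]+) P:(?P<plugin>0x[0-9a-f]+)'
    r'(?: Valid sense data: (?P<sense>0x[0-9a-f]+(?: 0x[0-9a-f]+)*))?',
    re.IGNORECASE,
)


def classify(line):
    """Return a short label for a ScsiDeviceIO failure line, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    device = int(m.group("device"), 16)
    sense = [int(x, 16) for x in (m.group("sense") or "").split()]
    if device == 0x8:
        # SCSI status BUSY returned by the array.
        return "device busy"
    if device == 0x2 and sense:
        # SCSI status CHECK CONDITION: inspect the sense key.
        if sense[0] == 0xB:
            return "storage level abort"
        if sense[0] == 0x7 and sense[1:2] == [0x27]:
            return "device write protected"
    return "other"
```

For example, passing each line of /var/log/vmkernel.log to classify and counting the labels per device gives a quick view of which LUNs were affected and in what way. Lines that are not ScsiDeviceIO failures return None and can be skipped.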
Investigate at the array level why the path switchover did not complete correctly.