During the storage controller fail-over, NVMe over TCP adapter fails to connect to all the target ports, few of them fails to connect.
vmkernel.log
2024-10-07T18:47:32.933Z Wa(180) vmkwarning: cpu19:2107540)WARNING: NVMEIO:2233 Transport driver failed to submit command 0x45ba4fe29340, C: nqn.20xx-xx.com.purestorage:flasharray.2b5ef1330c7e1bc8#vmhba65#10.19.xx.xx:4420, Q: 0 <0xbad0001>.
2024-10-07T18:47:32.933Z Wa(180) vmkwarning: cpu19:2107540)WARNING: NVMEPSA:217 Complete vmkNvmeCmd: 0x45ba4fe29340, vmkPsaCmd: 0x45baf9819e80, cmdId.initiator=0x4539fac9b9e8, CmdSN: 0x0, status: 0x80d
2024-10-07T18:47:32.933Z In(182) vmkernel: cpu19:2107540)nvmetcp:nt_CommandInt:6365 [ctlr 258, queue 0] cmd 0x45ba4fe29340, queue not connected: Failure
2024-10-07T18:47:32.933Z Wa(180) vmkwarning: cpu19:2107540)WARNING: NVMEIO:2233 Transport driver failed to submit command 0x45ba4fe29340, C: nqn.2010-06.com.purestorage:flasharray.2b5ef1330c7e1bc8#vmhba65#10.xx.xx.69:4420, Q: 0 <0xbad0001>.
2024-10-07T18:47:32.933Z Wa(180) vmkwarning: cpu19:2107540)WARNING: NVMEPSA:217 Complete vmkNvmeCmd: 0x45ba4fe29340, vmkPsaCmd: 0x45baf9819e80, cmdId.initiator=0x4539fac9b9e8, CmdSN: 0x0, status: 0x80d
2024-10-07T18:47:32.933Z In(182) vmkernel: cpu19:2107540)nvmetcp:nt_CommandInt:6365 [ctlr 258, queue 0] cmd 0x45ba4fe29340, queue not connected: Failure
2024-10-07T18:47:32.933Z Wa(180) vmkwarning: cpu19:2107540)WARNING: NVMEIO:2233 Transport driver failed to submit command 0x45ba4fe29340, C: nqn.20xx-xx.com.purestorage:flasharray.2b5ef1330c7e1bc8#vmhba65#10.19.xx.xx:4420, Q: 0 <0xbad0001>.
2024-10-07T18:47:32.933Z Wa(180) vmkwarning: cpu19:2107540)WARNING: NVMEPSA:217 Complete vmkNvmeCmd: 0x45ba4fe29340, vmkPsaCmd: 0x45baf9819e80, cmdId.initiator=0x4539fac9b9e8, CmdSN: 0x0, status: 0x80d
2024-10-07T18:47:32.933Z In(182) vmkernel: cpu19:2107540)HPP: HppNvmeUpdateNamespaces:535: Marking paths dead - controller:nqn.20xx-xx.com.purestorage:flasharray.2b5ef1330c7e1bc8#vmhba65#10.19.xx.xx:4420
2024-10-07T18:47:32.933Z In(182) vmkernel: cpu19:2107540)nvmetcp:nt_CommandInt:6365 [ctlr 259, queue 0] cmd 0x45ba4fe29340, queue not connected: Failure
vSphere Esxi 7.x
vSphere Esxi 8.x
During controller reset, first thing we do is unconfigure IO queues and try to schedule helper requests which gets blocked as no helper worlds are available to service and the ones that are scheduled are blocked on the same semaphore that controller reset is holding.
This issue is fixed in Esxi 8.0 P06.
Reboot the affected hosts as a temporary fix.