A VCF administrator may observe intermittent, temporary storage path redundancy alarms on Cisco UCS hosts running a Cisco NFNIC driver version earlier than 5.0.0.46. The vobd log shows entries similar to the following:
2025-08-22T17:24:52.047Z In(14) vobd[2097959]: [scsiCorrelator] 2256264224099us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba3:C0:T22:L254 changed state from dead
2025-08-22T17:24:52.048Z In(14) vobd[2097959]: [scsiCorrelator] 2256264224111us: [vob.scsi.scsipath.pathstate.deadver2] scsiPath vmhba3:C0:T14:L254 changed state from on (device ID: naa.######################################)
2025-08-22T17:24:52.049Z In(14) vobd[2097959]: [scsiCorrelator] 2256500366829us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.###################################### degraded. Path vmhba3:C0:T14:L254 is down. Affected datastores: "Datastore-254".
2025-08-22T17:24:52.050Z In(14) vobd[2097959]: [scsiCorrelator] 2256264226989us: [vob.scsi.scsipath.pathstate.on] scsiPath vmhba3:C0:T22:L253 changed state from dead
2025-08-22T17:24:52.051Z In(14) vobd[2097959]: [scsiCorrelator] 2256264226999us: [vob.scsi.scsipath.pathstate.deadver2] scsiPath vmhba3:C0:T14:L253 changed state from on (device ID: naa.######################################)
2025-08-22T17:24:52.052Z In(14) vobd[2097959]: [scsiCorrelator] 2256500369418us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.###################################### degraded. Path vmhba3:C0:T14:L253 is down. Affected datastores: "Datastore-253".
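To see whether the alarms cluster on a single target, the degraded events can be grouped by adapter and target. The following is a minimal sketch, not a supported tool: the function name and regular expression are hypothetical and are based only on the sample vobd lines shown above, so the message format on other ESXi builds may differ.

```python
import re
from collections import defaultdict

# Matches the "Path vmhbaX:C0:Ty:Lz is down" part of a redundancy-degraded event.
# The pattern is an assumption derived from the sample lines in this article.
DEGRADED_RE = re.compile(
    r"\[esx\.problem\.storage\.redundancy\.degraded\].*?"
    r"Path (?P<target>vmhba\d+:C\d+:T\d+):L(?P<lun>\d+) is down"
)


def degraded_luns_by_target(lines):
    """Return {adapter:channel:target -> sorted list of LUN numbers reported down}."""
    grouped = defaultdict(set)
    for line in lines:
        match = DEGRADED_RE.search(line)
        if match:
            grouped[match.group("target")].add(int(match.group("lun")))
    return {target: sorted(luns) for target, luns in grouped.items()}
```

Many LUNs going down at the same moment behind one target (for example `vmhba3:C0:T14`) is consistent with a logout from that target rather than a problem with an individual LUN.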
ESXi (All versions)
Cisco NFNIC driver version 5.0.0.45 and older
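Whether a host falls in the affected range can be checked by comparing the installed driver version with 5.0.0.46. The sketch below is illustrative only: the function names are hypothetical, and it assumes the version string is dotted numerics optionally followed by a hyphenated build suffix, which is stripped before comparison.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Convert a dotted version such as '5.0.0.45' into a tuple of integers."""
    # Assumption: anything after the first hyphen is a build suffix and is ignored.
    base = version.strip().split("-", 1)[0]
    return tuple(int(part) for part in base.split("."))


# First driver version that contains the fix, per this article.
FIXED_VERSION = parse_version("5.0.0.46")


def is_affected(installed_version: str) -> bool:
    """Return True when the installed NFNIC driver is older than the fixed version."""
    return parse_version(installed_version) < FIXED_VERSION
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "5.0.0.100" would sort before "5.0.0.46".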
A race condition exists in which the Cisco NFNIC driver attempts to abort a command that the storage array has not yet received. When that abort reaches certain storage arrays (for example, Pure and Infinidat) and the array never received the original command, the array rejects the abort and returns an FCPIO_ITMF_REJECTED status:
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu54:2101289)VSCSI: 3518: handle 9024967534457021(GID:12477)(vscsi0:0):Added handle (refCnt = 3) to vscsiResetHandleList vscsiResetHandleCount = 1
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu46:2097827)VSCSI: 3772: handle 9024967534457021(GID:12477)(vscsi0:0):processing reset for handle ... state 1381192707
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu46:2097827)nfnic: <4>: INFO: fnic_abort_cmd: 3864: Abort cmd called for Tag: 0x302 issued time: 0 ms CMD_STATE: FNIC_IOREQ_CMD_PENDING CDB Opcode: 0x2a sc:0x45d9c3372980 flags: 0x3 lun: 251 target: 0x12540
2025-08-22T17:24:52.026Z Wa(180) vmkwarning: cpu46:2097827)WARNING: nfnic: <4>: fnic_abort_cmd: 3878: Abort for cmd tag: 0x302 in pending state
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu46:2097827)nfnic: <4>: INFO: fnic_taskMgmt: 2229: TaskMgmt: virt reset for CmdInitiator: 0x430e2590b0c0 Aborted :1 cmds
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu52:2099288)nfnic: <4>: INFO: fnic_fcpio_icmnd_cmpl_handler: 1865: io_req: 0x45d9a721a370 sc: 0x45d9c3372980 tag: 0x302 CMD_FLAGS: 0x53 CMD_STATE: FNIC_IOREQ_ABTS_PENDING ABTS pending hdr status: FCPIO_ABORTED scsi_status: 0x0$
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu52:2099288)nfnic: <4>: INFO: fnic_fcpio_itmf_cmpl_handler: 2385: fcpio hdr status: FCPIO_ITMF_REJECTED
Note that the "issued time" for the command is 0 ms, which indicates that the command the Cisco NFNIC driver is trying to abort had not yet reached the storage array, so the array has nothing to abort.
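The signature described above (an abort issued for a command with an "issued time" of 0 ms that is still in the pending state) can be searched for in the vmkernel log. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the regular expression is derived only from the sample `fnic_abort_cmd` line in this article.

```python
import re

# Pattern based on the sample fnic_abort_cmd line; other driver builds may log differently.
ABORT_RE = re.compile(
    r"fnic_abort_cmd: \d+: Abort cmd called for Tag: (?P<tag>0x[0-9a-fA-F]+) "
    r"issued time: (?P<issued_ms>\d+) ms CMD_STATE: (?P<state>\w+)"
    r".*?lun: (?P<lun>\d+) target: (?P<target>0x[0-9a-fA-F]+)"
)


def find_aborts_of_unsent_commands(lines):
    """Return aborts whose command shows 0 ms issued time and a pending state."""
    hits = []
    for line in lines:
        match = ABORT_RE.search(line)
        if not match:
            continue
        if int(match.group("issued_ms")) == 0 and match.group("state") == "FNIC_IOREQ_CMD_PENDING":
            hits.append(
                {
                    "tag": match.group("tag"),
                    "lun": int(match.group("lun")),
                    "target": match.group("target"),
                }
            )
    return hits
```

For the sample log above, this would report tag 0x302 on LUN 251 of target 0x12540.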
When the Cisco NFNIC driver receives the reject status, it tears down the fabric session with the target and logs back in as an error-handling measure, which causes the storage path to be lost and then recovered:
2025-08-22T17:24:52.026Z Wa(180) vmkwarning: cpu52:2099288)WARNING: nfnic: <4>: fnic_fcpio_itmf_cmpl_handler: 2417: abort reject received id: 0x302
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu52:2099288)nfnic: <4>: INFO: fnic_handle_itmf_reject: 2271: Abort Rejected ! sending TGT_EV_LOGOUT for 0x12540
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu8:2098200)nfnic: <4>: INFO: fnic_tport_event_handler: 2100: logging out from tport: 14 tport->fcid: 0x12540
2025-08-22T17:24:52.026Z In(182) vmkernel: cpu8:2098200)nfnic: <4>: INFO: fdls_tgt_logout: 1533: Sending logo to tid: 0x12540
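To confirm that a path loss was triggered by this sequence, each abort reject can be paired with the target logout that follows it. The sketch below assumes the two messages appear in order in the same log, as in the sample above; the function name and patterns are hypothetical and are not part of any Cisco or VMware tooling.

```python
import re

# Patterns taken from the sample fnic_fcpio_itmf_cmpl_handler and
# fnic_handle_itmf_reject lines in this article.
REJECT_RE = re.compile(r"abort reject received id: (0x[0-9a-fA-F]+)")
LOGOUT_RE = re.compile(r"sending TGT_EV_LOGOUT for (0x[0-9a-fA-F]+)")


def rejects_followed_by_logout(lines):
    """Return (command tag, target FCID) pairs where a reject led to a logout."""
    pairs = []
    pending_tag = None
    for line in lines:
        reject = REJECT_RE.search(line)
        if reject:
            # Remember the rejected tag until the matching logout line is seen.
            pending_tag = reject.group(1)
            continue
        logout = LOGOUT_RE.search(line)
        if logout and pending_tag is not None:
            pairs.append((pending_tag, logout.group(1)))
            pending_tag = None
    return pairs
```

A logout line with no preceding reject is ignored, so the result lists only logouts that can be attributed to a rejected abort.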
Cisco changed the NFNIC driver behavior in version 5.0.0.46 and later so that it no longer performs a Fabric Logout (LOGO) and re-login when it receives an FCPIO_ITMF_REJECTED status for a command that was aborted as part of a virtual reset. Upgrading to 5.0.0.46 or later should eliminate the path redundancy events that occur when the array rejects an abort:
Bug: CSCwn45550
Description: When a driver receives FCPIO_ITNF_REJECT in ESX virtual reset, it does LOGO. This issue is now resolved in the NFNIC driver version 5.0.0.46.
Link: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/release/notes/VIC/6-0/b-release-notes-for-cisco-ucs-vic-drivers-rel-6-0.html