QLogic's qedf Driver receives frequent RJT (Rejected) responses for SCSI Abort commands


Article ID: 391740


Updated On:

Products

VMware vSphere ESXi 7.0

VMware vSphere ESXi 8.0

Issue/Introduction

An administrator may observe SCSI command failures with a host status of H:0x7 when reviewing /var/log/vmkernel.log:

#######T18:34:17.715Z cpu0:#######)qedf:vmhba65:qedfc_cmd_state_handler:2132:Info: Returning: Success for oxid = 0x7b7
#######T18:34:17.715Z cpu113:#######)qedf:vmhba65:qedfc_fp_process_cqes:3527:Info: dummy cqe. xid: 0x7b7
#######T18:34:17.715Z cpu113:#######)qedf:vmhba65:qedfc_fp_process_cqes:3514:Info: Abort cqe. xid: 0x7b7
#######T18:34:17.715Z cpu113:#######)qedf:vmhba65:qedfc_process_abts_compl:1988:Info: ABTS response - RJT
#######T18:34:17.715Z cpu113:#######)qedf:vmhba65:qedfc_process_abts_compl:2044:Info: (6:2): Completing cmd with Host Error status (0x7), xid=0x7b7, SN=80000004, worldId=3ed719, refcnt=4 lba=0x1dbbff960 lbc=0x380 cmd 8a:0:0:0:0
#######T18:34:17.715Z cpu106:#######)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x8a (0x45e9df2a1f08, 4118297) to dev "eui.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" on path "vmhba65:C0:T6:L2" Failed:
#######T18:34:17.715Z cpu106:#######)NMP: nmp_ThrottleLogForDevice:3875: H:0x7 D:0x0 P:0x0 . Act:EVAL. cmdId.initiator=0x430dfa4a25c0 CmdSN 0x80000004
#######T18:34:17.715Z cpu106:#######)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "eui.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" state in doubt; requested fast path state update...
#######T18:34:17.715Z cpu106:#######)ScsiDeviceIO: 4176: Cmd(0x45e9df2a1f08) 0x8a, CmdSN 0x80000004 from world 4118297 to dev "eui.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" failed H:0x7 D:0x0 P:0x0

On closer inspection, the log messages from the qedf driver show that the SCSI command failed because the abort issued for it was rejected (RJT) by the storage array:

2025-03-19T18:34:17.715Z cpu113:36285448)qedf:vmhba65:qedfc_process_abts_compl:1988:Info: ABTS response - RJT <-- SCSI ABORT is being rejected by the storage array 

 

Environment

VMware vSphere ESXi 7.x

VMware vSphere ESXi 8.x

Cause

When a storage array rejects an ABORT command, which is a Task Management Function (TMF), it typically does so because the command the initiator tried to abort never reached the array, meaning it was dropped somewhere between the initiator and the target. The array cannot abort a command it never received, so it rejects the abort request.

Resolution

There are a few reasons why an RJT (Rejected) status is returned to the initiator, but all of them are Layer 1 (physical) issues:

  • Bad SFP
  • Low light level on the switch port
  • Switch port set to auto-negotiate instead of a forced (statically set) link speed

A helpful troubleshooting step is to determine whether the rejections occur on a single HBA or path, which narrows down which storage fabric needs to be reviewed:

$ grep RJT /var/log/vmkernel.log | grep -c vmhba65
4332
$ grep RJT /var/log/vmkernel.log | grep -c vmhba64
0
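
The per-adapter check above can also be done in a single pass, which helps when a host has more than two adapters. The following is a minimal sketch, not an official tool: the function name count_rjt_per_hba is hypothetical, and it assumes the log lines follow the "qedf:vmhbaNN:... ABTS response - RJT" layout shown in the Issue/Introduction section.

```shell
# Hypothetical helper: count "ABTS response - RJT" lines per adapter.
# Assumes each matching line contains "qedf:vmhbaNN:" as in the sample logs.
# Usage: count_rjt_per_hba /var/log/vmkernel.log
count_rjt_per_hba() {
    grep 'ABTS response - RJT' "$1" \
        | sed -n 's/.*qedf:\(vmhba[0-9]*\):.*/\1/p' \
        | sort | uniq -c | sort -rn
}
```

Each output line is a count followed by the adapter name, highest count first. An adapter with no rejections does not appear in the output at all, so a single adapter in the list points to the fabric behind that adapter.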