Applications hosted within VMs may sporadically freeze or become unresponsive.

Article ID: 399717

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Certain applications on some virtual machines are reported as unresponsive.
  • The storage in use is Fibre Channel.
  • At the time of the event, aborts similar to the following are logged in /var/log/vmkernel.log on the ESXi host:

YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:qlnativefcEhVirtualReset: aborting sp ############## handle 67d from RISC. serialNumber=################, Command timeout=57391 sec
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:qlnativefcEhVirtualReset: abortCommand mbx success.
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:C0:T1:L216: Virtual Abort succeeded -- ####### (1)
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:C0:T0:L216: VIRTUAL RESET ISSUED.
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:C0:T0:L216: Virtual Abort succeeded -- ####### (0)
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:C0:T1:L216: VIRTUAL RESET ISSUED.
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:Command aborted on target=0x02x, lun=0xd8 - SCSI command timeout counter incremented to 4876
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:qlnativefcEhVirtualReset: aborting sp ############## handle 393 from RISC. serialNumber=########, Command timeout=57368 sec
YYYY-MM-DDThh:mm:ss.###Z cpu21:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:qlnativefcEhVirtualReset: abortCommand mbx success.
YYYY-MM-DDThh:mm:ss.###Z cpu20:#######)qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:C0:T1:L216: Virtual Abort succeeded -- ####### (1)
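
To see which paths are affected, the "Virtual Abort succeeded" entries can be tallied per adapter and C:T:L path. The following is a minimal Python sketch, not a supported tool. It assumes the log lines follow the layout shown above, and the function name count_virtual_aborts is hypothetical.

```python
import re
from collections import Counter

# Assumption: abort lines look like the excerpt above, e.g.
# "...qlnativefc: vmhba#(3b:0.0): qlnativefcEhVirtualReset:####:C0:T1:L216: Virtual Abort succeeded -- ..."
ABORT_RE = re.compile(
    r"qlnativefc: (?P<hba>vmhba\S*?)\(.*?"
    r":C(?P<c>\d+):T(?P<t>\d+):L(?P<l>\d+): Virtual Abort succeeded"
)

def count_virtual_aborts(lines):
    """Count 'Virtual Abort succeeded' entries per (adapter, channel, target, LUN)."""
    counts = Counter()
    for line in lines:
        m = ABORT_RE.search(line)
        if m:
            key = (m["hba"], int(m["c"]), int(m["t"]), int(m["l"]))
            counts[key] += 1
    return counts
```

For example, calling count_virtual_aborts(open(path)) on a copy of vmkernel.log (path being wherever the log was saved) returns a Counter keyed by path. If the aborts cluster on one target or one LUN, that may help narrow the fabric and SAN investigation.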

Environment

  • ESXi 7.x
  • ESXi 8.x
  • ESX 9.x

Cause

The logs indicate that the aborts result from command timeouts. I/O issued by the driver is either failing to complete, or the driver is not receiving notification that it has completed.
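
To follow how the timeouts accumulate, the "SCSI command timeout counter incremented to" entries can be extracted in log order. This is a rough Python sketch that assumes the message format shown in the excerpt under Issue/Introduction. The function name timeout_counters is hypothetical, and the target and LUN fields are kept as raw strings because their encoding is not documented here.

```python
import re

# Assumption: timeout lines look like the excerpt in Issue/Introduction, e.g.
# "...Command aborted on target=0x02x, lun=0xd8 - SCSI command timeout counter incremented to 4876"
TIMEOUT_RE = re.compile(
    r"Command aborted on target=(?P<target>\S+), lun=(?P<lun>\S+) - "
    r"SCSI command timeout counter incremented to (?P<count>\d+)"
)

def timeout_counters(lines):
    """Return (target, lun, counter) tuples in the order they appear in the log."""
    found = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            found.append((m["target"], m["lun"], int(m["count"])))
    return found
```

A counter that keeps rising across the window of the application freeze is consistent with commands repeatedly timing out. In the excerpt, lun=0xd8 is 216 in decimal, which appears to correspond to L216 in the path token.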

Resolution

  1. Verify that the vmhba driver and firmware versions are supported according to the Broadcom Compatibility Guide.
  2. Investigate the fabric and SAN layers to determine the cause of the command timeouts, engaging the fabric and SAN vendors as necessary.