Error: VM hangs on the host. No actions such as vMotion or power off can be performed. FRAME DROP events are observed in the vmkernel logs

Article ID: 371539

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Storage latency is observed.

  • VMs are randomly affected: some experience outages, others degraded performance.

  • The FC controller link shows as down on the ESXi host.

  • vmkernel.log reports messages similar to the following:

 [YYYY-MM-DDTHH:MM:SS] cpu35:2115960)WARNING: iodm: vmk_IodmEvent:194: vmhba1: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

 [YYYY-MM-DDTHH:MM:SS] cpu2:2098292)ScsiDeviceIO: 4124: Cmd(0x45b981710948) 0x28, CmdSN 0x22a2 from world 2100337 to dev "naa.62#############0001" failed H:0x7 D:0x28 P:0x0
 [YYYY-MM-DDTHH:MM:SS] cpu6:2098268)ScsiDeviceIO: 4124: Cmd(0x45b9816c0e48) 0x28, CmdSN 0x22ac from world 2100337 to dev "naa.624#############ad2" failed H:0x7 D:0x28 P:0x0

[YYYY-MM-DDTHH:MM:SS] cpu0:2098288)WARNING: ScsiDeviceIO: 1513: Device naa.624a93###########0001 performance has deteriorated. I/O latency increased from average value of 914 microseconds to 18407 microseconds.

[YYYY-MM-DDTHH:MM:SS] cpu52:2099716)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.6000###############2d" state in doubt; requested fast path state update...
[YYYY-MM-DDTHH:MM:SS] cpu52:2099716)ScsiDeviceIO: 4115: Cmd(0x45baf4f44648) 0x28, CmdSN 0xbd from world 11753952 to dev "naa.6000###############2d" failed H:0x2 D:0x0 P:0x0
[YYYY-MM-DDTHH:MM:SS] cpu49:2098870)qlnativefc: vmhba1(c1:0.0): (1:4) Dropped frame(s) detected (29696 of 65536 bytes).
[YYYY-MM-DDTHH:MM:SS] cpu49:2098870)qlnativefc: vmhba1(c1:0.0): C0:T1:L4 - FCP command status: 0x15-0x0 (0x2) portid=0###0 oxid=0x357 cdb=280000 len=65536 rspInfo=0x0 resid=0x0 fwResid=0x7400 host status = 0x2 device status = 0x0
[YYYY-MM-DDTHH:MM:SS] cpu8:2099721)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x28 (0x4########c8, 11763169) to dev "naa.60002##############2d" on path "vmhba1:C0:T1:L4" Failed:
[YYYY-MM-DDTHH:MM:SS] cpu20:2098883)WARNING: iodm: vmk_IodmEvent:194: vmhba1: FRAME DROP event has been observed 54 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.
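
To see which adapter is affected and how often, the iodm FRAME DROP warnings shown above can be tallied per vmhba. The following is a minimal sketch, not a supported tool: the function name `summarize_frame_drops` and the sample list are hypothetical, and the pattern is modeled only on the two warning lines in this article.

```python
import re
from collections import defaultdict

# Pattern modeled on the iodm warning lines quoted in this article.
FRAME_DROP_RE = re.compile(
    r"iodm: vmk_IodmEvent:\d+: (?P<hba>vmhba\d+): FRAME DROP event has been "
    r"observed (?P<count>\d+) times in the last one minute"
)

# Sample lines copied from the excerpt above (timestamps are placeholders).
SAMPLE_LINES = [
    "[YYYY-MM-DDTHH:MM:SS] cpu35:2115960)WARNING: iodm: vmk_IodmEvent:194: vmhba1: "
    "FRAME DROP event has been observed 400 times in the last one minute. "
    "This suggests a problem with Fibre Channel link/switch!.",
    "[YYYY-MM-DDTHH:MM:SS] cpu20:2098883)WARNING: iodm: vmk_IodmEvent:194: vmhba1: "
    "FRAME DROP event has been observed 54 times in the last one minute. "
    "This suggests a problem with Fibre Channel link/switch!.",
]


def summarize_frame_drops(lines):
    """Tally FRAME DROP warnings per adapter (hypothetical helper)."""
    summary = defaultdict(
        lambda: {"warnings": 0, "reported_events": 0, "worst_minute": 0}
    )
    for line in lines:
        match = FRAME_DROP_RE.search(line)
        if match is None:
            continue  # not a FRAME DROP warning
        entry = summary[match.group("hba")]
        count = int(match.group("count"))
        entry["warnings"] += 1
        # Sum of the per-minute counts as reported; windows are assumed not to overlap.
        entry["reported_events"] += count
        entry["worst_minute"] = max(entry["worst_minute"], count)
    return dict(summary)


if __name__ == "__main__":
    for hba, stats in summarize_frame_drops(SAMPLE_LINES).items():
        print(hba, stats)
```

In practice the lines would come from a copy of vmkernel.log, for example `summarize_frame_drops(open(path))`. An adapter that dominates the tally is a reasonable first candidate when checking cables, SFPs, and switch ports.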

With a QLogic FC HBA:

 [YYYY-MM-DDTHH:MM:SS] In(182) vmkernel: cpu54:2098257)qlnativefc: vmhba2(12:0.0): qlnativefcStatusEntry:1919: (7:41) Dropped frame(s) detected (106496 of 131072 bytes).
 [YYYY-MM-DDTHH:MM:SS] In(182) vmkernel: cpu54:2098257)qlnativefc: vmhba2(12:0.0): qlnativefcStatusEntry:2067: C0:T7:L41 FCP command status: 0x15-0x0 (0x2) portid=bc0142 oxid=0x4ba cdb=280000 len=131072 rspInfo=0x0 resid=0x0 fwResid=0x1a000 host status = 0x2 device

 [YYYY-MM-DDTHH:MM:SS] In(182) vmkernel: cpu54:2098257)qlnativefc: vmhba2(12:0.0): qlnativefcStatusEntry:1919: (7:41) Dropped frame(s) detected (116736 of 131072 bytes).
 [YYYY-MM-DDTHH:MM:SS] In(182) vmkernel: cpu54:2098257)qlnativefc: vmhba2(12:0.0): qlnativefcStatusEntry:2067: C0:T7:L41 FCP command status: 0x15-0x0 (0x2) portid=bc0142 oxid=0x4bc cdb=280000 len=131072 rspInfo=0x0 resid=0x0 fwResid=0x1c800 host status = 0x2 device st$

qlnativefc: vmhba1(c1:0.0): (1:4) Dropped frame(s) detected
The message above shows that the ESXi host is experiencing Fibre Channel transport issues (dropped frames, failed commands) to device naa.xxxxxxxxxxxxxxxxxxxxx.
This leads to NMP path probing, SCSI command failures, and potential I/O impact on the associated datastores and VMs.
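
The qlnativefc "Dropped frame(s) detected" messages carry two byte counts, so the size of each shortfall can be extracted per target and LUN. The sketch below assumes that the pair in parentheses before the message, such as (1:4), is target:LUN, which is inferred from the matching C0:T1:L4 in the excerpts. The function name `parse_dropped_frames` and the field names are hypothetical.

```python
import re

# Tolerates the optional spaces seen in some copies of the log lines.
DROPPED_RE = re.compile(
    r"qlnativefc: (?P<hba>vmhba\d+)\s*\([^)]*\):.*?"
    r"\((?P<target>\d+):(?P<lun>\d+)\)\s*Dropped frame\s*\(s\) detected "
    r"\((?P<dropped>\d+) of (?P<total>\d+) bytes\)"
)

# Sample lines based on the excerpts in this article (timestamps are placeholders).
SAMPLE_LINES = [
    "[YYYY-MM-DDTHH:MM:SS] cpu49:2098870)qlnativefc: vmhba1(c1:0.0): (1:4) "
    "Dropped frame(s) detected (29696 of 65536 bytes).",
    "[YYYY-MM-DDTHH:MM:SS] In(182) vmkernel: cpu54:2098257)qlnativefc: vmhba2(12:0.0): "
    "qlnativefcStatusEntry:1919: (7:41) Dropped frame(s) detected (106496 of 131072 bytes).",
]


def parse_dropped_frames(lines):
    """Return one record per 'Dropped frame(s) detected' line (hypothetical helper)."""
    records = []
    for line in lines:
        match = DROPPED_RE.search(line)
        if match is None:
            continue
        dropped = int(match.group("dropped"))
        total = int(match.group("total"))
        records.append({
            "hba": match.group("hba"),
            "target": int(match.group("target")),  # assumed to be the target number
            "lun": int(match.group("lun")),        # assumed to be the LUN number
            "dropped_bytes": dropped,              # first number in the message
            "request_bytes": total,                # second number in the message
            "fraction": dropped / total,
        })
    return records


if __name__ == "__main__":
    for record in parse_dropped_frames(SAMPLE_LINES):
        print(record)
```

In the excerpts, the first byte count equals the fwResid value on the following FCP status line (for example, 0x7400 is 29696), which is why the sketch treats it as the number of bytes that did not arrive.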

  • Dropped frames may be reported in the output of the following command:
esxcli storage san fc events get
 
Example of output:
 
[YYYY-MM-DDTHH:MM:SS] [vmhbaX] Dropped Frames (262144 of 165 bytes) on C0:T0:L0 cmd:0x8a
[YYYY-MM-DDTHH:MM:SS] [vmhbaX] Dropped Frames (131072 of 165 bytes) on C0:T0:L0 cmd:0x8a
  • CRC errors may be reported in the output of the following command:
esxcli storage san fc stats get

 Adapter: vmhba1
   Tx Frames: 6167181
   Rx Frames: 170996477
   Lip Count: 0
   Error Frames: 0
   Dumped Frames: 0
   Link Failure Count: 1
   Loss of Signal Count: 0
   PrimSeq Protocol Err Count: 0
   Invalid Tx Word Count: 335988552
   Invalid CRC Count: 10620
   Input Requests: 2302279
   Output Requests: 19885
   Control Requests: 399796
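
When many hosts or adapters are involved, the counters above can be checked programmatically. The following is a minimal sketch that parses saved text output of `esxcli storage san fc stats get` and reports the error-related counters that are nonzero. The helper names and the choice of which counters count as error counters are assumptions made for illustration, based only on the field names in the sample output.

```python
# Counters treated as error indicators (an assumption based on the field names above).
ERROR_COUNTERS = (
    "Error Frames",
    "Dumped Frames",
    "Link Failure Count",
    "Loss of Signal Count",
    "PrimSeq Protocol Err Count",
    "Invalid Tx Word Count",
    "Invalid CRC Count",
)

# Sample output copied from this article.
SAMPLE_OUTPUT = """\
 Adapter: vmhba1
   Tx Frames: 6167181
   Rx Frames: 170996477
   Lip Count: 0
   Error Frames: 0
   Dumped Frames: 0
   Link Failure Count: 1
   Loss of Signal Count: 0
   PrimSeq Protocol Err Count: 0
   Invalid Tx Word Count: 335988552
   Invalid CRC Count: 10620
   Input Requests: 2302279
   Output Requests: 19885
   Control Requests: 399796
"""


def parse_fc_stats(text):
    """Parse 'Key: value' lines into {adapter: {counter: int}} (hypothetical helper)."""
    adapters = {}
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # blank line or a line without a key/value pair
        key, value = key.strip(), value.strip()
        if key == "Adapter":
            current = adapters.setdefault(value, {})
        elif current is not None and value.isdigit():
            current[key] = int(value)
    return adapters


def nonzero_error_counters(adapters):
    """Keep only the error counters whose value is greater than zero."""
    return {
        hba: {name: stats[name] for name in ERROR_COUNTERS if stats.get(name, 0) > 0}
        for hba, stats in adapters.items()
    }


if __name__ == "__main__":
    print(nonzero_error_counters(parse_fc_stats(SAMPLE_OUTPUT)))
```

A single snapshot only shows totals. Taking two snapshots some minutes apart and comparing them indicates whether the Invalid CRC Count and Invalid Tx Word Count values are still increasing.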
 

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

As the message states, this indicates a problem on the physical Fibre Channel link (fiber-optic cable) or on the SFP of the FC switch port connected to this host.

FRAME DROP events suggest a possible connectivity issue or congestion. A faulty FC port can also cause this issue.

Host status H:0x7: This status is returned when a device has been reset due to a storage initiator error. It typically occurs because of outdated HBA firmware or, rarely, a failing HBA.
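
The H:, D:, and P: values in the ScsiDeviceIO lines are the host, device, and plugin status codes, and decoding them makes the failures easier to group. The sketch below is a partial decoder. Its host status names are an assumption taken from the commonly documented ESXi host status codes, its device status names follow the standard SCSI status codes, and both tables are deliberately incomplete. The function name `decode_status` is hypothetical.

```python
import re

STATUS_RE = re.compile(
    r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)"
)

# Partial table; names assumed from VMware's public host status code listings.
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x5: "ABORT",
    0x7: "ERROR",
    0x8: "RESET",
}

# Partial table of standard SCSI device status codes.
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Sample lines copied from the log excerpt in this article.
SAMPLE_LINES = [
    '[YYYY-MM-DDTHH:MM:SS] cpu2:2098292)ScsiDeviceIO: 4124: Cmd(0x45b981710948) 0x28, '
    'CmdSN 0x22a2 from world 2100337 to dev "naa.62#############0001" failed '
    'H:0x7 D:0x28 P:0x0',
    '[YYYY-MM-DDTHH:MM:SS] cpu52:2099716)ScsiDeviceIO: 4115: Cmd(0x45baf4f44648) 0x28, '
    'CmdSN 0xbd from world 11753952 to dev "naa.6000###############2d" failed '
    'H:0x2 D:0x0 P:0x0',
]


def decode_status(line):
    """Decode the H/D/P triplet of one log line, or return None (hypothetical helper)."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    host, device, plugin = (int(group, 16) for group in match.groups())
    return {
        "host": (host, HOST_STATUS.get(host, "unknown")),
        "device": (device, DEVICE_STATUS.get(device, "unknown")),
        "plugin": plugin,
    }


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        print(decode_status(line))
```

Codes missing from the tables are reported as "unknown" rather than guessed, so the tables can be extended from the vendor documentation as needed.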

Dropped frames and CRC errors are critical Fibre Channel issues that typically stem from faulty cables, SFPs, switch port instability, or degraded transceivers. These conditions can cause I/O disruptions, path flapping, or degraded VM performance.

Resolution

Engage with the fabric/hardware/storage vendor to investigate and mitigate the dropped frames in the environment.