VMware vSphere ESXi 8
VMware vSphere ESX 9
Dropped frames cause high VM latency.
Frame drops can cause significant disruption in a Fibre Channel environment: affected commands are dropped or aborted and have to be re-issued.
Dropped frames are a critical issue in Fibre Channel fabrics, and they are almost always caused by a hardware problem somewhere along the connection path.
Below are some of the common reasons why an environment may encounter dropped frames
In VMware ESXi environments, dropped frames on the storage fabric are recorded in the host's /var/log/vmkernel.log file.
Because different hardware utilizes different drivers, the exact format of these warning messages will vary depending on the storage driver in use.
Below are examples of how dropped frame events are reported across common generic and vendor-specific drivers.
I/O Device Management (IODM)
The generic IODM layer will log a warning when it observes multiple frame drop events within a short timeframe.
####-##-##T##:##:##.###Z Wa(180) vmkwarning: cpu21:2098060)WARNING: iodm: vmk_IodmEvent:191: vmhba#: FRAME DROP event has been observed 20 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.
QLogic Native Fibre Channel Driver (qlnativefc)
####-##-##T##:##:##.###Z In(182) vmkernel: cpu128:2098242)qlnativefc: vmhba#(28:0.0): qlnativefcStatusEntry:1927:(#:#) Dropped frame(s) detected (266240 of 262144 bytes).
####-##-##T##:##:##.###Z In(182) vmkernel: cpu128:2098242)qlnativefc: vmhba#(28:0.0): qlnativefcStatusEntry:2076:C0:T#:L# - FCP command status: 0x15-0x202 (0x7) portid=###### oxid=###### cdb=2a0075 len=262144 rspInfo=0x0 resid=0x0 fwResid=0x41000 host status = 0x7 device $
####-##-##T##:##:##.###Z In(182) vmkernel: cpu128:2098242)qlnativefc: vmhba#(28:0.0): qlnativefcStatusEntry:1927:(#:#) Dropped frame(s) detected (266240 of 262144 bytes).
Emulex LightPulse Fibre Channel Driver (lpfc)
####-##-##T##:##:##.###Z In(182) vmkernel: cpu3:2097907)lpfc: lpfc_rportStats:4871: 1:(0) Compression log for fcp target 0, path is ok, FRAME: drops=345, under=0, over=0
####-##-##T##:##:##.###Z In(182) vmkernel: cpu5:2097907)lpfc: lpfc_rportStats:4871: 1:(0) Compression log for fcp target 0, path is ok, FRAME: drops=349, under=0, over=0
####-##-##T##:##:##.###Z In(182) vmkernel: cpu26:2097907)lpfc: lpfc_rportStats:4871: 1:(0) Compression log for fcp target 0, path is ok, FRAME: drops=350, under=0, over=0
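Because the lpfc `drops=` value in the samples above is a running counter (345, 349, 350), the growth between successive lines shows whether frames are still being dropped. The following is a minimal sketch, not a VMware-provided tool: the function name `lpfc_drop_deltas` is hypothetical, it assumes the log lines keep the layout shown above, and it treats the `1:(0)` token before "Compression" as an opaque identifier.

```shell
# Sketch: report the growth of the lpfc "drops=" counter between successive
# log lines. Reads vmkernel.log-style text on stdin.
# lpfc_drop_deltas is a hypothetical helper name, not an ESXi command.
lpfc_drop_deltas() {
    awk '
    /lpfc_rportStats/ && /fcp target / && /drops=/ {
        # Value that follows "drops=".
        n = $0
        sub(/.*drops=/, "", n)
        sub(/[^0-9].*/, "", n)
        # Target number that follows "fcp target ".
        t = $0
        sub(/.*fcp target /, "", t)
        sub(/[^0-9].*/, "", t)
        # Token between the function:line prefix and "Compression",
        # kept as an opaque identifier (assumption about the layout).
        k = $0
        sub(/.*lpfc_rportStats:[0-9]+: /, "", k)
        sub(/ Compression.*/, "", k)
        key = k " target " t
        if (key in last)
            printf "%s: drops=%d (+%d)\n", key, n, n - last[key]
        else
            printf "%s: drops=%d (first sample)\n", key, n
        last[key] = n
    }'
}
```

A possible invocation on the host is `grep lpfc_rportStats /var/log/vmkernel.log | lpfc_drop_deltas`. A delta that stays at zero over time suggests the drops have stopped, while a steadily growing counter suggests the fault is still present.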
localcli storage san fc events get
FC Event Log
------------
####-##-## ##:##:##.### [vmhba1] LINK UP
####-##-## ##:##:##.### [vmhba2] Dropped frames (57344 of 805 bytes) on C0:T0:L9 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (106496 of 805 bytes) on C0:T1:L9 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (86016 of 805 bytes) on C0:T0:L9 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (102400 of 805 bytes) on C0:T1:L9 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (96256 of 805 bytes) on C0:T0:L9 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (35392 of 805 bytes) on C0:T0:L3 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (91648 of 805 bytes) on C0:T1:L3 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (29440 of 805 bytes) on C0:T0:L3 cmd:0x28
####-##-## ##:##:##.### [vmhba2] Dropped frames (19008 of 805 bytes) on C0:T1:L3 cmd:0x28
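When the FC event log is long, it can help to tally the dropped-frame events per adapter and per device address to see whether one target or LUN stands out. The sketch below assumes the event lines keep the layout shown above (`[vmhbaN]` followed later by `on C#:T#:L#`); `summarize_fc_drops` is a hypothetical helper name.

```shell
# Sketch: count "Dropped frames" events per adapter and C:T:L address.
# Reads the output of "localcli storage san fc events get" on stdin.
# summarize_fc_drops is a hypothetical helper name, not an ESXi command.
summarize_fc_drops() {
    awk '
    /Dropped frames/ {
        hba = ""; dev = ""
        for (i = 1; i <= NF; i++) {
            # "[vmhba2]" -> "vmhba2"
            if ($i ~ /^\[vmhba[0-9]+\]$/) hba = substr($i, 2, length($i) - 2)
            # The device address is the word that follows "on".
            if ($i == "on" && i < NF) dev = $(i + 1)
        }
        if (hba != "" && dev != "") count[hba ":" dev]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}
```

A possible invocation is `localcli storage san fc events get | summarize_fc_drops`. Each output line is a count followed by `vmhbaN:C#:T#:L#`, with the highest count first.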
# To isolate which adapter is reporting dropped frames
grep Dropped /var/log/vmkernel.log | awk '{print $3}' | sort | uniq -c
352 vmhba#(##:#.#):
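The adapter name does not always sit in the same whitespace-separated field: in the sample lines earlier in this article, which begin with a timestamp, `In(182)` and `vmkernel:`, it is the fifth field rather than the third. As a hedged alternative, the sketch below scans each matching line for the first field that starts with `vmhba` instead of relying on a fixed position. `count_dropped_by_adapter` is a hypothetical helper name.

```shell
# Sketch: count "Dropped" log lines per adapter without assuming a fixed
# field position. Reads vmkernel.log-style text on stdin.
# count_dropped_by_adapter is a hypothetical helper name.
count_dropped_by_adapter() {
    awk '
    /Dropped/ {
        for (i = 1; i <= NF; i++) {
            # Print the first field that begins with an adapter name.
            if ($i ~ /^vmhba[0-9]+/) { print $i; break }
        }
    }' | sort | uniq -c
}
```

A possible invocation is `count_dropped_by_adapter < /var/log/vmkernel.log`. The output has the same shape as the one-liner above: a count followed by the adapter token.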
Immediate Stabilization & Troubleshooting Steps:
Disable Affected HBA Paths (If dropped frames are isolated to a single HBA):
If redundancy is available and the issue is isolated to a single Host Bus Adapter (HBA), you may disable the paths that utilize it.
To disable all paths on the affected HBA:
localcli storage core path list | grep "Runtime Name:" | grep vmhba# | awk '{print $3}' | while read line; do localcli storage core path set --path "$line" --state off; done
To disable a single path:
localcli storage core path set -p <vmhba#:C#:T#:L#> --state off
Once the environment has been stabilized, thoroughly review the HBA, the fabric, and the backend storage array to identify the source of the dropped frames.
After the hardware issue is remedied, reconfigure the connection to the datastore to ensure it has redundant paths, mitigating future single points of failure.
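Before disabling paths in bulk, it may be worth previewing exactly which commands would run. The sketch below is a dry run under stated assumptions: it expects `Runtime Name: vmhbaN:C#:T#:L#` lines on stdin, as implied by the `awk '{print $3}'` step in the path-disable command, and it only prints commands without executing them. It also matches the adapter name up to the first colon, so `vmhba1` does not also select `vmhba10`, which a plain substring `grep` would. `plan_path_disable` is a hypothetical helper name.

```shell
# Sketch: print (do not run) the path-disable commands for one adapter.
# Reads "localcli storage core path list" output on stdin.
# Usage: plan_path_disable vmhba2
# plan_path_disable is a hypothetical helper name, not an ESXi command.
plan_path_disable() {
    awk -v hba="$1" '
    /Runtime Name:/ && index($3, hba ":") == 1 {
        # $3 is the runtime name, e.g. vmhba2:C0:T0:L9
        print "localcli storage core path set --path " $3 " --state off"
    }'
}
```

A possible invocation is `localcli storage core path list | plan_path_disable vmhba2`. After reviewing the printed list and confirming that every affected device still has a working path through another adapter, the commands can be run individually.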