ESXi host with Cisco HBA driver fails with the error: "FCPIO_DATA_CNT_MISMATCH"

Article ID: 340039

Products

  • VMware vSphere ESXi
  • VMware vSphere ESXi 7.0
  • VMware vSphere ESXi 8.0

Issue/Introduction

  • The ESXi host with a Cisco VIC HBA becomes unresponsive.
  • The hostd process is unresponsive.
  • LUNs are disconnected, and the host fails to boot using FC/FCoE.
  • VM replication fails when an external storage replication tool is used.
  • VMs intermittently hang with a black screen.
  • On the affected ESXi host, /var/run/log/vmkwarning.log contains errors similar to the following:
    YYYY-MM-DDThh:mm:ss.msZ cpu28:33310)WARNING: LinScsi: SCSILinuxAbortCommands:1890: Failed, Driver fnic, for vmhba4
    YYYY-MM-DDThh:mm:ss.msZ cpu27:47509)WARNING: LinScsi: SCSILinuxProcessCompletions:826: Error BytesXferred > Requested Length Marking transfer length as 0 - vmhba = vmhba4, Driver Name = fnic, Requested length = 512, Resid = ######
    YYYY-MM-DDThh:mm:ss.msZ cpu27:47509)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.##############" state in doubt; requested fast path state update...
    YYYY-MM-DDThh:mm:ss.msZ cpu17:32785)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.##############" state in doubt; requested fast path state update...
    YYYY-MM-DDThh:mm:ss.msZ cpu3:40875)WARNING: VMotion: 1884: 1489551966631519 S: Waited 119.995 seconds for the monitor to process a preCopyNext action. This may cause unexpected vMotion failures.
    YYYY-MM-DDThh:mm:ss.msZ cpu25:33529)WARNING: LinScsi: SCSILinuxProcessCompletions:826: Error BytesXferred > Requested Length Marking transfer length as 0 - vmhba = vmhba4, Driver Name = fnic, Requested length = 4096, Resid = #######
    YYYY-MM-DDThh:mm:ss.msZ cpu25:33529)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.##############" state in doubt; requested fast path state update...
    YYYY-MM-DDThh:mm:ss.msZ cpu46:33310)WARNING: LinScsi: SCSILinuxAbortCommands:1890: Failed, Driver fnic, for vmhba4
    YYYY-MM-DDThh:mm:ss.msZ Wa(180) vmkwarning: cpu63:2103150)WARNING: nfnic: <#>: fnic_fcpio_icmnd_cmpl_handler: 1964: sc: 0x45da6044e2c0 tag: 0x69a hdr status: FCPIO_DATA_CNT_MISMATCH IO failure!
    YYYY-MM-DDThh:mm:ss.msZ Wa(180) vmkwarning: cpu53:2097877)WARNING: nfnic: <#>: fnic_fcpio_icmnd_cmpl_handler: 1964: sc: 0x45da612f9440 tag: 0x646 hdr status: FCPIO_DATA_CNT_MISMATCH IO failure!
  • On the affected ESXi host, /var/run/log/vmkernel.log contains errors similar to the following:
    YYYY-MM-DDThh:mm:ss.msZ cpu32:33284)<#>fnic : 4 :: hdr status = FCPIO_DATA_CNT_MISMATCH
    YYYY-MM-DDThh:mm:ss.msZ cpu24:33308)WARNING: LinScsi: SCSILinuxProcessCompletions:826: Error BytesXferred > Requested Length Marking transfer length as 0 - vmhba = vmhba4, Driver Name = fnic, Requested length = 512, Resid = ######
    YYYY-MM-DDThh:mm:ss.msZ cpu24:33308)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x2a (0x439e58573580, 32830) to dev "naa.##############" on path "vmhba4:C#:T#:L##" Failed: H:0x7 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
    YYYY-MM-DDThh:mm:ss.msZ cpu24:33308)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.##############" state in doubt; requested fast path state update...
    YYYY-MM-DDThh:mm:ss.msZ cpu24:33308)ScsiDeviceIO: 2613: Cmd(0x439e58573580) 0x2a, CmdSN 0x677 from world 32830 to dev "naa.##############" failed H:0x7 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
    YYYY-MM-DDThh:mm:ss.msZ cpu32:33284)<#>fnic : 4 :: hdr status = FCPIO_DATA_CNT_MISMATCH
  • Backup software API timeout errors (HTTP curl timeouts)
  • Third-party backup failures with fallback to full backups
  • Cohesity backup operations interrupted with incremental snapshot chain breaks
  • Backup job completion with warnings rather than complete failures
  • vSphere API call timeouts during snapshot operations
  • Extended backup windows due to forced full backups
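
The log signatures listed above can be tallied with a short script to see how often they occur and which adapter is named. The following is a minimal sketch, not a supported tool: the function name summarize and the choice of three signatures are assumptions made for illustration, and the signature strings are copied from the excerpts in this article. It only counts matching lines; it does not diagnose the fabric.

```python
import re
from collections import Counter

# Signature strings copied from the log excerpts in this article.
MISMATCH = "FCPIO_DATA_CNT_MISMATCH"
OVERRUN = "BytesXferred > Requested Length"
ABORT_FAILED = "SCSILinuxAbortCommands"
SIGNATURES = (MISMATCH, OVERRUN, ABORT_FAILED)

# Matches adapter names such as "vmhba4" anywhere in a line.
VMHBA_RE = re.compile(r"\bvmhba\d+\b")


def summarize(lines):
    """Return (per-signature counts, per-adapter counts) for matching lines.

    'lines' is any iterable of strings, for example an open log file.
    A line is attributed to an adapter only if it names one; the
    FCPIO_DATA_CNT_MISMATCH lines shown in this article do not.
    """
    by_signature = Counter()
    by_adapter = Counter()
    for line in lines:
        hits = [sig for sig in SIGNATURES if sig in line]
        if not hits:
            continue  # Not one of the signatures of interest.
        by_signature.update(hits)
        match = VMHBA_RE.search(line)
        if match:
            by_adapter[match.group(0)] += 1
    return by_signature, by_adapter
```

As a usage example, passing an open handle for a copy of /var/run/log/vmkernel.log or /var/run/log/vmkwarning.log to summarize returns two counters. A count concentrated on a single vmhba may help narrow down which HBA port and uplink to inspect first.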

Environment

  • VMware vSphere ESXi 7.x
  • VMware vSphere ESXi 8.x

Cause

The Cisco NFNIC driver logs this message when not all of the expected data is transferred. This typically means that out-of-order frames were received from the array target, which should not happen and usually points to a physical-layer problem (for example, a bad SFP or low light levels) on the path between the host and the array.

Resolution

This is not a Broadcom VMware issue. Because the FCPIO_DATA_CNT_MISMATCH error is typically associated with a physical-layer issue (bad SFP, low light levels, etc.), review both the fabric switches and the Cisco Fabric Interconnect (FI) for transmit errors.

Contact Cisco support for further assistance. Note: This issue was fixed in 3.1(2b), which was released in the fall of 2016.