ESXi host loses access to FC storage following upgrade of ESXi host.

Article ID: 426237

Products

VMware vSphere ESXi

Issue/Introduction

  • An ESXi host is upgraded/patched.

  • After the reboot that completes the upgrade/patch, the ESXi host no longer detects the FC LUNs mapped to it, and the datastores are not accessible.

  • A subsequent reboot may resolve the issue.

Environment

VMware vSphere ESXi (all versions)

Cause

This can occur when the appropriate driver claims the FC HBA, but the storage scan fails to detect the remote storage ports.

For example, in the following sample logging, the qedf driver claims the HBA:

/var/log/vmkernel.log:
vmkernel: cpu89:2098150)Loading module qedf ...
vmkernel: cpu89:2098150)Mod: 4852: Initialization of qedf succeeded with module ID 61.
vmkernel: cpu89:2098150)qedf loaded successfully.
...
vmkernel: cpu67:2098190)qedf:(26:0.2):qedfc_link_update_handler:1958:Info: ST(LINK): LINK_UP<-
vmkernel: cpu89:2098150)qedf:(26:0.2):qedfc_attachDevice:5030:Info: ST(FUNC): READY
...
vmkernel: cpu89:2098150)Device: 362: qedf:driver->ops.attachDevice :448 ms
vmkernel: cpu89:2098150)Device: 368: Found driver qedf for device 0x################
vmkernel: cpu89:2098150)QEDFC: Inside qedfc_startDevice: 5731 function
vmkernel: cpu89:2098150)qedf:(26:0.2):qedfc_startDevice:5746:Info: exit
vmkernel: cpu89:2098150)Device: 637: qedf:driver->ops.startDevice:0 ms
vmkernel: cpu89:2098150)qedf:(26:0.2):qedfc_scanDevice:5277:Info: entered
vmkernel: cpu89:2098150)qedf:(26:0.2):qedfc_scanDevice:5292:Info: Scan Start
vmkernel: cpu89:2098150)qedf:(26:0.2):qedfc_scanDevice:5382:Info: exit
vmkernel: cpu89:2098150)Device: 459: qedf:driver->ops.scanDevice:0 ms

However, no further logging is recorded to indicate that storage ports were detected.

Sample expected logging when storage ports are detected:

/var/log/vmkernel.log:
vmkwarning: cpu43:2098359)WARNING: qedf:vmhba65:qedfc_rport_event_handler:1213: Setting dev_loss_timeout value to 20 seconds

...
vmkernel: cpu43:2098359)qedf:vmhba65:qedfc_queue_scsi_scan:4083:Info: C_ID[0x0]:P_ID[0x160180]:T_ID[0]
vmkernel: cpu43:2098359)qedf:vmhba65:qedfc_rport_event_handler:1247:Info: num_ofld_sess(func) = 1, num_ofld_sess(shost) = 1
vmkernel: cpu43:2098359)qedf:vmhba65:qedfc_alloc_conn_id:855:Info: INACTIVE, C_ID[0x1]:P_ID[0xc0000]:T_ID[1]
vmkwarning: cpu43:2098359)WARNING: qedf:vmhba65:qedfc_rport_event_handler:1213: Setting dev_loss_timeout value to 20 seconds
vmkernel: cpu43:2098359)qedf:vmhba65:qedfc_offload_connection:960:Info: Offloading connection host_no=1, portid=0c0000


(The above is sample logging only. Logging will vary depending on the driver and the specific underlying cause of the issue.)
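
The log comparison above can be sketched as a small Python check. This is a minimal sketch, not a supported tool: the function name scan_without_ports is hypothetical, and the patterns are taken only from the qedf sample logging in this article, so they are assumptions that will not match other drivers or other qedf versions. It reports whether a "Scan Start" entry is logged with no later remote port activity.

```python
import re

# Patterns derived from the sample qedf logging in this article (assumption:
# other drivers and driver versions use different function names and messages).
SCAN_START = re.compile(r"qedf:.*qedfc_scanDevice:\d+:Info: Scan Start")
PORT_ACTIVITY = re.compile(
    r"qedf:.*(qedfc_rport_event_handler|qedfc_queue_scsi_scan|qedfc_offload_connection)"
)


def scan_without_ports(lines):
    """Return True if a qedf scan start is logged and no later line shows
    remote port activity. Return False if no scan start is seen at all."""
    scan_seen = False
    for line in lines:
        if SCAN_START.search(line):
            scan_seen = True
        elif scan_seen and PORT_ACTIVITY.search(line):
            # Remote port activity after the scan: ports were detected.
            return False
    return scan_seen
```

For example, the function can be given the lines of a copy of /var/log/vmkernel.log. Because the log may span several boots, restrict the input to the boot under investigation before drawing conclusions.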

Resolution

  • Confirm that the HBA driver and firmware are supported as per the Broadcom HCL and are at the levels recommended by the hardware vendor. Update as required.

  • Investigate the storage array and fabric for any temporary issues and remediate as required.

  • If the above does not resolve the issue, please open a case with Broadcom support for further investigation.
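
As an aid to the first step above, the following Python sketch extracts the FC adapters and their drivers from the saved output of `esxcli storage core adapter list`. It is illustrative only: the function names are hypothetical, and the parsing assumes the usual column order (HBA Name, Driver, Link State, UID) and that FC adapter UIDs begin with "fc.". Verify both assumptions against the actual output on the host.

```python
def parse_adapter_list(text):
    """Parse saved `esxcli storage core adapter list` output.

    Assumption: whitespace-separated columns in the order
    HBA Name, Driver, Link State, UID, followed by free-form columns.
    """
    adapters = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(fields) < 4 or not fields[0].startswith("vmhba"):
            continue
        adapters.append({
            "hba": fields[0],
            "driver": fields[1],
            "link_state": fields[2],
            "uid": fields[3],
        })
    return adapters


def fc_adapters(adapters):
    """Keep only adapters whose UID marks them as Fibre Channel (assumed 'fc.' prefix)."""
    return [a for a in adapters if a["uid"].startswith("fc.")]
```

The driver name reported for each FC adapter can then be looked up in the output of `esxcli software vib list` to find the installed driver version for comparison against the HCL.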