Some vSAN iSCSI LUNs might not be discovered on initiators after resolving vSAN network issues.

Article ID: 326644

Products

VMware vSAN

Issue/Introduction

Symptoms:
When the vSAN network encounters issues, there might be iSCSI target owner transfer events that don't get handled cleanly between hosts. After the network issue is resolved, some of the iSCSI LUNs might not be discovered from the initiator side.

You see the errors "Failed to add LUN" and "Failed to reload PR state from disk for LUN xxx (Lock was not free)". As a result, the affected LUNs are inaccessible.

In the vitd logs on the new target owner host, you see:
2021-05-29T04:15:24Z vitd[2501389]: VITD: Thread-0xc4630d7700 kernel_complete_open: Adding LUN 0 (iqn.1998-01.com.vmware:lnvs1009z32-ozecl156,L,0x0000000000000000) of target openstack-vol-6935b1a9-f3c2-46bc-95c8-4d4eac613dcb (mask:0xb)to kernel. <== Start to handle the opened LUN into kernel.
...
2021-05-29T04:15:47Z vitd[2501389]: VITD: Received invalid handles. DiskHandle: 0, FSS Handle: 18446744073709551615, User Space handle 0
2021-05-29T04:15:48Z vitd[2501389]: VITD: Thread-0xc4630d7700 LUN creation error: VitFSSBEIoctl_Create: LUN configuration error, see vmkernel.log for details
2021-05-29T04:15:48Z vitd[2501389]: VITD: Thread-0xc4630d7700 kernel_complete_open: Failed adding LUN 0(CTL_ID -1) of tgt openstack-vol-6935b1a9-f3c2-46bc-95c8-4d4eac613dcb to kernel, status: 0.: Success

In vmkernel.log on the new target owner host, you see:
2021-05-29T04:15:40.005Z cpu76:2501172)DLX: 4960: vol 'd4248060-883f-c15b-6966-1c34da7d75b2', lock at 122421248: [Req mode: 1] Not free:
2021-05-29T04:15:40.005Z cpu76:2501172)[type 10c00001 offset 122421248 v 2, hb offset 3944448
gen 3, mode 1, owner 5fd90df9-c50596c4-07b6-1c34da7d75be mtime 1400046
num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
2021-05-29T04:15:40.005Z cpu76:2501172)WARNING: vit: VitPRInfoFileOpen:565: Open PR info file for LUN 0 failed with Lock was not free
2021-05-29T04:15:40.006Z cpu76:2501172)DLX: 4330: vol 'd4248060-883f-c15b-6966-1c34da7d75b2', lock at 122421248: [Req mode 1] Checking liveness:
2021-05-29T04:15:40.006Z cpu76:2501172)[type 10c00001 offset 122421248 v 2, hb offset 3944448

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
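
To make it easier to tell which LUNs are affected, the two log signatures shown above can be searched for programmatically. The following Python sketch is only an illustration, not a VMware tool: the function name `find_affected_luns` and the regular expressions are assumptions derived from the example log lines in this article, and the exact message format may differ between builds.

```python
import re

# Patterns derived from the example log excerpts in this article.
# They are assumptions about the message format, not an official spec.
VITD_FAILED_ADD = re.compile(
    r"Failed adding LUN (\d+)\(CTL_ID -?\d+\) of tgt (\S+) to kernel"
)
VMK_LOCK_NOT_FREE = re.compile(
    r"Open PR info file for LUN (\d+) failed with Lock was not free"
)


def find_affected_luns(vitd_lines, vmkernel_lines):
    """Return (failed_adds, locked_lun_ids).

    failed_adds    : set of (target_name, lun_id) seen in vitd log lines
    locked_lun_ids : set of lun_id seen in vmkernel log lines
    """
    failed_adds = set()
    for line in vitd_lines:
        match = VITD_FAILED_ADD.search(line)
        if match:
            failed_adds.add((match.group(2), int(match.group(1))))

    locked_lun_ids = set()
    for line in vmkernel_lines:
        match = VMK_LOCK_NOT_FREE.search(line)
        if match:
            locked_lun_ids.add(int(match.group(1)))

    return failed_adds, locked_lun_ids


if __name__ == "__main__":
    # Sample lines copied from the excerpts above.
    vitd_sample = [
        "2021-05-29T04:15:48Z vitd[2501389]: VITD: Thread-0xc4630d7700 "
        "kernel_complete_open: Failed adding LUN 0(CTL_ID -1) of tgt "
        "openstack-vol-6935b1a9-f3c2-46bc-95c8-4d4eac613dcb to kernel, "
        "status: 0.: Success",
    ]
    vmkernel_sample = [
        "2021-05-29T04:15:40.005Z cpu76:2501172)WARNING: vit: "
        "VitPRInfoFileOpen:565: Open PR info file for LUN 0 failed with "
        "Lock was not free",
    ]
    failed, locked = find_affected_luns(vitd_sample, vmkernel_sample)
    for target, lun_id in sorted(failed):
        print(f"target {target}: LUN {lun_id} was not added to the kernel")
    for lun_id in sorted(locked):
        print(f"LUN {lun_id}: PR info file lock was not free")
```

The function takes any iterable of lines, so it can be fed from open log files or from text copied out of a support bundle. A LUN that appears in both result sets matches the symptom described in this article.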

Environment

VMware vSAN 7.0.x

Cause

When the vSAN network encounters issues, iSCSI target owner transfer events might not be handled cleanly between hosts. Temporary network issues can leave the objects that back iSCSI targets and LUNs inaccessible for a period of time. As a result, the iSCSI service might fail to add the LUNs to the data service after repeated retries, and the initiator does not see those LUNs in the response to a LUN list request.

Resolution

After the network returns to normal:

1. Mark the missing LUNs as offline.
2. Mark the same LUNs as online.

Once these actions are completed, the LUNs return to normal.

Proceed to discover the LUNs from the initiator.