vSAN disk goes offline frequently and vSAN disk error is seen in vSAN Skyline Health

Article ID: 395246

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • The following error event is seen on the vSphere Client "Lost connectivity to storage device naa.################. Path vmhba#:C#:T#:L# is down."
    Where "naa.################" is a vSAN storage device.

  • The vSAN disk group appears to be unhealthy.
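
When many of these events are collected, it can help to pull the device ID and path components out of the message text. The following is a minimal sketch, assuming the event wording quoted in the first symptom and the usual vmhba#:C#:T#:L# layout (adapter, channel, target, LUN). The function name and the sample device ID in the usage note are hypothetical.

```python
import re

# Matches the event text quoted in the symptom above.
EVENT_RE = re.compile(
    r"Lost connectivity to storage device (?P<device>naa\.[0-9a-fA-F]+)\. "
    r"Path (?P<adapter>vmhba\d+):C(?P<channel>\d+):T(?P<target>\d+):L(?P<lun>\d+) is down"
)


def parse_lost_connectivity(message):
    """Return the device and path parts of the event, or None if it does not match."""
    m = EVENT_RE.search(message)
    if m is None:
        return None
    return {
        "device": m.group("device"),
        "adapter": m.group("adapter"),
        "channel": int(m.group("channel")),
        "target": int(m.group("target")),
        "lun": int(m.group("lun")),
    }
```

For example, a message naming the made-up device naa.5000c500a1b2c3d4 on path vmhba2:C0:T3:L0 would yield adapter "vmhba2", channel 0, target 3 and LUN 0.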

Environment

VMware vSAN 7.x

Cause

  • vmkernel logs on the ESXi host with the affected disk show I/O errors for the disk and the NO_CONNECT SCSI host status:

    YYYY-MM-DDTHH:MM:SS.SSSZ cpu24:2101735)WARNING: ScsiDeviceIO: 12146: READ CAPACITY on device "naa.################" from Plugin "HPP" failed. I/O error
    YYYY-MM-DDTHH:MM:SS.SSSZ cpu47:2097290)ScsiDeviceIO: 4167: Cmd(0x45c940f52dc8) 0x25, CmdSN 0x47c2d8 from world 0 to dev "naa.################" failed H:0x1 D:0x0 P:0x0

  • The NO_CONNECT host status (H:0x1) is what produces the lost-connectivity event.

  • NO_CONNECT is returned when the disk has been removed or has a fault.

  • The SCSI sense data "0xb 0x44 0x0" in the vmkernel log for the faulty device (sense key 0xb, ABORTED COMMAND; ASC/ASCQ 0x44/0x0) indicates an "Internal Target Failure":

    YYYY-MM-DDTHH:MM:SS.SSSZ cpu41:2098370)ScsiDeviceIO: 4115: Cmd(0x45c940e989c8) 0x28, CmdSN 0x969 from world 0 to dev "naa.################" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xb 0x44 0x0

  • Together, these errors indicate a hardware issue with the physical disk.
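
The status fields in the log lines above can also be read programmatically. The following is a minimal sketch, assuming the "failed H:… D:… P:…" layout shown in this article, with optional "Valid sense data". The lookup tables cover only the codes discussed here and are not a complete SCSI reference. The function and table names are hypothetical.

```python
import re

# Matches the tail of a ScsiDeviceIO failure line as shown in this article.
STATUS_RE = re.compile(
    r'to dev "(?P<device>[^"]+)" failed '
    r"H:(?P<host>0x[0-9a-fA-F]+) D:(?P<dev>0x[0-9a-fA-F]+) P:(?P<plugin>0x[0-9a-fA-F]+)"
    r"(?: Valid sense data: (?P<key>0x[0-9a-fA-F]+) "
    r"(?P<asc>0x[0-9a-fA-F]+) (?P<ascq>0x[0-9a-fA-F]+))?"
)

# Partial tables: only the values that appear in this article.
HOST_STATUS = {0x0: "OK", 0x1: "NO_CONNECT"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEY = {0xB: "ABORTED COMMAND"}
ASC_ASCQ = {(0x44, 0x00): "INTERNAL TARGET FAILURE"}


def decode_scsi_failure(line):
    """Decode host/device status and sense data from one log line, or return None."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    host = int(m.group("host"), 16)
    dev = int(m.group("dev"), 16)
    result = {
        "device": m.group("device"),
        "host_status": HOST_STATUS.get(host, hex(host)),
        "device_status": DEVICE_STATUS.get(dev, hex(dev)),
        "sense": None,
    }
    if m.group("key") is not None:
        key = int(m.group("key"), 16)
        asc = int(m.group("asc"), 16)
        ascq = int(m.group("ascq"), 16)
        result["sense"] = {
            "key": SENSE_KEY.get(key, hex(key)),
            "asc_ascq": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
        }
    return result
```

Applied to the two ScsiDeviceIO lines quoted above, the first decodes to host status NO_CONNECT with no sense data. The second decodes to device status CHECK CONDITION with sense key ABORTED COMMAND and INTERNAL TARGET FAILURE. Codes outside the tables are returned as hex strings rather than guessed.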

Resolution

To resolve this issue:

Additional Information