NVMe drive on vSAN cluster disk management may show unhealthy/failed

Article ID: 415843

Updated On:

Products

VMware vSAN

Issue/Introduction

vSAN health may report an unhealthy disk in a vSAN disk group/disk pool.

You may see one or more NVMe drives in an error state under vSAN disk management in the vCenter UI.

Environment

  • VMware vSAN 7.x
  • VMware vSAN 8.x
  • VMware vSAN 9.x

Cause

  • The issue occurs due to an underlying hardware problem with the NVMe device.
  • To verify this, check /var/run/log/vmkernel.log on the ESXi host and look for "(state: 9 CONTROLLER_STATE_FAILED)".
  • Example log event:

vmkernel: cpu44:2097718)NVMEPSA:1345 taskMgmt:abort cmdId.initiator=0x4309ec4bc1c0 CmdSN 0x3b65426 world:0 controller 265 state:9 nsid:1 <== controller state is CONTROLLER_STATE_FAILED
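
The log check above can be sketched as a small Python helper for scanning a copy of vmkernel.log. This is an illustrative sketch, not a VMware tool: the function names are hypothetical, and the patterns are assumptions based only on the sample event above (a line mentioning NVMe or CONTROLLER_STATE_FAILED that contains "state:9" or "state: 9"). Other builds may format the line differently.

```python
import re

# Assumed patterns, derived from the sample event in this article.
STATE_FAILED = re.compile(r"state:\s*9\b")           # "state:9" or "state: 9", but not "state:90"
CONTROLLER_ID = re.compile(r"controller\s+(\d+)")    # e.g. "controller 265"


def find_failed_controllers(lines):
    """Return (controller_id_or_None, line) for each line that reports state 9."""
    hits = []
    for raw in lines:
        line = raw.rstrip("\n")
        # Only consider NVMe-related lines to avoid unrelated "state:9" text.
        if "nvme" not in line.lower() and "CONTROLLER_STATE_FAILED" not in line:
            continue
        if not STATE_FAILED.search(line):
            continue
        match = CONTROLLER_ID.search(line)
        hits.append((int(match.group(1)) if match else None, line))
    return hits


def scan_file(path):
    """Scan a log file, e.g. a copy of /var/run/log/vmkernel.log."""
    with open(path, errors="replace") as fh:
        return find_failed_controllers(fh)
```

Calling scan_file() on a copy of the log returns the matching lines together with the controller number where one is present, which makes it easier to collect the snippets for the hardware vendor.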

Resolution

This is a hardware fault and requires investigation by the hardware vendor. Engage the hardware vendor and provide the relevant log snippets from vmkernel.log.