vSAN NVMe disk report read only critical warning

Article ID: 417677

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • vSAN reports a critical health warning for the NVMe disk after the disk goes into a read-only state.
  • Unable to unmount or remove the disk from the disk group.

 

Environment

VMware vSAN 8.x

Cause

This is a genuine hardware issue: the disk is marked read-only at the hardware level. vSAN fetches the device's SMART statistics and reports the disk as READ-ONLY. vSAN's role is limited to reporting this read-only health warning for the NVMe disk.
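
As a rough illustration of what a read-only indication in SMART data looks like, the sketch below decodes the Critical Warning byte of the NVMe SMART / Health Information log page, where bit 3 signals that the media has been placed in read-only mode. It assumes that the vSAN warning corresponds to this bit, which this article does not state explicitly. The function names and the sample value are hypothetical and are not taken from the logs in this article.

```python
# Bit meanings of the Critical Warning byte (byte 0) of the NVMe
# SMART / Health Information log page, per the NVMe base specification.
CRITICAL_WARNING_BITS = {
    0: "available spare capacity below threshold",
    1: "temperature outside threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
    5: "persistent memory region read-only or unreliable",
}


def decode_critical_warning(value):
    """Return the descriptions of all bits set in the critical warning byte."""
    return [text for bit, text in CRITICAL_WARNING_BITS.items()
            if value & (1 << bit)]


def is_read_only(value):
    """True when bit 3 (media placed in read-only mode) is set."""
    return bool(value & 0x08)


if __name__ == "__main__":
    sample = 0x08  # hypothetical value, not taken from this article's logs
    print(decode_critical_warning(sample))
    print(is_read_only(sample))
```

A value with bit 3 set would be consistent with the read-only warning described here; other bits point to different hardware conditions and would need their own investigation.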
 
  • I/O started to time out and PSA began aborting the I/O:
2025-09-26T00:19:20.795Z In(182) vmkernel: cpu4:2097865)NvmeDeviceIO: 1865: Start TSC for CmdSN 2a9f9c17 is 7053679477 ms
2025-09-26T00:19:20.795Z In(182) vmkernel: cpu4:2097865)NVMEPSA:1345 taskMgmt:abort cmdId.initiator=0x430a9de63c00 CmdSN 0x2a9f9c17 world:0 controller 258 state:5 nsid:1
2025-09-26T00:19:20.795Z In(182) vmkernel: cpu4:2097865)NVMEIO:3974 Ctlr 258, ns 1, tmReq 0x4316232c19c0, type 1, initiator 0x430a9de63c00, sn 0x2a9f9c17, world id 0.

 
  • The NVMe abort command itself also got stuck, so abort processing was escalated to a controller reset:
2025-09-26T00:19:26.797Z In(182) vmkernel: cpu54:2098009)NVMEIO:4623 Ctlr 258, abort commands stuck, escalate to controller reset
2025-09-26T00:19:26.797Z In(182) vmkernel: cpu54:2098009)NVMEDEV:8245 Resetting controller 258 (nqn.2019-10.com.kioxia:##########:##########)
2025-09-26T00:19:26.797Z In(182) vmkernel: cpu54:2098009)NVMEDEV:8260 Controller 258 state changed from 5 to 8(INRESET)
2025-09-26T00:19:26.797Z Wa(180) vmkwarning: cpu8:2098010)WARNING: NVMEIO:4346 Controller 258 in state 8 or in recovery mode, bail out.

  • Controller admin queue reset event:
2025-09-26T00:19:26.807Z In(182) vmkernel: cpu56:2098009)NVMEDEV:7678 controller 258
2025-09-26T00:19:26.807Z In(182) vmkernel: cpu56:2098009)NVMEDEV:8007 Reset admin queue (controller 258)

  • APD is sent and a virtual reset (VIRT RESET) is issued to clear any outstanding I/Os:
2025-09-26T00:19:26.857Z Wa(180) vmkwarning: cpu66:2097836)WARNING: NvmeDeviceIO: 1737: Command 0x2 to device "eui.###############################" marked for PDL virtual reset completed with  abort/reset: cmdId.initiator=0x430a9de63c00 cmdId
 
  • LSOM event indicating the disk has gone offline:
2025-09-26T00:19:26.867Z Wa(180) vmkwarning: cpu72:2099387)WARNING: LSOM: LSOMEventNotify:9026: vSAN device 52f3a548-####-####-####-############ has gone offline.
 
  • Device not ready event:
2025-09-26T00:20:01.863Z Wa(180) vmkwarning: cpu91:29543019)WARNING: NvmeUtil: 151: Error on Cmd(0x45be9d295580) 0x82, CmdSN 0x1712f0 from world 0 to component "eui.#############################"  H:0x0 D:0x1 P:0x0
2025-09-26T00:20:01.863Z In(182) vmkernel: cpu4:2098720)NvmeDeviceIO: 4004: SED command failed for device eui.############################# with H:0x0 D:0x1 P:0x0
 
  • The device came back online at this time:
2025-09-26T00:20:02.891Z Wa(180) vmkwarning: cpu52:2097833)WARNING: PLOG: PLOGProcessHotpluggedDevice:10884: vSAN device 52f3a548-####-####-####-########### has come online.

 

  • The NVMe disk reported a critical warning indicating that the disk has become read-only:
2025-09-26T00:25:57.565Z In(14) vobd[2097955]:  [vSANCorrelator] 7054076246846us: [vob.vsan.lsom.readonlynvmediskhealthcriticalwarning] NVMe critical health warning for disk eui.############################# is: The disk has become read-only.
 
  • The NVMe device returned status 0x182, indicating the controller rejected an attempted write to a region marked read-only (device-side read-only range). This is a device-level error reported by the NVMe controller/firmware:
 
2025-09-26T03:03:34.807Z Wa(180) vmkwarning: cpu14:2098283)WARNING: HPP: HppNvmeThrottleLogForDevice:600: NVMe Cmd 0x1 (0x45ba422db740, 2101445) to dev "eui.#############################" on path "vmhba4:C0:T0:L0" Failed:
2025-09-26T03:03:34.807Z Wa(180) vmkwarning: cpu14:2098283)WARNING: HPP: HppNvmeThrottleLogForDevice:608: Error status H:0x0 D:0x182 P:0x0 hppAction = 1
2025-09-26T03:03:34.807Z Wa(180) vmkwarning: cpu14:2098283)WARNING: NvmeUtil: 151: Error on Cmd(0x45ba422db740) 0x1, CmdSN 0x1 from world 2101445 to component "eui.#############################"  H:0x0 D:0x182 P:0x0
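
The sequence in the log excerpts above can be condensed into a timeline. The following is a minimal sketch that scans vmkernel-style lines for substrings copied from those excerpts and reports the seconds elapsed since the first matched event. The labels, the name build_timeline, and the substring-matching approach are illustrative assumptions and not a VMware tool; message text may differ between ESXi builds.

```python
import re
from datetime import datetime

# Substrings copied from the log excerpts in this article, with short labels.
MARKERS = [
    ("taskMgmt:abort", "I/O abort issued"),
    ("escalate to controller reset", "abort stuck, controller reset"),
    ("has gone offline", "vSAN device offline"),
    ("has come online", "vSAN device online"),
    ("readonlynvmediskhealthcriticalwarning", "read-only critical warning"),
    ("D:0x182", "write rejected with status 0x182"),
]

# Leading ISO-8601 timestamp with millisecond precision, as in the excerpts.
TIMESTAMP = re.compile(r"^\s*(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z")


def build_timeline(lines):
    """Return (seconds since first matched event, label) per matching line."""
    events = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        for needle, label in MARKERS:
            if needle in line:
                events.append((when, label))
                break  # one label per line
    if not events:
        return []
    start = events[0][0]
    return [((when - start).total_seconds(), label) for when, label in events]


# Two lines taken from the excerpts above.
SAMPLE = [
    "2025-09-26T00:19:20.795Z In(182) vmkernel: cpu4:2097865)NVMEPSA:1345 "
    "taskMgmt:abort cmdId.initiator=0x430a9de63c00 CmdSN 0x2a9f9c17 world:0 "
    "controller 258 state:5 nsid:1",
    "2025-09-26T00:19:26.797Z In(182) vmkernel: cpu54:2098009)NVMEIO:4623 "
    "Ctlr 258, abort commands stuck, escalate to controller reset",
]

if __name__ == "__main__":
    for offset, label in build_timeline(SAMPLE):
        print(f"+{offset:9.3f}s  {label}")
```

Lines that repeat the same marker (for example, several entries carrying D:0x182) each produce their own timeline entry, so the output may need de-duplication on a noisy log.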
 
Status 0x182 is an NVMe device status meaning that the device refused a write to a read-only range. This typically occurs when the drive firmware temporarily marks parts of the namespace read-only due to internal media errors, protection of failing blocks, or controller-level safeguards. Because the error originates at the device controller (H:0x0, D:0x182, P:0x0 in the VMkernel logs), it is not a host driver, path, or network issue; it is caused by the NVMe device or its firmware.
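
To illustrate how the D: value breaks down, the sketch below extracts the H/D/P triplet from a log line and splits the device status into an NVMe Status Code Type (SCT) and Status Code (SC). It assumes the D: field carries the NVMe status as (SCT << 8) | SC, which matches the interpretation given here but is not documented in this article. The function names and the one-entry lookup table are hypothetical.

```python
import re

# Only the status discussed in this article is listed. In the NVM command
# set, SCT 0x1 (command specific) with SC 0x82 is
# "Attempted Write to Read Only Range".
KNOWN_STATUS = {
    (0x1, 0x82): "Attempted Write to Read Only Range",
}

HDP = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")


def parse_hdp(line):
    """Return (host, device, plugin) status as integers, or None if absent."""
    match = HDP.search(line)
    if match is None:
        return None
    return tuple(int(group, 16) for group in match.groups())


def split_status(device_status):
    """Split a device status into (SCT, SC).

    Assumption: D = (SCT << 8) | SC, with SCT being a 3-bit field.
    """
    return (device_status >> 8) & 0x7, device_status & 0xFF


def describe(line):
    """Summarize the device status found in a log line."""
    triplet = parse_hdp(line)
    if triplet is None:
        return "no H/D/P status found"
    sct, sc = split_status(triplet[1])
    meaning = KNOWN_STATUS.get((sct, sc), "not in this sketch's table")
    return f"SCT 0x{sct:x}, SC 0x{sc:02x}: {meaning}"


# Status portion of one of the HPP warnings shown in this article.
SAMPLE_LINE = (
    "WARNING: HPP: HppNvmeThrottleLogForDevice:608: "
    "Error status H:0x0 D:0x182 P:0x0 hppAction = 1"
)

if __name__ == "__main__":
    print(describe(SAMPLE_LINE))
```

A host status (H) and plugin status (P) of zero alongside a non-zero device status (D) is what points to the device rather than the host stack as the source of the error.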
 

Resolution

Engage the hardware vendor to investigate the device's read-only transition and health status in detail, and replace the disk if the vendor recommends it.