Symptoms:
ESXi installation and datastore creation may fail on an NVMe SSD that supports the Compare and Write fused operation.
ESXi shows the error:
“Fail to create VMFS on device”
The kernel log contains error messages for the NVMe Compare command (opcode 0x5) and Write command (opcode 0x1), similar to:
2023-01-19T14:41:53.218Z cpu12:2098924)WARNING: NVMEIO:2624 command 0x45b9c5dfe940 failed: ctlr 256, queue 1, psaCmd 0x45b9c23f9148, status 0xa, > opc 0x5, cid 373, nsid 1
2023-01-19T14:41:53.218Z cpu12:2098924)WARNING: NVMEIO:2624 command 0x45b9c5dfeb00 failed: ctlr 256, queue 1, psaCmd 0x45d9d0d22688, status 0xa, > opc 0x1, cid 374, nsid 1
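To check a collected kernel log for this pattern, the following Python sketch may help. It is only an illustration: the regular expression is inferred from the two sample lines above and is not a documented log format. The function names are hypothetical, and treating consecutive command IDs as a sign of a fused pair is a heuristic assumption.

```python
import re

# Pattern inferred from the two sample lines above (an assumption, not a
# documented format). It tolerates extra characters between "status" and "opc".
LOG_PATTERN = re.compile(
    r"status (?P<status>0x[0-9a-fA-F]+),.*?"
    r"opc (?P<opc>0x[0-9a-fA-F]+), cid (?P<cid>\d+), nsid (?P<nsid>\d+)"
)

# NVM command set opcodes named in this article.
OPCODES = {0x01: "Write", 0x05: "Compare"}


def parse_nvme_failure(line):
    """Return the decoded fields of a failure line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    opcode = int(match.group("opc"), 16)
    return {
        "status": int(match.group("status"), 16),
        "opcode": opcode,
        "command": OPCODES.get(opcode, "unknown"),
        "cid": int(match.group("cid")),
        "nsid": int(match.group("nsid")),
    }


def looks_like_fused_failure(lines):
    """Heuristic: a failed Compare directly followed by a failed Write
    with the next command ID on the same namespace."""
    parsed = [p for p in map(parse_nvme_failure, lines) if p is not None]
    for first, second in zip(parsed, parsed[1:]):
        if (first["command"] == "Compare"
                and second["command"] == "Write"
                and second["cid"] == first["cid"] + 1
                and second["nsid"] == first["nsid"]):
            return True
    return False
```

Passing the two sample lines above to looks_like_fused_failure() returns True. A match is only a hint that the issue described in this article may apply, not a confirmation.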
Description:
An ESXi VMFS datastore uses the atomic test and set (ATS-only) locking mechanism on an NVMe device that supports fused operations. Fused operation support is indicated in the FUSES field of the Identify Controller data structure, which can be obtained with the command:
esxcli nvme controller identify -c <controller_name>
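As a minimal sketch of how the FUSES value can be interpreted: in the NVMe base specification, FUSES occupies bytes 523:522 of the Identify Controller data structure, and bit 0 indicates support for the Compare and Write fused operation. The function names below are hypothetical, and the sketch does not parse esxcli output. It takes either the numeric FUSES value or a raw 4096-byte Identify Controller buffer obtained by other means.

```python
FUSES_OFFSET = 522                   # bytes 523:522 of Identify Controller data
FUSES_COMPARE_AND_WRITE = 1 << 0     # bit 0: Compare and Write fused operation


def supports_compare_and_write(fuses):
    """Return True if the FUSES value advertises Compare and Write support."""
    return bool(fuses & FUSES_COMPARE_AND_WRITE)


def fuses_from_identify(data):
    """Extract FUSES from a raw Identify Controller buffer (little-endian)."""
    if len(data) < FUSES_OFFSET + 2:
        raise ValueError("buffer too short to contain the FUSES field")
    return int.from_bytes(data[FUSES_OFFSET:FUSES_OFFSET + 2], "little")
```

A device that reports FUSES as 0x1 supports the fused operation and can therefore be affected. A device that reports 0x0 does not support it.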
When you install ESXi or create a VMFS datastore on an NVMe device that supports fused operations, the native NVMe driver must issue a Compare and Write fused operation. According to the NVMe specification, the Compare and Write commands must be placed next to each other in the same Submission Queue, and a single Submission Queue Tail doorbell update must cover both commands. The native NVMe driver places the two commands next to each other in the same Submission Queue, but writes the doorbell separately for each command.
As a result, the device firmware might complete the fused commands with an error, and VMFS datastore creation fails. Because creating a VMFS datastore on the device is a prerequisite for a successful ESXi installation, you might not be able to install ESXi on such NVMe devices.
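The difference between the two doorbell patterns can be illustrated with a toy model. This is a sketch under stated assumptions, not the ESXi driver or any real firmware. ToyController is a hypothetical strict device that only pairs fused commands made visible by the same doorbell write. The model ignores queue wrap-around and does not perform the compare itself. The FUSE encoding (command dword 0, bits 9:8) and the generic status code 0xA (command aborted due to missing fused command) come from the NVMe base specification. Whether the status value in the kernel log is that raw status code is an assumption.

```python
from dataclasses import dataclass

FUSE_NORMAL, FUSE_FIRST, FUSE_SECOND = 0b00, 0b01, 0b10  # CDW0 bits 9:8
OPC_WRITE, OPC_COMPARE = 0x01, 0x05
STATUS_OK = 0x0
STATUS_MISSING_FUSED = 0xA  # aborted due to missing fused command


@dataclass
class Command:
    cid: int
    opcode: int
    fuse: int = FUSE_NORMAL


class ToyController:
    """Hypothetical strict firmware: pairs fused commands only within
    the entries exposed by a single doorbell write."""

    def __init__(self):
        self.queue = []        # submission queue entries (no wrap-around)
        self.head = 0          # next entry the controller will fetch
        self.completions = []  # (cid, status) tuples

    def ring_doorbell(self, new_tail):
        batch = self.queue[self.head:new_tail]
        self.head = new_tail
        i = 0
        while i < len(batch):
            cmd = batch[i]
            nxt = batch[i + 1] if i + 1 < len(batch) else None
            if cmd.fuse == FUSE_FIRST and nxt is not None and nxt.fuse == FUSE_SECOND:
                # Both halves arrived together: complete the fused pair.
                self.completions += [(cmd.cid, STATUS_OK), (nxt.cid, STATUS_OK)]
                i += 2
            elif cmd.fuse in (FUSE_FIRST, FUSE_SECOND):
                # Half of a fused pair without its partner in this batch.
                self.completions.append((cmd.cid, STATUS_MISSING_FUSED))
                i += 1
            else:
                self.completions.append((cmd.cid, STATUS_OK))
                i += 1


def submit(controller, commands, one_doorbell):
    """Queue the commands, ringing the doorbell once or once per command."""
    for cmd in commands:
        controller.queue.append(cmd)
        if not one_doorbell:
            controller.ring_doorbell(len(controller.queue))
    if one_doorbell:
        controller.ring_doorbell(len(controller.queue))
    return controller.completions
```

In this model, submitting a Compare (first fused) and a Write (second fused) with one_doorbell=True completes both with status 0x0. With one_doorbell=False, both complete with status 0xA, which resembles the pair of failures in the sample log. Real firmware may be more tolerant, which is consistent with the statement above that the device might, rather than will, return an error.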