Hardware Compatibility Validation failed during VCF BRINGUP - Unable to validate vSAN ESA disk

Article ID: 409336

Updated On:

Products

VMware Cloud Foundation

Issue/Introduction

  • During a greenfield deployment of VMware Cloud Foundation (VCF) 5.2 using Cloud Builder, a validation failure occurs when the compatibility of the vSAN ESA disks is checked.

  • Although the disks are explicitly listed as compatible in the Broadcom vSAN compatibility guide and are included in the most recent HCL JSON file, the deployment halts with a compatibility validation error.

  • The log file /var/log/vmware/vcf/bringup/vcf-bringup-debug.log shows repeated debug entries similar to:

    YYYY-MM-DDTHH:MM:SS [bringup,...] DEBUG [c.v.e.s.c.v.util.ResponseUtil,...] Build validation response: {"errorCode":"VSAN_ESA_DISK_AVAILABILITY_VALIDATION.error","message":"vSAN ESA Disk Availability Validation Failed"}

    This error message indicates a failure in the vSAN ESA Disk Availability Validation stage of the bring-up process.

  • To diagnose further, run esxcli nvme adapter list to verify which NVMe adapters the ESXi host currently recognizes. For instance:

    Adapter  Adapter Qualified Name                                                                                                           Transport Type  Driver     Associated Devices
    -------  -------------------------------------------------------------------------------------------------------------------------------  --------------  ---------  ------------------
    vmhba1   aqn:nvme_pcie:nqn.2018-07.com.marvell:nvme:nvm-subsystem-<device_serial_number>      -mn-HPE NS204i-u Gen11 Boot Controller      -16  PCIe            nvme_pcie

    This output shows vmhba1 functioning as the NVMe adapter managing NVMe devices such as the HPE NS204i-u Gen11 Boot Controller.

  • However, listing all SCSI devices attached to the ESXi host with esxcfg-scsidevs -A shows that all NVMe disks are claimed by vmhba0 instead of vmhba1. (To review the full list of storage devices, run esxcli storage core device list.) For example:

    vmhba0      t10.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba0      eui.<unique device identifier>
    vmhba1      t10.NVMe____HPE_NS204i2Du_Gen11_Boot_Controller______<device_serial_number>

  • This misallocation is critical because the NVMe disks, which should be claimed by the NVMe adapter vmhba1, are instead managed by a different SCSI adapter vmhba0.
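
The comparison described above can be scripted. The following is a minimal sketch, not a supported tool: it assumes the text output of esxcli nvme adapter list and esxcfg-scsidevs -A has already been captured from the host, and that each line of the latter has the "adapter, device identifier" layout shown above. The function names and the sample strings are hypothetical and used only for illustration. Any device it reports is only a candidate; whether it is actually an NVMe disk must still be confirmed on the host.

```python
import re
from collections import defaultdict


def nvme_adapter_names(nvme_adapter_list_output):
    """Collect adapter names (e.g. 'vmhba1') from `esxcli nvme adapter list` text."""
    names = set()
    for line in nvme_adapter_list_output.splitlines():
        # Data rows start with the adapter name; header and ruler rows do not.
        match = re.match(r"\s*(vmhba\d+)\s", line)
        if match:
            names.add(match.group(1))
    return names


def devices_by_adapter(scsidevs_output):
    """Group device identifiers by adapter from `esxcfg-scsidevs -A` text."""
    grouped = defaultdict(list)
    for line in scsidevs_output.splitlines():
        fields = line.split(None, 1)  # adapter name, then the rest of the line
        if len(fields) == 2 and fields[0].startswith("vmhba"):
            grouped[fields[0]].append(fields[1].strip())
    return dict(grouped)


def devices_outside_nvme_adapters(nvme_adapter_list_output, scsidevs_output):
    """Return devices claimed by adapters that are not NVMe adapters."""
    nvme_adapters = nvme_adapter_names(nvme_adapter_list_output)
    return {
        adapter: devices
        for adapter, devices in devices_by_adapter(scsidevs_output).items()
        if adapter not in nvme_adapters
    }


# Hypothetical sample text, shaped like the outputs shown in this article.
SAMPLE_NVME_ADAPTER_LIST = """\
Adapter  Adapter Qualified Name  Transport Type  Driver     Associated Devices
-------  ----------------------  --------------  ---------  ------------------
vmhba1   aqn:nvme_pcie:example   PCIe            nvme_pcie
"""

SAMPLE_SCSIDEVS = """\
vmhba0      eui.example0001
vmhba0      eui.example0002
vmhba1      t10.NVMe____example_boot_controller
"""

if __name__ == "__main__":
    suspects = devices_outside_nvme_adapters(SAMPLE_NVME_ADAPTER_LIST, SAMPLE_SCSIDEVS)
    for adapter, devices in suspects.items():
        print(adapter, devices)
```

If the disks intended for vSAN ESA appear in the result under an adapter such as vmhba0, that matches the symptom described in this article.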

Environment

VCF 5.2.x

Cause

NVMe devices are attached to "Tri-mode controllers," a configuration not supported by vSAN ESA.

Resolution

  • NVMe devices are supported only when connected directly to a PCIe slot on the bus.

  • NVMe devices attached to a Tri-mode controller are NOT a vSAN supported configuration. This applies to storage controllers in RAID, HBA, Pass-through, or JBOD mode.

Note: Beyond being unsupported, this configuration has been shown to cause performance problems, especially when it limits each NVMe drive to a single PCI-Express lane. While some Ready Nodes have used PCI-Express switches or expansion devices, these are commonly not needed on modern CPU architectures.

Refer: vSAN Design Guide

Additional Information