Unable to configure NVMe over TCP storage for New Site

Article ID: 428442


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

You see the following error when attempting to add an NVMe over Fabrics storage controller.

The task "Discover NVMe over Fabrics controllers" fails with the error "An error occurred during host configuration. Operation failed, diagnostics report: Unable to discover"

 

Environment

ESXi 7.x

ESXi 8.x

ESX 9.x

Cause

A network configuration issue is preventing the host from connecting to the array's NVMe controllers. Typical symptoms include:
  • ESXi hosts are unable to vmkping the NVMe-oF (TCP or RDMA) controller IP addresses on a storage array.

  • Connectivity fails with both standard (1500) and jumbo (9000) frame sizes.

  • NVMe Discovery service fails to return any targets.

  • The status of the NVMe-oF adapter in the vSphere Client shows as "Online", but no paths are discovered.

Resolution

1. Validate Physical and Virtual Link State

Ensure the physical path is active before testing the logical stack.

  • Verify the physical uplink (vmnic) is "Up" and negotiated at the correct speed: esxcli network nic get -n vmnicX

  • Confirm the VMkernel port (vmk) is associated with the correct Virtual Switch and physical uplink.

  • Check the ARP table to see if the ESXi host has learned the MAC address of the storage array: esxcli network ip neighbor list

    Note: If the MAC address is "Incomplete" or missing, the issue is at Layer 2 (VLAN tagging, cabling, or switch port configuration).
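
A minimal check sequence, assuming the uplink is vmnic2 and the NVMe-oF VMkernel port is vmk1 (placeholder names; substitute your own):

    esxcli network nic get -n vmnic2       # Link Status should be "Up" at the expected negotiated speed
    esxcli network ip interface list       # Confirms which vSwitch and port group vmk1 is attached to
    esxcli network ip neighbor list        # The array controller's MAC should be listed; "Incomplete" indicates a Layer 2 problem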

 

2. Confirm MTU Consistency

NVMe-oF is highly sensitive to MTU mismatches. Even if "Jumbo Frames" are enabled, a single device in the path (Host, Switch, or Array) at 1500 MTU will drop packets.

Perform a "Do Not Fragment" ping test to find the failure point:

  • Standard (1500 MTU): vmkping -I vmkX -d -s 1472 <Array_IP>

  • Jumbo (9000 MTU): vmkping -I vmkX -d -s 8972 <Array_IP>
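
The -s values leave room for the 28 bytes of IP and ICMP headers (20 + 8), so the frame on the wire is exactly 1500 or 9000 bytes. For example, with vmk1 as the NVMe-oF VMkernel port and 192.168.100.50 as a hypothetical array controller IP:

    vmkping -I vmk1 -d -s 1472 192.168.100.50   # 1472-byte payload + 28 bytes of headers = 1500-byte frame
    vmkping -I vmk1 -d -s 8972 192.168.100.50   # 8972-byte payload + 28 bytes of headers = 9000-byte frame
    # If the 1472-byte test succeeds but the 8972-byte test fails, a device in the path is not set for jumbo frames.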

 

3. Check Port Connectivity Using netcat (nc) (Recommended)

This is the most reliable way to check if a specific port is open on the storage array from a specific VMkernel interface.

For NVMe Discovery Service (Port 8009): nc -i vmkX -z <Array_IP> 8009

For NVMe-over-TCP Traffic (Port 4420): nc -i vmkX -z <Array_IP> 4420
  • Success: Connection to <Array_IP> 8009 port [tcp/*] succeeded!

  • Failure: Connection refused or Connection timed out.
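
For example, against a hypothetical array controller at 192.168.100.50 from VMkernel interface vmk1:

    nc -i vmk1 -z 192.168.100.50 8009    # Discovery service port
    nc -i vmk1 -z 192.168.100.50 4420    # NVMe/TCP data port
    # "succeeded!" means the port is reachable; "refused" or "timed out" points to a firewall, routing, or VLAN issue.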

 

4. NVMe Specific Configuration

Unlike iSCSI targets, some NVMe controllers do not respond to ICMP pings until a valid NQN (NVMe Qualified Name) handshake or discovery attempt has been initiated, or while the host is not yet mapped on the array side.

  • Retrieve Host NQN: esxcli nvme info get

  • Ensure this NQN is whitelisted/registered on the storage array.

  • Verify the NVMe-TCP or RDMA modules are loaded: esxcli system module list | grep nvme
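
A quick verification pass; the NQN string and adapter name in the comments are only illustrations:

    esxcli nvme info get                      # Prints the Host NQN, e.g. nqn.2014-08.com.vmware:nvme:<hostname>
    esxcli system module list | grep nvme     # The NVMe/TCP (nvmetcp) or NVMe/RDMA (nvmerdma) module should show as loaded and enabled
    esxcli nvme adapter list                  # The software NVMe-oF adapter (vmhbaXX) should be listed with its transport type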

 

5. RDMA / RoCE v2 Requirements (If applicable)

If using NVMe-over-RDMA, standard pings may fail if the RDMA fabric is not "Lossless."

  • Ensure Priority Flow Control (PFC) or Global Pause is configured on all physical switches.

  • Check if the RDMA device is active: esxcli network nic rdma device list
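
As an additional check, the RDMA device state can also be listed directly (a sketch assuming ESXi 7.x or later):

    esxcli rdma device list    # vmrdmaX devices should show a state of "Active" and be paired with the expected vmnic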

 

Expected Result

Once these steps are completed and verified, the NVMe Controllers/Drives will appear in the vSphere Client under Storage Adapters or as selectable PCI Devices.

Running the command esxcli nvme controller list should now return a list of active controllers, and the paths to the NVMe namespaces (LUNs) will be present and "Active" in the storage path view.

IF STEPS FAIL: Engage your networking administrator to investigate potential firewall blocks or switch-level VLAN mismatches.

 

 

Additional Information

 

  • Port 8009: Ensure the NVMe-oF Discovery service port (default 8009) is not blocked by physical firewalls between the host and array.

  • VLAN Tagging: If using a tagged VLAN, ensure the VMkernel port group has the correct VLAN ID set and the physical switch port is in "Trunk" mode (see the check below).
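
To confirm the VLAN ID on a standard vSwitch, list the port groups (for a Distributed Switch, check the distributed port group's VLAN setting in the vSphere Client instead):

    esxcli network vswitch standard portgroup list    # Shows each port group with its assigned VLAN ID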

Helpful Commands

 

| Action | Command | Expected Result |
|---|---|---|
| Verify Port | nc -i vmkX -z <Array_IP> 8009 | Connection succeeded |
| Verify MTU | vmkping -I vmkX -d -s 8972 <IP> and vmkping -I vmkX -d -s 1472 <IP> | 0% packet loss at both frame sizes |
| Verify Session | esxcli network ip connection list | ESTABLISHED |
| Discovery | esxcli nvme fabric discovery add | Manually triggers a discovery pull from the array |
| Controller | esxcli nvme controller list | The host sees the array's controllers |
| Device | esxcli nvme namespace list | The presented volumes (namespaces) are listed |