ESXi hosts lose connectivity to NVMe over TCP datastores after storage controller IP change

Article ID: 441069

Products

VMware vSphere ESX 8.x

Issue/Introduction

  • ESXi hosts in a cluster lose access to VMFS datastores backed by NVMe over TCP storage after maintenance on the storage array in which the controller IP addresses are changed.

  • All affected datastores report an All Paths Down (APD) state, resulting in an outage for all hosted VMs.

  • The NVMe controllers are reported as offline (Is Online is false):
     [root@hostname:~] esxcli nvme controller list

    Name                                                                                                                          Controller Number  Adapter  Transport Type  Is Online  Controller Type  Is VVOL  Keep Alive Timeout  IO Queue Number  IO Queue Size
    ----------------------------------------------------------------------------------------------------------------------------  -----------------  -------  --------------  ---------  ---------------  -------  ------------------  ---------------  -------------
    nqn.2010-06.com.######:#####:######vmhbaX#<target_IP>:4420                   260  vmhbaX  TCP                 false  I/O                false                  10                1           32
    nqn.2010-06.com.######:#####:######vmhbaY#<target_IP>:4420                   271  vmhbaY  TCP                 false  I/O                false                  10                1           32
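
When many controllers are listed, the offline ones can be filtered out of the output. The following is a minimal sketch, not part of the original article: the function name list_offline_controllers is hypothetical, and it assumes the default column layout shown above (field 2 = Controller Number, field 3 = Adapter, field 4 = Transport Type, field 5 = Is Online) and a Name value that contains no spaces.

```shell
# Hypothetical helper: reads `esxcli nvme controller list` output on stdin and
# prints "<adapter> <controller number>" for each NVMe/TCP controller whose
# "Is Online" column is false.
# Assumption: whitespace-separated columns in the default layout shown above.
list_offline_controllers() {
    # Header and separator lines are skipped because their 4th field is not "TCP".
    awk '$4 == "TCP" && $5 == "false" { print $3, $2 }'
}
```

On a host it could be used as: esxcli nvme controller list | list_offline_controllers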

Environment

VMware vSphere ESXi 8.x

Cause

When storage controller IP addresses are changed on the storage array, ESXi hosts may not automatically update their fabric connection details. Instead, the hosts often remain bound to stale paths associated with the old IP addresses.
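
To confirm that a host is still pointing at the old addresses, the target address of each controller can be read from the end of its Name value and compared with the new IP addresses configured on the array. This is a rough sketch under assumptions: the function name list_controller_targets is hypothetical, and it assumes that the Name column ends in #<target_IP>:<port>, as in the sample output in this article.

```shell
# Hypothetical helper: reads `esxcli nvme controller list` output on stdin and
# prints "<adapter> <target_IP>:<port>" for each NVMe/TCP controller.
# Assumption: the Name column ends in "#<target_IP>:<port>".
list_controller_targets() {
    awk '$4 == "TCP" {
        n = split($1, part, "#")   # split the Name column on "#"
        print $3, part[n]          # the last piece is the address the host uses
    }'
}
```

On a host it could be used as: esxcli nvme controller list | list_controller_targets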

Resolution

To restore connectivity, the storage controller details must be manually updated to reflect the new IP addresses.

1. Access the ESXi host via SSH and identify the affected controllers:
esxcli nvme controller list

2. Disconnect the hung fabric sessions:
esxcli nvme fabrics disconnect -a <Adapter_Name> -s <Subsystem_NQN>

3. Update the controller IP details:

  • Via vSphere Client: Navigate to Host > Configure > Storage > Storage Adapters. Select the NVMe over TCP adapter, go to the Controllers tab, and manually add or update the storage IP address and port (default 4420).

  • Via ESXi CLI: If the vSphere Client is unavailable, connect to the new storage IP address manually:
    esxcli nvme fabrics connect -a <Adapter_Name> -i <target_IP> -p 4420 -s <Subsystem_NQN>

4. Rescan the storage adapters:
esxcli storage core adapter rescan --all

5. Verify that the storage controllers appear online:

 [root@hostname:~] esxcli nvme controller list

Name                                                                                                                          Controller Number  Adapter  Transport Type  Is Online  Controller Type  Is VVOL  Keep Alive Timeout  IO Queue Number  IO Queue Size
----------------------------------------------------------------------------------------------------------------------------  -----------------  -------  --------------  ---------  ---------------  -------  ------------------  ---------------  -------------
nqn.2010-06.com.######:#####:######vmhbaX#<target_IP>:4420                   260  vmhbaX  TCP                  true  I/O                false                  10                1           32
nqn.2010-06.com.######:#####:######vmhbaY#<target_IP>:4420                   271  vmhbaY  TCP                  true  I/O                false                  10                1           32
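
When several adapters or hosts need the same treatment, it can help to generate the command sequence from steps 2 to 4 and review it before running anything. The sketch below only prints the commands and does not execute them. The function name print_reconnect_commands is hypothetical, and the adapter, NQN, and IP values in the usage line are placeholders, not values from a real environment.

```shell
# Hypothetical helper: prints (does not run) the disconnect, connect, and
# rescan commands from the steps above for one adapter/subsystem pair.
# Arguments: <adapter> <subsystem NQN> <new target IP> [port, default 4420]
print_reconnect_commands() {
    if [ "$#" -lt 3 ]; then
        echo "usage: print_reconnect_commands <adapter> <nqn> <new_ip> [port]" >&2
        return 1
    fi
    adapter=$1
    nqn=$2
    new_ip=$3
    port=${4:-4420}   # 4420 is the default port used in this article

    echo "esxcli nvme fabrics disconnect -a $adapter -s $nqn"
    echo "esxcli nvme fabrics connect -a $adapter -i $new_ip -p $port -s $nqn"
    echo "esxcli storage core adapter rescan --all"
}
```

Example: print_reconnect_commands vmhba65 nqn.example:subsystem 192.0.2.20
Review the printed commands, then run them on the host in the order shown.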