ESXi Host-Side Configuration and Cleanup for Dell Non-Disruptive Migration (NDM)

Article ID: 434332


Products

VMware vSphere ESXi

Issue/Introduction

Symptoms

Customers utilizing Dell PowerEdge servers configured for Boot from SAN (BFS) via Fibre Channel or iSCSI may encounter boot failures, "Host Not Responding" states, or management-layer timeouts when migrating OS boot volumes to new Dell storage arrays using the Non-Disruptive Migration (NDM) feature.

Potential risks during the transition include:

  • "Boot Device Not Found" errors if the server BIOS/UEFI fails to hand off to the new target WWN/LUN ID.

  • ESXi failing to initialize the hostd or vpxa services if the Network Address Authority (NAA) identifier changes unexpectedly.

  • Significant boot-time delays caused by the kernel attempting to scan retired legacy storage paths.


Purpose

This article provides a validated technical procedure for ensuring host recoverability and boot persistence on the ESXi side when performing a storage-layer migration using Dell proprietary NDM processes.

Environment

Product: VMware vSphere ESXi

Version: 6.x, 7.x, 8.x

Hardware: Dell PowerEdge Servers

Cause

While Dell NDM is designed to be non-disruptive to data traffic, a Boot LUN migration involves a hardware-level dependency. The server's Host Bus Adapter (HBA) must recognize the new array's paths during the initial Power-On Self-Test (POST). Any discrepancy in the LUN ID (which must remain LUN 0) or the WWN presentation can result in a boot failure.

Furthermore, ESXi identifies boot partitions by UUID. If the NDM process results in an "All Paths Down" (APD) condition before new paths are committed, the management agents may fail to initialize.

Resolution

1. Pre-Migration Safeguard: Configuration Backup

Before starting the NDM process, generate a configuration backup. This acts as a "Safety Net," allowing for a rapid host restore should the physical boot device become corrupted or inaccessible during the migration.

  1. Log in to the ESXi Shell or SSH as root.

  2. Run the following command: vim-cmd hostsvc/firmware/backup_config

  3. The command returns a URL containing a * wildcard. Replace the * with the host's management IP address, then open the URL in a web browser to download the .tgz bundle.

  4. Store this bundle in a secure, off-host location.
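The wildcard substitution in step 3 can be sketched as a small shell snippet. The download path, hostname, and IP address below are hypothetical placeholders, not values returned by a real host:

```shell
# Backup URL as returned by 'vim-cmd hostsvc/firmware/backup_config';
# the path segment and bundle name here are example placeholders.
BACKUP_URL='http://*/downloads/123456/configBundle-esx01.example.com.tgz'
MGMT_IP='192.0.2.10'   # host's management IP (example value)

# Replace the '*' wildcard with the management IP before downloading.
RESOLVED_URL="${BACKUP_URL/\*/$MGMT_IP}"
echo "$RESOLVED_URL"

# Download to an off-host location, e.g.:
#   wget "$RESOLVED_URL" -O /backups/esx01-configBundle.tgz
```

Keeping the bundle off-host matters: if the boot LUN is corrupted mid-migration, the bundle is restored with vim-cmd hostsvc/firmware/restore_config after a fresh ESXi install of the same version.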

2. Migration Phase: Identity and LUN Alignment

Coordinate with Dell Storage Support to utilize the "Native Migration" workflow. Ensure the following parameters are maintained to minimize BIOS/UEFI reconfiguration:

  • Identity Retention: Use the -move_identity flag to preserve volume attributes and WWNs.

  • LUN ID Consistency: The target volume must retain LUN ID 0.
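As a quick sanity check after cutover, confirm the boot device still presents as LUN 0. The sketch below parses a hypothetical fragment of `esxcli storage core path list` output; on a real host, run the command itself and inspect the LUN field:

```shell
# Hypothetical excerpt of 'esxcli storage core path list' output for the
# boot device (device ID and runtime name are example values).
PATH_INFO='   Runtime Name: vmhba1:C0:T0:L0
   Device: naa.600009700bcbb70e2443007300000077
   LUN: 0
   State: active'

# Verify the LUN field reports ID 0; anything else risks a POST boot failure.
if printf '%s\n' "$PATH_INFO" | grep -q '^   LUN: 0$'; then
  echo "Boot LUN ID 0 preserved"
else
  echo "WARNING: boot device is not presented as LUN 0" >&2
fi
```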

3. Post-Migration: Path Management and Sanitization

Once the migration reaches the "Synchronized" state and the cutover is initiated, the host must be updated to reflect the new storage topology.

  1. Discovery: Perform a manual storage rescan to detect the new endpoints: esxcli storage core adapter rescan --all

  2. Identify Dead Paths: Locate the NAA ID of the legacy (source) array disks that are no longer in use: esxcli storage core device list | grep -i "off" -B 5

  3. Sanitization: Detach the "Dead" or "Off" legacy devices so the kernel no longer attempts to claim them, preventing long-term management timeouts and slow boot times: esxcli storage core device set --state=off -d <Old_NAA_ID>

  4. Hardware Verification: If a reboot is required, enter the Dell F2 System Setup to verify the UEFI Boot Manager is pointing to the correct new HBA/Target path.
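The sanitization loop in steps 2-3 can be scripted as a dry run that prints each detach command for review before anything is executed. The NAA IDs below are hypothetical examples, not real device identifiers:

```shell
# Hypothetical legacy (source-array) device IDs identified in step 2.
OLD_DEVICES='naa.60000970000194900000533030334142
naa.60000970000194900000533030334143'

# Dry run: print each detach command so it can be reviewed, then executed
# manually (or piped to 'sh' once verified on the host).
for DEV in $OLD_DEVICES; do
  echo "esxcli storage core device set --state=off -d $DEV"
done
```

Reviewing the generated commands before execution guards against detaching the wrong device, since an active boot device set to "off" would reproduce the exact APD failure this procedure is meant to prevent.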

Additional Information