NSX V2T Migration Fails with DNS Resolution Error for vCenter During Controller Recovery

Article ID: 409815

Products

VMware NSX

Issue/Introduction

During an NSX V2T migration, the process fails at the "Migrate Edges" phase with controller connectivity timeouts. The rollback also fails with identical errors, leaving the environment in a state that can only be recovered through manual intervention.

The issue occurs when DNS services reside on NSX segments being migrated. During the migration, NSX-V controllers are powered down and new NSX controllers need to be powered on. However, the power-on operation fails because NSX Manager cannot resolve the vCenter Server FQDN, as DNS becomes unavailable due to the migration process itself.

Verify this issue by checking the following log files:

cm.log (located at /var/log/migration-coordinator/v2t/cm.log):

Example log entries showing the issue:

2025-08-30 01:17:46,576 956442 CM.clients.v_utils INFO GM Nsx-V cntlr controller-5 is not running yet, status: CONNECTION_FAILED
2025-08-30 01:19:46,800 956442 CM.plugins.ns_cutover_plugin ERROR GM Timeout while waiting for controllers to get connected. Aborting...
2025-08-30 01:09:45,488 956442 CM.clients.base_client DEBUG GM "statusMessage":"Failed to power on VM <controller-name>: <vcenter-fqdn>: Temporary failure in name resolution. Root Cause: The task failed on VC. For more details, refer to the rootCauseString or the VC logs"
2025-08-30 00:50:21,237 929645 CM.clients.base_client ERROR GM Failed to get https://<nsx-manager-fqdn>/remote/af6745bd-9e13-49b3-b9ff-eb3664c7a4ed/api/v1/logical-switches/11155e6a-f27c-470d-9232-477db9e5a0cd with status: 429 and reason: {"module_name":"common-services","error_message":"Client 'admin' exceeded request rate of 100 per second","error_code":102}

Use grep to find relevant entries:

grep -E "CONNECTION_FAILED|Timeout while waiting|Temporary failure in name resolution|exceeded request rate" /var/log/migration-coordinator/v2t/cm.log

migration-coordinator.log (located at /var/log/migration-coordinator/migration-coordinator.log):

Example log entries:

2025-08-30T01:19:54.837Z  INFO http-nio-127.0.0.1-7450-exec-2 ExecutionMonitorServiceImpl 3673077 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Execution monitor service invoked to react to failure of node EDGE [Rollback failed. Check rollback.log]
2025-08-30T01:19:54.871Z  INFO http-nio-127.0.0.1-7450-exec-2 FacadeInterceptorHelperImpl 3673077 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Caught error in facade interceptor
com.vmware.nsx.management.upgrade.exceptions.UpgradeUnitUpgradeException: null
2025-08-30T01:19:54.871Z  INFO http-nio-127.0.0.1-7450-exec-2 NsxBaseRestController 3673077 SYSTEM [nsx@6876 audit="true" comp="global-manager" level="INFO" subcomp="migration-coordinator"] UserName:'admin' ModuleName:'migration-coordinator' Operation:'POST@/api/v1/migration/plan' Operation status: 'failure' Error: Rollback failed. Check rollback.log file for more details [Reason: [ns_cutover_plugin] Can not proceed with migration: Timeout while waiting for controllers to get connected.].

Use grep to find relevant entries:

grep -E "Rollback failed|UpgradeUnitUpgradeException|ns_cutover_plugin|controllers to get connected" /var/log/migration-coordinator/migration-coordinator.log

rollback.log (located at /var/log/migration-coordinator/v2t/rollback.log):

Example log entries showing rollback errors:

ERRORS:
{'category': 'NSCutover', 'error': '[ns_cutover_plugin] Can not proceed with migration: Timeout while waiting for controllers to get connected.'}

summary.log (located at /var/log/migration-coordinator/v2t/summary.log):

Example log content:

{
    "stage": "revert",
    "status": "error",
    "sub_stage": "prepare-infra",
    "errors": [
        {
            "category": "NSCutover",
            "error": "[ns_cutover_plugin] Can not proceed with migration: Timeout while waiting for controllers to get connected."
        }
    ],
    "iteration": 1
}
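
The four log checks above can be combined into one pass. The following is a minimal sketch, not a supported tool: it assumes a bash shell with access to the log paths listed in this article, and the function names scan_log and scan_all and the LOG_ROOT variable are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: count known failure signatures in each migration-coordinator log.
# LOG_ROOT defaults to the path given in this article; override it to scan
# logs that were copied off the appliance.
LOG_ROOT="${LOG_ROOT:-/var/log/migration-coordinator}"

# Signatures taken from the example log entries in this article.
PATTERN='CONNECTION_FAILED|Timeout while waiting|Temporary failure in name resolution|exceeded request rate|Rollback failed'

scan_log() {
    # Print "<file>: <count> matching line(s)", or note that the file is missing.
    local file="$1"
    if [ -r "$file" ]; then
        printf '%s: %s matching line(s)\n' "$file" "$(grep -cE "$PATTERN" "$file")"
    else
        printf '%s: not found\n' "$file"
    fi
}

scan_all() {
    # Check every log file named in this article.
    local f
    for f in "$LOG_ROOT/v2t/cm.log" \
             "$LOG_ROOT/migration-coordinator.log" \
             "$LOG_ROOT/v2t/rollback.log" \
             "$LOG_ROOT/v2t/summary.log"; do
        scan_log "$f"
    done
}

scan_all
```

A non-zero count in cm.log together with an "error" status in summary.log is consistent with the failure described here, but the matching lines should still be read in context.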

Environment

  • VMware NSX-V (all versions)
  • VMware NSX Data Center (all versions)
  • VMware vCenter Server 7.x and 8.x
  • Configurations where DNS servers are hosted on NSX-managed segments

Cause

DNS services hosted on NSX-V segments create a circular dependency during V2T migration. When the migration process begins, NSX-V controllers are powered down as part of the migration procedure. The migration then attempts to power on new NSX controllers, but this operation requires NSX Manager to communicate with vCenter Server. Since DNS services are hosted on the NSX segments being migrated, DNS resolution becomes unavailable, preventing NSX Manager from resolving the vCenter FQDN. This causes the PowerOnSvm tasks to fail with "Temporary failure in name resolution" errors, resulting in both migration and rollback procedures being unable to complete.
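
Whether an environment is exposed to this circular dependency comes down to whether any DNS server address sits inside a subnet carried by an NSX-V segment. The sketch below illustrates that check for IPv4 only. The operator supplies both lists by hand; nothing in it queries NSX or vCenter, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flag DNS servers whose IPv4 address falls inside a subnet that is
# carried by an NSX-V segment. Both lists are operator-supplied assumptions.

ip_to_int() {
    # Convert a dotted-quad IPv4 address to a 32-bit integer.
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

in_cidr() {
    # Return 0 if IP $1 lies inside CIDR $2 (for example 10.0.0.0/24).
    local ip net bits mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits="${2#*/}"
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

check_dns_placement() {
    # $1: space-separated DNS server IPs; $2: space-separated NSX subnets.
    # Returns 1 if any DNS server is inside any listed subnet.
    local dns cidr risky=0
    for dns in $1; do
        for cidr in $2; do
            if in_cidr "$dns" "$cidr"; then
                echo "AT RISK: DNS server $dns is inside NSX subnet $cidr"
                risky=1
            fi
        done
    done
    return "$risky"
}
```

For example, check_dns_placement "192.168.1.10 10.10.5.20" "10.10.5.0/24" reports the second server as at risk. The addresses are placeholders, not values from a real environment.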

Resolution

Before starting any NSX V2T migration:

  1. Identify DNS and Gateway Dependencies
    • Determine if DNS servers are hosted on NSX-managed segments
    • Verify NSX Manager's default gateway location
    • Confirm DNS server's default gateway location
    • Document all management plane network paths
  2. Relocate Critical Services Before Migration
    • Move DNS servers to non-NSX managed networks:
      a. Deploy temporary DNS servers on the physical network or on VMs attached to non-NSX networks
      b. Update NSX Manager DNS configuration to point to new DNS servers
      c. Update vCenter Server DNS configuration
      d. Verify DNS resolution from NSX Manager console
  3. Ensure Gateway Accessibility
    • Move management gateways to segments that will remain available during migration
    • Configure alternative routing paths for management traffic if needed
  4. Validate Connectivity Before Migration
    • From NSX Manager, test DNS resolution and network reachability of vCenter Server:
      nslookup <vcenter-fqdn>
      ping <vcenter-ip>
    • Verify vCenter API connectivity
    • Confirm all management components can communicate
  5. Proceed with Migration
    • Only start the migration after confirming DNS and management connectivity will remain stable
    • Monitor DNS resolution throughout the migration process
    • Check /var/log/migration-coordinator/v2t/cm.log for controller status during migration
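
The validation in step 4 can be wrapped in a small pre-flight script so that every check is reported in one place. This is a sketch under assumptions: it presumes a bash root shell on NSX Manager with nslookup, ping, and curl available, and the function names check and preflight are illustrative. The curl call only confirms that vCenter answers on HTTPS; it does not authenticate or exercise a specific API.

```shell
#!/usr/bin/env bash
# Sketch: pre-migration connectivity checks from NSX Manager to vCenter.
FAILED=0

check() {
    # Run a command quietly and report PASS or FAIL under the given label.
    local label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $label"
    else
        echo "FAIL: $label"
        FAILED=$((FAILED + 1))
    fi
}

preflight() {
    # $1: vCenter FQDN, $2: vCenter IP address.
    local fqdn="$1" ip="$2"
    FAILED=0
    check "DNS resolution of $fqdn" nslookup "$fqdn"
    check "ICMP reachability of $ip" ping -c 3 -W 2 "$ip"
    # -k skips certificate validation; assumption: only reachability matters here.
    check "HTTPS reachability of $fqdn" \
        curl -k -s -o /dev/null --connect-timeout 5 "https://$fqdn/"
    [ "$FAILED" -eq 0 ]
}
```

Run it as preflight <vcenter-fqdn> <vcenter-ip> and start the migration only if every line reports PASS.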

If the issue has already occurred:

  1. Manually restore DNS services on alternative infrastructure
  2. Power on NSX controllers manually through vCenter if accessible
  3. Re-establish management network connectivity
  4. Retry the rollback operation once DNS is restored
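
Because the rollback fails again if it is retried while name resolution is still down, it can help to wait until the vCenter FQDN resolves before step 4. The loop below is a minimal sketch. wait_for_dns and RESOLVE_CMD are hypothetical names, and nslookup is assumed to be available in the shell in use.

```shell
#!/usr/bin/env bash
# Sketch: block until a name resolves, then let the operator retry rollback.
# RESOLVE_CMD is the lookup command; override it if nslookup is unavailable.
RESOLVE_CMD="${RESOLVE_CMD:-nslookup}"

wait_for_dns() {
    # $1: FQDN, $2: max attempts (default 30), $3: seconds between attempts (default 10).
    local fqdn="$1" attempts="${2:-30}" delay="${3:-10}" i=1
    while [ "$i" -le "$attempts" ]; do
        # RESOLVE_CMD is left unquoted on purpose so it may carry arguments.
        if $RESOLVE_CMD "$fqdn" >/dev/null 2>&1; then
            echo "resolved $fqdn on attempt $i"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "gave up on $fqdn after $attempts attempts"
    return 1
}
```

For example, wait_for_dns <vcenter-fqdn> 30 10 polls for up to five minutes. A successful return only shows that DNS answers again; controller state still needs to be confirmed in cm.log.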

If the error persists after following these steps, contact Broadcom Support for further assistance.

When opening a support request with Broadcom for this issue, provide:

  • Screenshots of errors seen in UI and logs
  • NSX Manager logs from the paths mentioned above
  • Confirmation of whether the DNS servers or the management gateway are hosted on NSX-managed segments involved in the migration (either condition can cause DNS to become unreachable during migration)
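
The four log files named in this article can be gathered into a single archive before opening the request. The following sketch assumes a bash shell with tar available. collect_logs is an illustrative name, and the archive is not a substitute for a full support bundle if Broadcom Support asks for one.

```shell
#!/usr/bin/env bash
# Sketch: archive the migration-coordinator logs referenced in this article.

collect_logs() {
    # $1: log root (for example /var/log/migration-coordinator), $2: output archive.
    local root="$1" out="$2" rel found=""
    for rel in v2t/cm.log migration-coordinator.log v2t/rollback.log v2t/summary.log; do
        if [ -r "$root/$rel" ]; then
            found="$found $rel"
        else
            echo "skipping missing file: $root/$rel" >&2
        fi
    done
    if [ -z "$found" ]; then
        echo "no logs found under $root" >&2
        return 1
    fi
    # Word splitting of $found is intentional: the names contain no spaces.
    tar -czf "$out" -C "$root" $found
}
```

For example, collect_logs /var/log/migration-coordinator /tmp/v2t-logs.tgz writes the archive to /tmp; the output path is a placeholder.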