VMs connected to overlay segments have connectivity issues post V2T migration attempt.



Article ID: 386030


Updated On:

Products

VMware NSX

Issue/Introduction

  • After a V2T migration attempt, VMs connected to NSX overlay segments lose layer 3 and cross-host connectivity; local host traffic, however, remains unimpacted.

  • When the datapath is investigated, traffic is observed to reach the local VDR instance of the gateway, but it is not forwarded from the local host to the next hop (DR to SR) for northbound layer 3 traffic, TEP to TEP.
  • CDO mode is enabled and activated.

    • On a host level, this can be checked by running the following command on an impacted host: "net-vdl2 -l | grep CDO"

Incorrect output: CDO status:     enabled (activated)

Expected output: CDO status:     enabled (deactivated)

    • On an environment-wide level, this can be checked with a GET to the API https://<NSX-IP>/api/v1/global-configs/SwitchingGlobalConfig; global_replication_mode_enabled should be false (deactivated). A minimal scripted check is sketched below.
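
The environment-wide check can also be scripted. The following is a minimal sketch, assuming Python with the requests library, basic authentication, and placeholder manager address and credentials; adjust these for your environment.

    # Minimal sketch: query SwitchingGlobalConfig and report whether CDO
    # (global replication mode) is still enabled.
    # Placeholder manager address and credentials - adjust for your environment.
    import requests

    NSX_MANAGER = "nsx-manager.example.com"
    AUTH = ("admin", "password")

    url = f"https://{NSX_MANAGER}/api/v1/global-configs/SwitchingGlobalConfig"
    resp = requests.get(url, auth=AUTH, verify=False)  # verify=False only if the manager uses a self-signed certificate
    resp.raise_for_status()

    if resp.json().get("global_replication_mode_enabled"):
        print("CDO is still enabled (activated) - see the Resolution section")
    else:
        print("CDO is deactivated - the expected state outside of a migration")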

Environment

All versions of VMware NSX-T may be impacted.

Cause

This issue can be caused by an incorrect V2T workflow being carried out by a user. During the V2T process, CDO mode is temporarily enabled and is then deactivated at the completion of the migration. However, it is possible to restart the migration and not complete it fully, leaving CDO mode enabled indefinitely. This can cause hosts to incorrectly attempt to use an unpopulated routing domain and, as a result, blackhole TEP to TEP traffic.

Resolution

This is a condition that may occur in a VMware NSX environment.

If this scenario does occur and a V2T migration was prematurely ended, the global replication mode (CDO) can be reset manually using the following API: /api/v1/global-configs/SwitchingGlobalConfig

  1. Perform a GET against https://<NSX-IP>/api/v1/global-configs/SwitchingGlobalConfig and retrieve the content as below.
    {
        "physical_uplink_mtu": 8900,
        "uplink_mtu_threshold": 9000,
        "global_replication_mode_enabled": true,
        "remote_tunnel_physical_mtu": 8900,
        "arp_limit_per_lr": 50000,
        "resource_type": "SwitchingGlobalConfig",
        "id": "<Config ID>",
        "display_name": "<Config ID>",
        "_create_time": 1731508727168,
        "_create_user": "system",
        "_last_modified_time": 1731510195681,
        "_last_modified_user": "admin",
        "_system_owned": false,
        "_protection": "NOT_PROTECTED",
        "_revision": 1
    }

     

  2. Take the response body from step 1 and edit global_replication_mode_enabled from true to false, giving a body similar to the below.
    {
        "physical_uplink_mtu": 8900,
        "uplink_mtu_threshold": 9000,
        "global_replication_mode_enabled": false,
        "remote_tunnel_physical_mtu": 8900,
        "arp_limit_per_lr": 50000,
        "resource_type": "SwitchingGlobalConfig",
        "id": "<Config ID>",
        "display_name": "<Config ID>",
        "_create_time": 1731508727168,
        "_create_user": "system",
        "_last_modified_time": 1731510195681,
        "_last_modified_user": "admin",
        "_system_owned": false,
        "_protection": "NOT_PROTECTED",
        "_revision": 1
    }
  3. Push the changed configuration back with a POST to the same API https://<NSX-IP>/api/v1/global-configs/SwitchingGlobalConfig, using the edited body from step 2 (a scripted sketch of steps 1 to 3 is shown after these steps).
  4. Check all overlay segments and, for any overlay segment whose replication_mode is still SOURCE (Head End Replication), change it to MTEP (Hierarchical Two-Tier Replication), in line with recommended best practice. This can be done via the GUI or via CLI on a per segment basis; a sketch for identifying affected segments is shown below.
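
For reference, steps 1 to 3 can be combined into a small script. This is a minimal sketch, assuming Python with the requests library, basic authentication, and placeholder manager address and credentials; the endpoint and HTTP method follow the steps exactly as written above.

    # Minimal sketch of steps 1-3: retrieve SwitchingGlobalConfig, set
    # global_replication_mode_enabled to false, and push the edited body back.
    # Placeholder manager address and credentials - adjust for your environment.
    import requests

    NSX_MANAGER = "nsx-manager.example.com"
    AUTH = ("admin", "password")
    URL = f"https://{NSX_MANAGER}/api/v1/global-configs/SwitchingGlobalConfig"

    # Step 1: GET the current switching global config.
    config = requests.get(URL, auth=AUTH, verify=False).json()

    # Step 2: edit the body, changing global_replication_mode_enabled to false.
    config["global_replication_mode_enabled"] = False

    # Step 3: push the edited body back to the same API, as described above.
    resp = requests.post(URL, auth=AUTH, verify=False, json=config)
    resp.raise_for_status()

    # Confirm the change by re-reading the config.
    check = requests.get(URL, auth=AUTH, verify=False).json()
    print("global_replication_mode_enabled is now:", check["global_replication_mode_enabled"])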
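
To identify which overlay segments still use source (Head End) replication for step 4, a scripted check along the following lines can help. This is a hedged sketch: the /policy/api/v1/infra/segments endpoint and the replication_mode field are assumptions and should be confirmed against the API documentation for your NSX version.

    # Hedged sketch for step 4: list overlay segments whose replication_mode
    # is still SOURCE (Head End Replication). The endpoint and field name are
    # assumptions - confirm against the API documentation for your NSX version.
    import requests

    NSX_MANAGER = "nsx-manager.example.com"
    AUTH = ("admin", "password")

    url = f"https://{NSX_MANAGER}/policy/api/v1/infra/segments"
    segments = requests.get(url, auth=AUTH, verify=False).json().get("results", [])

    for seg in segments:
        if seg.get("replication_mode") == "SOURCE":
            print(f"Segment '{seg.get('display_name')}' uses SOURCE replication; "
                  "change it to MTEP per step 4.")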