HCX - Host Route (/32) missing after enabling MON (Mobility Optimized Networking)

Article ID: 321646

Updated On:

Products

VMware HCX
VMware Cloud on AWS

Issue/Introduction

This article describes an issue in which HCX host routes (/32) go missing after MON is enabled, and how to recover them.

Symptoms:
For a given extended segment, HCX may remove and re-add the static routes (/32 host routes) for workload VMs as part of the MON gateway switchover workflow.
The following events appear in the HCX Cloud (destination) Manager logs for the host route X.X.X.X/32 on the segment with gateway A.A.A.A/24:
2023-10-04 12:08:27.927 UTC [NetworkStretchService_SvcThread-43597, j: 2365f2ae, s: ce38d858, , TxId: cac134a8-0dc2-4b44-b7c5-a458d5c1c00d] INFO  c.v.v.h.n.ConfigureHostRoutesOnEdgeJob- Removing {"network":"X.X.X.X\/32","next_hops":[{"admin_distance":1,"scope":["\/infra\/segments\/hcx-ne-c4fe9c02-bdef-4dad-87fe-cbde23f1f511"]}],"enabled_on_secondary":false,"resource_type":"StaticRoutes","id":"70f6e2d1-9f9d-44f9-bed0-534178c114c0","display_name":"HCX Policy based MON for Subnet A.A.A.A\/24","path":"\/infra\/tier-1s\/T1\/static-routes\/70f6e2d1-9f9d-44f9-bed0-534178c114c0","relative_path":"70f6e2d1-9f9d-44f9-bed0-534178c114c0","parent_path":"\/infra\/tier-1s\/T1"
2023-10-04 12:08:27.924 UTC [NetworkStretchService_SvcThread-43597, j: 2365f2ae, s: ce38d858, , TxId: cac134a8-0dc2-4b44-b7c5-a458d5c1c00d] WARN  c.v.v.h.n.ConfigureHostRoutesOnEdgeJob- nextHopSegmentPath is null but routesToAdd is non zero
2023-10-04 12:08:29.942 UTC [NetworkStretchService_SvcThread-43586, j: ddb479df, s: c1754c6a, , TxId: cac134a8-0dc2-4b44-b7c5-a458d5c1c00d] INFO c.v.v.h.n.ConfigureHostRoutesOnEdgeJob- Adding {"network":"X.X.X.X\/32","display_name":"HCX Policy based MON for Subnet A.A.A.A\/24","resource_type":"StaticRoute","next_hops":[{"scope":["\/infra\/segments\/hcx-ne-c4fe9c02-bdef-4dad-87fe-cbde23f1f511"]}]}
Location of App Engine log:
  • HCX Manager : /common/log/admin/app.log
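The remove/add events shown above can be paired up programmatically to spot host routes that were removed but never re-added. The following is a minimal sketch, not an official tool: it assumes the log lines follow the exact `ConfigureHostRoutesOnEdgeJob- Removing {...}` / `Adding {...}` shape of the sample events, and the function names are hypothetical.

```python
import re
from collections import OrderedDict

# Matches the "Removing"/"Adding" events written by ConfigureHostRoutesOnEdgeJob
# and captures the action plus the "network" value from the JSON payload.
# Assumption: the payload starts with the "network" key, as in the sample events.
EVENT_RE = re.compile(
    r'ConfigureHostRoutesOnEdgeJob-\s+(Removing|Adding)\s+\{.*?"network":"([^"]+)"'
)


def last_route_actions(lines):
    """Return {network: last action seen}, keyed in order of first appearance."""
    state = OrderedDict()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            action, network = match.groups()
            # The log escapes "/" as "\/" inside the JSON payload.
            state[network.replace("\\/", "/")] = action
    return state


def routes_left_removed(lines):
    """List host routes whose most recent logged event is a removal."""
    return [
        network
        for network, action in last_route_actions(lines).items()
        if action == "Removing"
    ]
```

For example, passing the lines of `/common/log/admin/app.log` to `routes_left_removed` would return the networks whose last logged event is a removal. Lines are evaluated in file order, so rotated log files would need to be supplied oldest first.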


Cause

When MON (Mobility Optimized Networking) is enabled on an extended segment, HCX configures a static route with a /32 netmask (host route) on the NSX Tier-1 gateway for each workload VM that has MON enabled.
MON enablement is a two-step process:
1. Enable MON on the extended segment.
2. Perform the gateway switchover job for each VM on that extended segment.

In current HCX implementations, when a user triggers more than two jobs that configure static routes for multiple extended segments through the Network Extension wizard, all static routes for one of the extended segments may be removed in the first iteration and re-added in the second iteration within a few seconds.

IMPORTANT: vMotion of VMs triggered by vCenter DRS events causes similar behavior: all static routes for the extended networks hosting those VMs are removed and re-added within a few seconds.

Resolution

This issue is fixed in the HCX 4.8 release.
IMPORTANT: From this release onward, MON static routes are no longer removed and re-added as described in the Cause section.

Workaround:
The total time between removal and re-addition of the static routes depends on the following factors:
  1. The number of extended networks for which MON enablement was triggered.
  2. The NSX compute resources available to accept and process API calls for static route addition or removal.
Note: The static routes are re-added automatically within a few seconds, without user intervention, so the impact during the event should be minimal.

However, in some cases the static routes may remain missing for a longer period under the following condition:
  • The static route remove/add API requests from HCX may be rejected by NSX once they hit the API rate limit on the NSX Manager.
Note: NSX Manager supports 100 API requests/sec.
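As a rough illustration of how the rate limit bounds the recovery window, the sketch below computes a lower bound on the time needed to remove and re-add a given number of host routes. It assumes one API call per route removal and one per route addition, and that HCX is the only client consuming the limit; both are simplifying assumptions, not documented HCX behavior, and the function name is hypothetical.

```python
def min_churn_seconds(num_routes, rate_limit_per_sec=100, calls_per_route=2):
    """Lower bound (seconds) to remove and re-add num_routes static routes.

    Assumption: each route costs calls_per_route API calls (one remove,
    one add) and requests are accepted at exactly rate_limit_per_sec.
    """
    if num_routes < 0 or rate_limit_per_sec <= 0 or calls_per_route <= 0:
        raise ValueError("arguments must be positive (num_routes may be zero)")
    return num_routes * calls_per_route / rate_limit_per_sec


# 250 host routes need 500 calls, so at least 5 seconds at 100 requests/sec.
print(min_churn_seconds(250))
```

Real durations will be longer whenever other clients share the NSX Manager or requests are rejected and have to be retried.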

In that case, follow these steps from the on-premises (Connector) HCX system to restore all missing static routes on the extended segment:
  1. In the NE wizard, select one of the workload VMs on the extended network whose static route is missing.
  2. Change the router location to on-premises, and then change it back to cloud.
Alternatively, MON can be toggled to re-trigger the Network Extension job, which then re-adds the static routes.
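Before and after applying the steps above, it can help to confirm which workload VMs are actually missing their host route. The sketch below is one possible check, under stated assumptions: `static_routes` is a list of static route objects already retrieved from the cloud Tier-1 gateway (for instance, the entries under the `/infra/tier-1s/<tier-1-id>/static-routes` path seen in the log payload), each with a `network` field, and `vm_ips` is a hypothetical list of MON-enabled VM addresses. Retrieving the routes from NSX is deliberately left out.

```python
import ipaddress


def missing_host_routes(static_routes, vm_ips):
    """Return the VM IPs that have no matching host route in static_routes.

    static_routes: iterable of dicts with a "network" key in CIDR form,
                   as in the static route payloads logged by HCX.
    vm_ips:        iterable of VM IP address strings (hypothetical input).
    """
    present = set()
    for route in static_routes:
        network = ipaddress.ip_network(route["network"], strict=False)
        # Keep only host routes (/32 for IPv4); wider prefixes are ignored.
        if network.prefixlen == network.max_prefixlen:
            present.add(network.network_address)
    return [ip for ip in vm_ips if ipaddress.ip_address(ip) not in present]


routes = [{"network": "10.0.0.5/32"}, {"network": "10.0.0.0/24"}]
print(missing_host_routes(routes, ["10.0.0.5", "10.0.0.6"]))
```

An empty result after the router location has been switched back to cloud suggests that the host routes for the listed VMs are in place again.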

Additional Information

Impact/Risks:
  • Traffic is impacted for a few seconds because the static routes are unavailable on the cloud NSX Tier-1 between their removal and re-addition, for workload VMs that are part of the given extended segment.
  • There is no impact on other extended networks or VMs on which the user has performed no activity.
  • Migration services remain unaffected.