HCX - High latency for traffic through NSX Tier-0 router for MON enabled VMs
Article ID: 339575
Products
VMware HCX
VMware NSX
VMware Cloud on AWS
Issue/Introduction
Identify a known issue with traffic forwarding on the NSX Edge for MON enabled VMs.
Symptoms: Mobility Optimized Networking (MON) enabled VMs may experience high latency and possible packet loss when they receive traffic that passes through the NSX Tier-0 router into a Tier-1 router connected to a Cloud segment with HCX L2 Network Extension enabled. High CPU usage on the NSX Edge may also be observed, depending on the volume of traffic for a single VM or the aggregate traffic for all MON enabled VMs.
Environment
VMware NSX-T Data Center
VMware NSX-T Data Center 3.x
Cause
A /32 host route is programmed on the Cloud side NSX-T for each MON enabled VM to allow local egress when HCX L2 Network Extension services are enabled. MON is primarily designed to optimize traffic between Cloud segments connected to the same Tier-1 router. When traffic from a MON enabled VM follows a data path beyond the Tier-1 router into the Tier-0, the NSX Edge cannot program the next hop IP address for the MON /32 host routes on the return path, so it must resolve the address for every packet. This extra work can significantly degrade forwarding performance, resulting in high latency.
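The cost difference described above can be illustrated with a toy model. This is a minimal sketch, not NSX Edge code: the function name `count_resolutions`, the `programmed` dictionary, and the flag `can_program_next_hop` are all hypothetical and exist only to show why resolving once and resolving per packet scale so differently.

```python
# Toy model only. This does NOT reflect the actual NSX Edge implementation.
# It contrasts a next hop that is programmed once with one that has to be
# resolved again for every packet.

def count_resolutions(packets, can_program_next_hop):
    """Return how many next hop resolutions a packet stream triggers."""
    programmed = {}   # hypothetical table: destination -> resolved next hop
    resolutions = 0
    for dst in packets:
        if dst in programmed:
            continue              # fast path: next hop already programmed
        resolutions += 1          # slow path: resolve the next hop
        if can_program_next_hop:
            programmed[dst] = "resolved"
    return resolutions


# 1000 packets returning to a single MON enabled VM (example address).
packets = ["172.16.128.46"] * 1000
print(count_resolutions(packets, can_program_next_hop=True))   # 1
print(count_resolutions(packets, can_program_next_hop=False))  # 1000
```

In this model the slow-path work grows linearly with packet count when the next hop cannot be programmed, which is consistent with the CPU and latency symptoms described in this article.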
Resolution
Upgrade to NSX version 3.1.3.7 or later.
Workaround: Limit traffic for MON enabled VMs to only Cloud segments connected to the same Tier-1 router.
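When auditing several environments, the version threshold from the Resolution can be checked with a short script. This is a minimal sketch that assumes the version is available as a purely numeric dotted string; the helper names `parse_version` and `needs_upgrade` are hypothetical, and how the version string is obtained is left to the reader.

```python
# Minimal sketch: compare a dotted NSX version string against the fixed
# release named in this article. Assumes purely numeric, dot-separated parts.

FIXED_VERSION = (3, 1, 3, 7)  # "3.1.3.7 or later" from the Resolution


def parse_version(text):
    """Turn '3.1.2.0' into (3, 1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(current):
    """Return True if the given version is older than the fixed release."""
    return parse_version(current) < FIXED_VERSION


print(needs_upgrade("3.1.2.0"))  # True
print(needs_upgrade("3.1.3.7"))  # False
print(needs_upgrade("3.2.0.0"))  # False
```

Comparing integer tuples avoids the pitfall of string comparison, where "3.1.10" would sort before "3.1.3".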
Additional Information
The following conditions will confirm the issue:
NSX-T running version 3.1.2.0 or earlier
Traffic for MON enabled VMs being routed beyond the NSX Tier-1 into the Tier-0
Error messages "Failed to add neigh route" appear constantly in NSX Edge /var/log/syslog
Tier-1 logical router forwarding table is missing the Gateway IP for the MON /32 host routes:
localhost> get logical-router 74fc7ec3-####-####-####-########24a forwarding
Wed Oct 13 2021 UTC 19:19:45.071
Logical Router
UUID                                 VRF  LR-ID  Name                     Type
74fc7ec3-####-####-####-########24a  3    2      DR-cloud-tier1-router-1  DISTRIBUTED_ROUTER_TIER1
IPv4 Forwarding Table
IP Prefix         Gateway IP  Type   UUID                                  Gateway MAC
172.16.128.46/32  <missing>   route  ########-####-####-####-########fbff  ##:##:##:##:##:bb
Learning packets:
{
"arp_cache_learning_req_enqueue_failures": 0,
"arp_cache_learning_req_oom_failures": 0,
"arp_cache_learning_req_processed": 0,
"arp_learning_req_enqueue_failures": 0,
"arp_learning_req_oom_failures": 0,
"arp_learning_req_processed": 6608,
Note: All routes and identifiers are used for example purposes only. Actual identifiers will be different in your environment.
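The syslog and forwarding table checks above can be scripted against saved text. This is a minimal sketch that assumes the Edge syslog and the forwarding table output have already been captured as lines of text; the function names are hypothetical, and the second sample row (with gateway 172.16.128.1) is invented purely for contrast. A /32 row is treated as missing its gateway when the second column is not a valid IPv4 address, which covers both a blank column and a placeholder.

```python
import ipaddress

# Error message quoted in this article.
NEIGH_ERROR = "Failed to add neigh route"


def count_neigh_errors(syslog_lines):
    """Count syslog lines that contain the neighbor route error."""
    return sum(1 for line in syslog_lines if NEIGH_ERROR in line)


def _is_ipv4(token):
    """Return True if the token parses as an IPv4 address."""
    try:
        ipaddress.IPv4Address(token)
        return True
    except ValueError:
        return False


def host_routes_without_gateway(forwarding_lines):
    """Return /32 prefixes whose Gateway IP column is absent or not an IP."""
    missing = []
    for line in forwarding_lines:
        tokens = line.split()
        if not tokens or not tokens[0].endswith("/32"):
            continue  # skip headers and non-host routes
        try:
            ipaddress.IPv4Network(tokens[0])
        except ValueError:
            continue  # first column is not a valid IPv4 prefix
        if len(tokens) < 2 or not _is_ipv4(tokens[1]):
            missing.append(tokens[0])
    return missing


# Sample rows: the first mirrors the example output in this article,
# the second is a hypothetical healthy row added for contrast.
sample = [
    "IP Prefix Gateway IP Type UUID Gateway MAC",
    "172.16.128.46/32 <missing> route ########-####-####-####-########fbff ##:##:##:##:##:bb",
    "172.16.128.47/32 172.16.128.1 route ########-####-####-####-########fbff ##:##:##:##:##:bb",
]
print(host_routes_without_gateway(sample))  # ['172.16.128.46/32']
```

A non-empty result, together with a steadily growing count from `count_neigh_errors`, matches the conditions listed above but is not conclusive on its own; the version and traffic path conditions still apply.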
Impact/Risks:
This issue ONLY affects traffic for MON enabled VMs that is routed beyond the Tier-1 to the Tier-0.
Optimized traffic for MON enabled VMs within the same Tier-1 is NOT affected.
L2 traffic over the HCX L2 Network Extension is NOT affected.
Sustained high CPU on the NSX Edge may impact other routing functions.