HCX - High latency for traffic through NSX Tier-0 router for MON enabled VMs

Article ID: 339575


Products

VMware HCX
VMware NSX Networking
VMware Cloud on AWS

Issue/Introduction

This article describes a known issue with traffic forwarding on the NSX Edge for MON-enabled VMs.

Symptoms:
VMs with Mobility Optimized Networking (MON) enabled may experience high latency, and possibly packet loss, when they receive traffic that passes through the NSX Tier-0 router into a Tier-1 router connected to a Cloud segment with HCX L2 Network Extension enabled.
High CPU usage on the NSX Edge may also be observed, depending on the volume of traffic for a single VM or the aggregate traffic for all MON-enabled VMs.

Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x

Cause

A host /32 route is programmed on the Cloud side NSX-T for each MON enabled VM to allow for local egress when HCX L2 Network Extension services are enabled.
MON is primarily designed to optimize traffic between Cloud segments connected to the same Tier-1 router.
When traffic from a MON-enabled VM follows a data path beyond the Tier-1 router into the Tier-0, the NSX Edge cannot program the next-hop IP address for any MON host /32 route used by the return traffic, so it must resolve the address for every packet. This extra work can significantly degrade forwarding performance and result in high latency.

Resolution

Upgrade to NSX version 3.1.3.7 or later.
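
To check an installed version against the fixed release, a dotted version string can be compared numerically. The following is a minimal sketch, not a VMware tool: the helper names `parse_version` and `has_fix` are hypothetical, and it assumes the version string contains only dot-separated integers.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.1.3.7' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Fixed release taken from the Resolution section of this article.
FIXED_VERSION = parse_version("3.1.3.7")


def has_fix(installed):
    """Return True when the installed version is 3.1.3.7 or later."""
    return parse_version(installed) >= FIXED_VERSION


for version in ("3.1.2.0", "3.1.3.7"):
    print(version, "contains the fix" if has_fix(version) else "requires an upgrade")
```

Comparing tuples of integers avoids the string-comparison mistake where "3.1.10.0" would sort before "3.1.3.7".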

Workaround:
Limit traffic for MON-enabled VMs to Cloud segments connected to the same Tier-1 router.

Additional Information

The following conditions confirm the issue:
  • NSX-T running version 3.1.2.0 or earlier
  • Traffic for MON-enabled VMs being routed beyond the NSX Tier-1 into the Tier-0
  • The error message "Failed to add neigh route" appears repeatedly in /var/log/syslog on the NSX Edge
  • The Tier-1 logical router forwarding table is missing the Gateway IP for the MON /32 host routes:
localhost> get logical-router 74fc7ec3-64b4-4308-9fca-33b1c381a24a forwarding
Wed Oct 13 2021 UTC 19:19:45.071
Logical Router
UUID                                   VRF    LR-ID  Name                      Type
74fc7ec3-64b4-4308-9fca-33b1c381a24a   3      2      DR-cloud-tier1-router-1   DISTRIBUTED_ROUTER_TIER1
IPv4 Forwarding Table
IP Prefix               Gateway IP        Type        UUID                                      Gateway MAC
172.16.128.46/32        <missing>         route       539c2727-fb8b-4a3e-a69a-369f2901fbff      00:50:56:a2:ee:bb
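
When the forwarding table is long, the /32 rows without a gateway can be picked out of saved CLI output with a short script. This is a sketch under assumptions: the function name `host_routes_without_gateway` is hypothetical, the column layout is taken from the sample above, and a row is treated as affected when its second column is not a valid IP address (which covers both a blank column and a placeholder such as `<missing>`).

```python
import ipaddress


def host_routes_without_gateway(cli_output):
    """Return the IPv4 /32 prefixes whose Gateway IP column holds no valid address."""
    missing = []
    for line in cli_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[0]:
            continue  # header, timestamp, or other non-route line
        try:
            prefix = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # first column is not an IP prefix
        if prefix.version != 4 or prefix.prefixlen != 32:
            continue  # only host routes are of interest here
        try:
            ipaddress.ip_address(fields[1])
        except ValueError:
            missing.append(str(prefix))
    return missing


# Row copied from the example output in this article.
sample = (
    "IP Prefix               Gateway IP        Type        UUID                                      Gateway MAC\n"
    "172.16.128.46/32        <missing>         route       539c2727-fb8b-4a3e-a69a-369f2901fbff      00:50:56:a2:ee:bb\n"
)
print(host_routes_without_gateway(sample))
```

The script only reads text that was already captured from the Edge CLI; it does not connect to NSX.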

Learning packets:
{
  "arp_cache_learning_req_enqueue_failures": 0,
  "arp_cache_learning_req_oom_failures": 0,
  "arp_cache_learning_req_processed": 0,
  "arp_learning_req_enqueue_failures": 0,
  "arp_learning_req_oom_failures": 0,
  "arp_learning_req_processed": 6608,

Note: All routes and identifiers are used for example purposes only. Actual identifiers will be different in your environment.
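
To gauge how often the "Failed to add neigh route" message appears, the occurrences can be counted in a copy of the Edge syslog. This is a minimal sketch: the function name `count_neigh_route_failures` is hypothetical, and only the message text quoted in this article is assumed, not the format of the rest of the log line.

```python
# Message text quoted in the conditions listed in this article.
MARKER = "Failed to add neigh route"


def count_neigh_route_failures(lines):
    """Count the lines that contain the neighbor-route failure message."""
    return sum(1 for line in lines if MARKER in line)
```

For example, with a copy of /var/log/syslog from the NSX Edge saved locally, pass the open file to the function: `with open("syslog", errors="replace") as handle: print(count_neigh_route_failures(handle))`. A count that keeps growing between two runs is consistent with the message appearing continuously.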

Impact/Risks:
  • This issue ONLY affects traffic for MON enabled VMs that is routed beyond the Tier-1 to the Tier-0.
  • Optimized traffic for MON enabled VMs within the same Tier-1 is NOT affected.
  • L2 traffic over the HCX L2 Network Extension is NOT affected.
  • Sustained high CPU on the NSX Edge may impact other routing functions.