VMware HCX relies on proper Maximum Transmission Unit (MTU) configuration to ensure optimal network performance for migrations and network extensions between environments. The MTU setting specifies the largest network packet that can traverse the infrastructure without fragmentation, but its configuration requires careful consideration of several critical factors.
HCX traffic inherently requires overhead for IPsec tunneling between the Network Extension (NE) and Interconnect (IX) appliances. This base overhead must be accounted for in all deployments to prevent fragmentation and ensure reliable connectivity. Additionally, HCX implements encryption to protect data in transit, which introduces another layer of overhead considerations.
With the release of HCX 4.10, VMware has introduced more granular control over this encryption layer. While HCX continues to encrypt all traffic by default - a design choice that ensures security when migrating workloads over public networks - administrators now have the flexibility to disable encryption when operating in secure environments. This capability becomes particularly valuable in private datacenters or when using dedicated connections where the additional encryption layer may be redundant.
The decision to enable or disable encryption has significant implications for MTU sizing. When enabled, encryption adds 28 bytes of overhead to each packet, which must be accounted for in the MTU configuration of every network component along the traffic path. This overhead becomes especially critical in high-throughput scenarios like bulk migrations or when extending networks with substantial traffic volumes between sites. Understanding these relationships between HCX components, encryption choices, and MTU requirements is fundamental to achieving optimal performance and preventing network fragmentation issues.
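To make the budget arithmetic concrete, here is a minimal sketch in shell. The 28-byte encryption figure comes from the HCX documentation above, while the IPsec tunnel overhead is an assumed illustrative value (actual ESP tunnel-mode overhead varies with the cipher in use):

```bash
#!/usr/bin/env bash
# Illustrative MTU budget calculation; IPSEC_OVERHEAD is an assumption,
# not an official HCX figure - verify it for your own environment.
UNDERLAY_MTU=1500        # MTU configured on switches and VMkernel interfaces
IPSEC_OVERHEAD=73        # assumed worst-case ESP tunnel-mode overhead (cipher-dependent)
HCX_ENC_OVERHEAD=28      # HCX encryption overhead when enabled

PAYLOAD=$(( UNDERLAY_MTU - IPSEC_OVERHEAD - HCX_ENC_OVERHEAD ))
echo "Effective payload per packet: ${PAYLOAD} bytes"   # ~1399 with these assumptions
```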
MTU misconfigurations in HCX environments typically arise from not fully accounting for all layers of encapsulation and encryption that may be present in the environment. Let's examine the common scenarios where these issues manifest and build up our understanding of the complete overhead stack.
At the most basic level, HCX requires IPsec tunnel overhead for its service appliances. The HCX Network Extension (NE) appliance creates an IPsec tunnel to carry extended network traffic between sites, while the HCX Interconnect (IX) appliance establishes IPsec tunnels for migration operations. This base IPsec encapsulation is always present and requires overhead for tunnel headers, forming the foundation of our MTU considerations.
When HCX encryption is enabled (the default setting), an additional 28 bytes of overhead is added for the encryption headers. This creates our first common scenario for MTU issues: administrators often configure their physical switches and VMkernel interfaces with a 1500-byte MTU, not realizing that the combination of IPsec tunnel overhead and encryption overhead will significantly reduce the available payload space.
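A quick way to see whether a path really carries full-size frames is a don't-fragment ping from a Linux host at one site toward the other; the address below is a placeholder and the sizes assume a 1500-byte path:

```bash
# 1472 bytes of ICMP payload + 20-byte IP header + 8-byte ICMP header = 1500 on the wire.
# (These 28 bytes of ICMP/IP framing are unrelated to HCX's 28-byte encryption overhead.)
ping -M do -s 1472 203.0.113.10

# Probe again with the assumed tunnel and encryption overhead subtracted to see
# roughly what an HCX-encapsulated payload has left to work with:
ping -M do -s 1371 203.0.113.10   # 1472 - 73 (assumed IPsec) - 28 (encryption)
```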
A particularly challenging scenario occurs in environments where organizations implement their own encryption solutions alongside HCX. For example, if an organization uses their own IPsec VPN for site-to-site connectivity and then runs HCX with encryption enabled over this connection, the packets must accommodate both layers of encryption headers.
Consider this double-encryption scenario: the workload packet is first wrapped in the HCX tunnel with its own encryption, and that already-encrypted traffic is then encapsulated a second time by the organization's site-to-site IPsec VPN. This layered encryption can quickly consume a significant portion of the available MTU budget. A standard 1500-byte MTU becomes insufficient as each encryption layer adds its own overhead, potentially requiring jumbo frame support throughout the entire network path.
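Extending the earlier sketch, the stacked layers shrink the budget further; the IPsec figures remain illustrative assumptions:

```bash
# Double-encryption budget, reusing the assumed overhead values from above.
UNDERLAY_MTU=1500
CORP_VPN_OVERHEAD=73     # assumed organizational site-to-site IPsec layer
HCX_IPSEC_OVERHEAD=73    # assumed HCX tunnel layer
HCX_ENC_OVERHEAD=28      # HCX encryption (enabled by default)

echo $(( UNDERLAY_MTU - CORP_VPN_OVERHEAD - HCX_IPSEC_OVERHEAD - HCX_ENC_OVERHEAD ))
# ~1326 bytes left - well below what a guest expecting a 1500-byte MTU will send.
```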
Cloud migrations introduce additional complexity due to the layered encapsulation requirements. When migrating to cloud environments, traffic must traverse not only the HCX tunnels but also the provider's own network layers, such as VPN gateways, dedicated-connection handoffs, and underlay encapsulation, each of which can impose its own MTU limit.
Multi-site deployments often reveal MTU issues when network paths cross diverse infrastructure boundaries. Consider a scenario where your network includes on-premises datacenters, colocation facilities, and cloud regions connected by different carriers: each segment may support a different maximum frame size, and the smallest MTU along the path determines what HCX traffic can actually carry without fragmentation.
Network extension performance problems frequently emerge during high-throughput operations when the full overhead stack exceeds initial planning. An environment might successfully pass basic connectivity tests with a 1600-byte MTU, but when handling large-scale network extension traffic through multiple layers of encryption and tunneling, the available payload space becomes insufficient for optimal performance.
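One way to catch this before production cutover is a don't-fragment vmkping sweep from the ESXi hosts backing the HCX uplinks; the interface name and address below are placeholders:

```bash
# ESXi shell: probe with DF set through the vmkernel interface used by HCX uplinks.
# Payload sizes account for the 28 bytes of ICMP/IP framing (20 IP + 8 ICMP).
vmkping -d -s 1472 -I vmk1 203.0.113.10   # 1500-byte frame
vmkping -d -s 1572 -I vmk1 203.0.113.10   # 1600-byte frame
vmkping -d -s 8972 -I vmk1 203.0.113.10   # 9000-byte frame: verifies jumbo support
```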
In hybrid cloud scenarios, these issues become even more pronounced when traffic patterns change dynamically. An environment might work perfectly during initial testing with small payloads, only to experience fragmentation and performance issues when production workloads generate larger packets that must traverse multiple encryption layers.
Understanding these scenarios requires considering the complete encapsulation stack that may be present:
- The original workload payload
- The base IPsec tunnel headers of the HCX NE and IX appliances
- HCX encryption overhead (28 bytes when enabled)
- Any organizational VPN or IPsec encryption along the path
- Cloud provider or carrier encapsulation between sites
- The physical MTU limit of every hop in the path
Only by accounting for all these potential elements can administrators properly size their MTU configurations to prevent fragmentation and ensure optimal performance. This often requires careful documentation of all encryption and tunneling solutions in the environment, as well as thorough testing under production-like conditions to verify the MTU settings can accommodate the full overhead stack.
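As part of that documentation and testing effort, a tool such as tracepath on a Linux host can report the discovered path MTU hop by hop (the target address is a placeholder):

```bash
# Reports the path MTU (pmtu) learned toward the remote site, hop by hop.
tracepath -n 203.0.113.10
```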
Configuring MTU in an HCX environment requires careful consideration of both the base HCX networking requirements and the optional encryption capabilities introduced in HCX 4.10. Understanding encryption's impact on MTU is crucial because it directly affects the overhead requirements and overall performance of the HCX deployment.
When planning MTU configurations, organizations must evaluate their security requirements based on network topology and connectivity type. For traffic traversing WAN or public internet connections, VMware recommends maintaining at least one layer of encryption to ensure data security. However, for local traffic within secure datacenter environments or across trusted private connections, encryption may not be necessary and can be safely disabled to optimize performance.
In scenarios where multiple layers of encryption exist (such as HCX encryption running over an existing VPN), it is strongly recommended to disable one layer of encryption. Running multiple encryption layers not only impacts performance but also unnecessarily increases MTU overhead requirements. Organizations should evaluate their existing security measures and typically maintain the encryption layer that provides the most comprehensive coverage while disabling redundant encryption.
Once the encryption strategy is determined, several key configurations must be in place to manage encryption settings in HCX 4.10:

Encryption disabled: This configuration addresses environments with secure private networks where HCX encryption is not required; MTU sizing only needs to account for the base IPsec tunnel overhead.

Encryption enabled: This configuration represents the default and recommended setup for deployments traversing public networks or requiring enhanced security measures, accommodating both the IPsec tunnel overhead and the 28-byte encryption overhead.

Jumbo frames: This configuration maximizes throughput in environments supporting end-to-end jumbo frames, providing sufficient overhead capacity for all tunneling and encryption requirements.
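For the jumbo frame option, the larger MTU must be applied consistently at every layer. As a hedged example, on a standard vSwitch the ESXi side would look like the following (the switch and interface names are placeholders; distributed switches are configured in vCenter instead):

```bash
# Raise the MTU on the standard vSwitch carrying the HCX uplink network...
esxcli network vswitch standard set -v vSwitch1 -m 9000
# ...and on the VMkernel interface, then mirror the same value in the HCX
# Network Profile and on every physical switch port along the path.
esxcli network ip interface set -i vmk1 -m 9000
```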
MTU configuration verification requires systematic testing through the following CLI commands:
1. ssh - connect to the HCX Manager as the admin user
2. ccli - enter the HCX Central CLI
3. list - display the deployed service appliances (IX and NE) with their ids
4. go <id> - connect to the appliance you want to test
5. pmtu - run path MTU discovery across the tunnel
6. perftest all - run the full suite of appliance-to-appliance performance tests
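Run end to end, the sequence looks like the following; the hostname and appliance id are placeholders, and output is omitted because it varies by deployment:

```bash
ssh admin@hcx-manager.example.com   # placeholder HCX Manager address
ccli            # enter the Central CLI
list            # note the id of the IX or NE appliance to test
go 0            # placeholder id taken from the list output
pmtu            # discover the usable path MTU across the tunnel
perftest all    # run the full performance test suite
```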
Resolution of MTU-related issues requires systematic evaluation: document every tunneling and encryption layer in the path, confirm the effective path MTU with pmtu, adjust the MTU values in the affected HCX Network Profiles and underlying network components to match, and re-run perftest to verify the results.
Note: HCX provides two important performance enhancement features that work independently of encryption status:
Generic Receive Offload (GRO), newly introduced in HCX 4.10, represents a significant advancement in improving Network Extension throughput. This feature intelligently combines multiple incoming packets into a single larger packet before delivering it to the network stack. Think of it like a postal service consolidating multiple small packages into one larger shipment – instead of processing many individual small packets, the system handles fewer but larger ones, significantly reducing processing overhead and improving overall throughput.
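GRO itself is a standard Linux networking mechanism. Purely as an illustration (this is not an HCX command; HCX manages GRO on its own appliances), you can observe and toggle the same mechanism on any Linux host with ethtool:

```bash
# Check whether GRO is currently enabled on a NIC (illustrative only):
ethtool -k eth0 | grep generic-receive-offload
# Enable it:
sudo ethtool -K eth0 gro on
```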
Application Path Resiliency fundamentally enhances how HCX handles network path selection and traffic engineering. It serves as an intelligent traffic control system that continuously monitors all available network paths between your source and destination sites. By analyzing metrics like latency, available bandwidth, and packet loss in real-time, it automatically selects the optimal path for your traffic. When network conditions deteriorate on one path – perhaps due to congestion or other issues – the system can seamlessly redirect traffic to better-performing alternate paths without disrupting ongoing operations.
It is recommended to enable both of these features in your HCX deployments, regardless of your encryption choices. The combination of GRO's packet processing optimization and Application Path Resiliency's intelligent routing creates a more robust and efficient environment for both migrations and network extensions. Together, these features help maintain optimal performance even in challenging network conditions.
It is also recommended to upgrade to the latest version of HCX, as versions 4.10.2 and later include known performance improvements.
Please refer to How to Troubleshoot and Fix Packet Loss Related to high CPU %RDY on HCX Network Extensions for additional troubleshooting steps related to Network Extension packet loss and CPU ready time issues, particularly in high-density VLAN environments.
For more information, also see the article HCX is not migrating or passing traffic over network extensions at expected speeds.