HCX Interconnect Underlay PMTU mismatch alerts due to CPU contention

Article ID: 439344

Products

VMware HCX

Issue/Introduction

  • You experience recurring HCX Interconnect Underlay Path MTU (PMTU) mismatch alerts for uplinks. These alerts occur consistently at 15-minute intervals. The system logs show the MTU repeatedly dropping from the expected value (e.g., 1350) to lower values (e.g., 1343 or 923) and subsequently recovering.
  • Example event entries in HCX UI:
    HCX Interconnect Underlay PMTU mismatch found for the uplink <DV Portgroup> Previous DiscoveredMTU : 1350 Current DiscoveredMTU : 1343.
    HCX Interconnect Underlay PMTU mismatch found for the uplink <DV Portgroup> Previous DiscoveredMTU : 1343 Current DiscoveredMTU : 1350.
    HCX Interconnect Underlay PMTU mismatch found for the uplink <DV Portgroup> Previous DiscoveredMTU : 1350 Current DiscoveredMTU : 923.
    HCX Interconnect Underlay PMTU mismatch found for the uplink <DV Portgroup> Previous DiscoveredMTU : 923 Current DiscoveredMTU : 1350.
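
To check whether a set of alerts matches the drop-and-recover pattern described above, the event text can be parsed into previous and current MTU values. The following is a minimal sketch that assumes the message format shown in the examples; the function names `parse_event` and `summarize` are hypothetical and not part of HCX.

```python
import re

# Pattern derived only from the example event text in this article (an assumption:
# other HCX versions may word the message differently).
EVENT_RE = re.compile(
    r"PMTU mismatch found for the uplink (?P<uplink>.+?)\s+"
    r"Previous DiscoveredMTU\s*:\s*(?P<prev>\d+)\s+"
    r"Current DiscoveredMTU\s*:\s*(?P<curr>\d+)"
)


def parse_event(line):
    """Return uplink, previous MTU, current MTU and direction, or None if no match."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    prev, curr = int(match.group("prev")), int(match.group("curr"))
    if curr < prev:
        direction = "drop"
    elif curr > prev:
        direction = "recovery"
    else:
        direction = "unchanged"
    return {
        "uplink": match.group("uplink"),
        "previous": prev,
        "current": curr,
        "direction": direction,
    }


def summarize(lines, expected_mtu):
    """Count drops, recoveries back to the expected MTU, and the lowest MTU seen."""
    events = [e for e in map(parse_event, lines) if e is not None]
    drops = [e for e in events if e["direction"] == "drop"]
    recoveries = [
        e for e in events
        if e["direction"] == "recovery" and e["current"] == expected_mtu
    ]
    lowest = min((e["current"] for e in events), default=None)
    return {
        "drops": len(drops),
        "recoveries_to_expected": len(recoveries),
        "lowest_mtu": lowest,
    }
```

If every drop is matched by a recovery to the expected value (1350 in the examples), the alerts are consistent with the transient behavior this article describes and not with a persistent change in the network path.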

 

Cause

  • The underlying cause may be severe CPU resource contention on the source or destination cluster where the IX appliances reside. The HCX Interconnect (IX) appliance struggles to acquire sufficient CPU cycles, which prevents it from responding to the periodic MTU discovery checks in a timely manner.
  • The PMTU discovery mechanism probes the network path every 15 minutes. If a response is delayed due to resource starvation on congested hosts (for example, hosts operating at 90% to 117% CPU utilization), the system retries with a lower MTU size until a response is successfully received. The expected value is restored at the next successful check.
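
The step-down behavior described above can be illustrated with a small simulation. This is a simplified model and not the actual HCX implementation: the candidate sizes are taken from the example alerts, while the `timeout` value, the delay figures, and the function name `discover_mtu` are assumptions made for illustration.

```python
def discover_mtu(candidates, response_delay, timeout):
    """Return the first candidate size whose probe is answered within `timeout`.

    candidates:     MTU sizes to try, largest first.
    response_delay: callable(attempt_index, size) -> seconds until the reply arrives.
    timeout:        seconds the prober waits before stepping down (hypothetical value).
    """
    for attempt, size in enumerate(candidates):
        if response_delay(attempt, size) <= timeout:
            return size
    return None  # no probe was answered in time


# Sizes from the example alerts; the path supports 1350 in both scenarios below.
CANDIDATES = [1350, 1343, 923]

# Healthy appliance: every probe is answered quickly.
healthy = discover_mtu(CANDIDATES, lambda attempt, size: 0.1, timeout=1.0)

# CPU-starved appliance: the first reply is late, so the prober steps down
# even though the network path itself could carry the larger packet.
starved = discover_mtu(
    CANDIDATES,
    lambda attempt, size: 2.5 if attempt == 0 else 0.2,
    timeout=1.0,
)

print(healthy, starved)
```

In this model the discovered value depends only on how quickly the appliance answers, which is why a path that supports the expected MTU can still be reported at a lower value while the hosts are under CPU contention.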

Resolution

  • Investigate and remediate the severe CPU resource contention on the source or destination cluster that hosts the IX appliances.
  • Restoring adequate CPU resources ensures the HCX IX appliance has the compute capacity needed to promptly process and respond to the 15-minute PMTU discovery probes. This stabilizes the discovered MTU at its expected byte size and halts the recurring mismatch alerts.

Additional Information

A manual live PMTU check can confirm whether the underlying network path can carry the expected MTU. If manual checks consistently return the expected MTU, this indicates that the physical network is not the bottleneck and that the issue stems from resource starvation delaying the appliance's responses to the automated probes.
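
This article does not name the tool used for the manual check. As a generic illustration only, and not the HCX-native check, the sketch below builds a Linux `ping` command that sends packets of a given total IPv4 size with fragmentation prohibited (`-M do`). The helper names and the target address `192.0.2.10` (a documentation-range address) are hypothetical; the command is only constructed here, not executed.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Return the ICMP payload size that yields an IPv4 packet of exactly `mtu` bytes."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU must exceed {overhead} bytes, got {mtu}")
    return mtu - overhead


def build_ping_command(target, mtu, count=3):
    """Build a Linux iputils ping command with the Don't Fragment behavior enforced."""
    return [
        "ping",
        "-M", "do",            # prohibit fragmentation
        "-c", str(count),      # number of echo requests
        "-s", str(icmp_payload_for_mtu(mtu)),
        target,
    ]


# Hypothetical peer address; replace with a reachable underlay endpoint.
command = build_ping_command("192.0.2.10", 1350)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` on a Linux host that shares the underlay path. Replies at the expected size suggest the path can carry that MTU, although a host-level ping does not necessarily traverse the same path or encapsulation as the HCX uplink traffic.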