VMware HCX-IX tunnels remain down after firewall maintenance

Article ID: 430832

Products

VMware HCX

Issue/Introduction

After a scheduled firewall maintenance window, the existing VMware HCX Interconnect (HCX-IX) tunnels to Azure VMware Solution (AVS) remain in a "Down" state within the HCX Manager UI.

  • Resyncing the service meshes or restarting the HCX services does not reconnect the existing tunnels.
  • Creating a new service mesh with the same Network and Compute Profiles succeeds, which indicates that the firewall rules themselves are not blocking HCX traffic.
  • Attempts to redeploy the existing service mesh fail with an error because there are Bulk Migrations in progress.

Environment

VMware HCX

Cause

This issue is caused by stale stateful sessions on the transit firewall. During the firewall maintenance or a similar interruption, the state table entries for the specific flows used by the HCX-IX and Network Extension (NE) appliances became hung or desynchronized. Because the firewall retains these stale session entries, it matches the continuous probe packets sent by the mesh appliances against them and drops or mishandles those packets, which prevents the persistent tunnel from re-establishing automatically. A new service mesh is unaffected because its appliances use flows that have no stale entry.
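The mechanism can be illustrated with a deliberately simplified toy model. This is only a sketch: the `ToyFirewall` class, the `"stale"` state label, the addresses, and the port number are all hypothetical placeholders and do not represent how any particular firewall stores or evaluates sessions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFirewall:
    """Toy stateful firewall: maps a flow key to a session state string."""
    sessions: dict = field(default_factory=dict)

    def handle(self, src, dst, port):
        """Return 'drop' if the flow matches a stale entry, else 'forward'."""
        key = (src, dst, port)
        if self.sessions.get(key) == "stale":
            # The packet matches a hung entry, so it never reaches the peer.
            return "drop"
        # No entry (or a healthy one): create or refresh the session.
        self.sessions[key] = "established"
        return "forward"

    def clear(self, ip):
        """Remove every session whose source or destination is `ip`."""
        for key in [k for k in self.sessions if ip in (k[0], k[1])]:
            del self.sessions[key]


fw = ToyFirewall()
flow = ("192.0.2.10", "198.51.100.20", 4500)  # placeholder addresses and port
fw.sessions[flow] = "stale"   # entry left behind by the maintenance window

before = fw.handle(*flow)     # probe packet hits the stale entry
fw.clear("192.0.2.10")        # administrator clears sessions for the appliance
after = fw.handle(*flow)      # next probe creates a fresh session

print(before, after)          # prints "drop forward"
```

In this model, retrying the probe never helps while the stale entry exists, which mirrors why resyncing the service mesh alone does not bring the tunnels back.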

Resolution

To resolve this issue and restore the existing tunnels without canceling active Bulk Migrations, you must clear the stale sessions from your firewall:

  1. Log in to your firewall's management interface.
  2. Identify the IP addresses of the HCX-IX and NE appliances associated with the affected Service Mesh.
  3. Review the firewall's active session table and locate any stateful sessions involving these appliance IP addresses.
  4. Force terminate (clear) those specific sessions on the firewall.
  5. Once the stale sessions are cleared, navigate back to the HCX Manager UI and resync the Service Mesh.

The mesh appliances will automatically send new probe packets, which establish fresh stateful sessions on the firewall and bring the tunnels back up. Any Bulk Migrations that were in progress will return to an active synchronization state.
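Steps 2 through 4 amount to filtering the session table by appliance address. The following is a minimal sketch, assuming the session table has been exported into a list of records with `id`, `src`, and `dst` fields. That record layout, the `sessions_to_clear` helper, and all addresses are hypothetical; the actual export format and the command used to clear a session depend on the firewall vendor.

```python
import ipaddress


def sessions_to_clear(sessions, appliance_ips):
    """Return the sessions whose source or destination is an HCX appliance IP.

    `sessions` is a list of dicts with 'id', 'src', and 'dst' keys
    (a hypothetical export format, not a vendor API).
    """
    wanted = {ipaddress.ip_address(ip) for ip in appliance_ips}
    return [
        s for s in sessions
        if ipaddress.ip_address(s["src"]) in wanted
        or ipaddress.ip_address(s["dst"]) in wanted
    ]


# Placeholder data from documentation address ranges.
appliance_ips = ["192.0.2.10", "192.0.2.11"]  # HCX-IX and NE appliance IPs
exported_sessions = [
    {"id": 101, "src": "192.0.2.10", "dst": "198.51.100.20"},
    {"id": 102, "src": "203.0.113.5", "dst": "198.51.100.99"},
    {"id": 103, "src": "198.51.100.21", "dst": "192.0.2.11"},
]

stale_candidates = sessions_to_clear(exported_sessions, appliance_ips)
print([s["id"] for s in stale_candidates])  # prints [101, 103]
```

Matching on both source and destination catches sessions created in either direction. The resulting session IDs would then be cleared with the firewall's own tooling before resyncing the Service Mesh.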