SSL Handshake Failure and Failover Failure due to VLAN Trunk Misconfiguration on VCDA Tunnel Appliance

Article ID: 439489

Products

  • VMware vCenter Server
  • VMware Cloud Director
  • VMware Live Recovery

Issue/Introduction

When pairing sites or during standard replication operations in VMware Cloud Director Availability (VCDA) 4.x, you observe the following symptoms:

  • Attempts to pair Site A and Site B fail with the error: "Generic error during the SSL Handshake".
  • In a multi-uplink environment (e.g., a vDS with vmnic1 and vmnic7), the Tunnel appliance loses all connectivity when one physical uplink is brought down, even though the Replicator appliance on the same host fails over correctly.
  • At the network layer, the MAC address of the Tunnel appliance may appear on a physical switch port that belongs to a different ESXi host.
  • The issue occurs only on specific ESXi hosts and disappears when the Tunnel VM is migrated to a different host.

Environment

  • VMware Cloud Director Availability 4.7.x
  • VMware ESXi 

Cause

This issue is caused by a misconfiguration at the physical network layer: the required replication VLANs are not consistently defined or allowed across all physical trunk ports (uplinks) connected to the ESXi host.

Specifically, if the replication VLAN is "pruned" or restricted on one physical uplink but not on the other, the VCDA Tunnel appliance may become "pinned" to a specific path. When that path fails, or when the vDS attempts to balance traffic onto the restricted uplink, MAC learning fails and the SSL handshake times out because the packets cannot reach the destination.
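The failure mode described above can be illustrated with a minimal Python sketch. It is a simplified model, not a representation of how the vDS selects uplinks. The VLAN IDs (100 and 200), the assumption that the Replicator and Tunnel appliances sit on different VLANs, and the function name `surviving_uplink` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Set


def surviving_uplink(vlan: int,
                     allowed: Dict[str, Set[int]],
                     down: Set[str]) -> Optional[str]:
    """Return the first uplink that is up and whose trunk allows `vlan`.

    Returns None when no remaining uplink can carry the VLAN, which is
    the situation in which the appliance loses all connectivity.
    """
    for uplink in sorted(allowed):
        if uplink not in down and vlan in allowed[uplink]:
            return uplink
    return None


# Hypothetical trunk configuration: VLAN 200 was pruned on vmnic7.
allowed_vlans = {
    "vmnic1": {100, 200},
    "vmnic7": {100},
}
REPLICATOR_VLAN = 100  # assumed VLAN of the Replicator port group
TUNNEL_VLAN = 200      # assumed VLAN of the Tunnel port group

# Simulate bringing vmnic1 down.
failed = {"vmnic1"}
print("Replicator fails over to:",
      surviving_uplink(REPLICATOR_VLAN, allowed_vlans, failed))
print("Tunnel fails over to:",
      surviving_uplink(TUNNEL_VLAN, allowed_vlans, failed))
```

Under these assumed values, the Replicator still has a usable path through vmnic7 while the Tunnel has none, which matches the symptom of one appliance failing over correctly and the other losing connectivity on the same host.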

Resolution

To resolve this issue, reconfigure the physical switch ports to ensure that all replication-related VLANs are fully allowed on all uplinks serving the ESXi host.

  1. Audit Physical Switch Configuration: Coordinate with the network team to inspect the physical switch ports connected to the affected ESXi host's uplinks.
  2. Verify VLAN Trunking: Ensure that all VLAN IDs used by the VCDA Portgroups are included in the "allowed" list for these ports.
  3. Expand VLAN Definitions: Change the VLAN definition from a restricted trunk (specific IDs only) to a full trunk, and confirm that all necessary replication VLANs are allowed and tagged on every participating physical switch port.
  4. Disable Switch Port Security: Verify that features such as Port Security, Sticky MAC, or MAC Address Learning Limits are not blocking the VM's MAC address from moving between physical ports.
  5. Retry Site Pairing: After resolving the networking issues, retry the site pairing task from VCDA:

    • For an environment with two Cloud Director sites, Pair or Re-pair the site.
    • For an environment with On-Premises sites, Re-pair the remote site.
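
The VLAN audit in steps 1 to 3 can be sketched as a short Python script that compares the VLAN IDs required by the VCDA port groups against the allowed list of each physical trunk port. This is only an illustrative sketch: the port names, VLAN IDs, and the comma-and-range syntax of the allowed lists (such as "100,200-210") are assumptions, and the allowed lists would have to be copied manually from the switch configuration supplied by the network team.

```python
from typing import Dict, Set


def parse_vlan_list(spec: str) -> Set[int]:
    """Expand an allowed-VLAN string such as "100,200-210" into a set."""
    vlans: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def missing_vlans(required: Set[int],
                  trunks: Dict[str, str]) -> Dict[str, Set[int]]:
    """Map each trunk port to the required VLANs its allowed list lacks."""
    report: Dict[str, Set[int]] = {}
    for port, spec in trunks.items():
        gap = required - parse_vlan_list(spec)
        if gap:
            report[port] = gap
    return report


# Hypothetical inputs: VLANs used by the VCDA port groups, and the
# allowed list of each switch port facing the ESXi host's uplinks.
required_vlans = {100, 200}
trunk_ports = {
    "switch1:Eth1/1 (vmnic1)": "100,200-210",
    "switch2:Eth1/1 (vmnic7)": "100-150",
}

for port, gap in missing_vlans(required_vlans, trunk_ports).items():
    print(f"{port} is missing VLANs: {sorted(gap)}")
```

An empty report means every required VLAN is allowed on every listed uplink. Any port that is reported should be corrected before the pairing task is retried.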