HCX - Site Pairing Connectivity Diagnostics

Article ID: 321629


Products

VMware HCX
VMware Cloud on AWS

Issue/Introduction

This article provides troubleshooting steps to isolate common infrastructure issues that may impact HCX Site Pairing communication between the HCX Connector and the HCX Cloud Manager.
The same steps apply to HCX Cloud-to-Cloud Site Pairings, although the Cloud Provider may restrict access.

HCX Site Pairing is not established after initial configuration, or goes down unexpectedly after being in service.

SocketTimeoutException Read timed out

The following error is returned while configuring Site Pairing:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>502 Proxy Error</title> </head><body> <h1>Proxy Error</h1> <p>The proxy server received an invalid response from an upstream server.<br /> The proxy server could not handle the request<p>Reason: <strong>Error reading from remote server</strong></p></p> </body></html>

Environment

HCX

Cause

Site Pairing connectivity between HCX Managers depends entirely on the underlying network infrastructure, so problems with basic routing, firewall configuration, or proxy settings can disrupt that communication.

Resolution

When Site Pairing is down or cannot be established for the first time, check the following:

  • Date and time must be in sync on both HCX Managers, taking into account the time zone configured on each system.
  • Verify NTP configuration on both HCX Managers.
  • Verify DNS resolution is working as expected.
  • Ensure the right vCenter credentials are used for the respective remote HCX Cloud Manager.
  • Site Pairing to an HCX Cloud Manager in a VMware Cloud on AWS SDDC must use "cloudadmin" credentials.
  • The HCX Cloud Manager local "admin" account can NOT be used for Site Pairing authentication.
  • SSH into the HCX Connector or Cloud Manager and test connectivity to the remote HCX Cloud Manager's URL over TCP 443. A successful test shows a completed TLS handshake followed by an HTTP response:
curl -k -v https://<HCX_Manager_FQDN>

*   Trying #.#.#.#...
* TCP_NODELAY set
* Connected to <HCX_Manager_FQDN> (#.#.#.#) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=Palo Alto; O=VMware, Inc; OU=Hybridity; CN=<HCX_Manager_FQDN>
*  start date: Jun 18 05:42:30 2021 GMT
*  expire date: Jun 18 05:42:27 2022 GMT
*  issuer: C=US; O=Entrust, Inc.; OU=See <Website link>/legal-terms; OU=(c) 2012 Entrust, Inc. - for authorized use only; CN=Entrust Certification Authority - L1K
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: <HCX_Manager_FQDN>
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 302 
< Date: Wed, 20 Apr 2022 18:13:26 GMT
< Server: Apache
< Location: https://<HCX_Manager_FQDN>/hybridity/ui/hcx-client/index.html
< Content-Length: 0
< Content-Security-Policy: style-src 'self' 'unsafe-inline'; font-src 'self' data:; img-src 'self' data:
< 
* Connection #0 to host <HCX_Manager_FQDN> left intact
* Closing connection 0
  • If the above test fails, ensure TCP 443 communication is allowed between the HCX Managers:
    • Check Firewall and/or DFW configuration.
    • Check Proxy settings on the source side.
  • If a Proxy is used, only Basic Authentication is supported. Kerberos and other authentication types are not supported.
  • If the HCX UI shows an error message about an untrusted connection, add the remote HCX Manager certificate or the Proxy server certificate to the source HCX Connector or Cloud Manager via the HCX AUI. Refer to Managing CA and Self-Signed Certificates.

 IMPORTANT: By default, when a Proxy server is configured, the Connector or Cloud Manager uses it for all HTTPS connections (including communication to the local vCenter Server, ESXi, NSX, and HCX IX and NE appliances over the Management Network). The Exclusion list must therefore include the entries required to allow direct access to local network resources. Also, the app and web engines must be restarted for any change to the Proxy configuration to take effect.
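
To separate a proxy problem from a routing or firewall problem, the same curl test can be repeated explicitly through the proxy with Basic Authentication. The following is a minimal sketch, not an official procedure: the function names and the HCX_REMOTE_FQDN, HCX_PROXY_URL, and HCX_PROXY_CREDS variables are hypothetical, and it assumes a curl build that supports the `%{http_connect}` write-out variable. The credentials are passed on the command line, so use a test account where possible.

```shell
#!/bin/sh
# Sketch only: helper and variable names are illustrative, not part of HCX.

# Translate the proxy's answer to curl's CONNECT request into a short verdict.
classify_connect_code() {
    case "$1" in
        200) echo "tunnel-established" ;;   # proxy accepted the CONNECT
        407) echo "proxy-auth-required" ;;  # credentials missing or rejected
        000) echo "no-connect-response" ;;  # proxy unreachable or not used
        *)   echo "proxy-refused" ;;        # any other status from the proxy
    esac
}

# Run the HTTPS test through the proxy, forcing Basic Authentication.
# $1 = remote HCX Manager FQDN, $2 = proxy URL, $3 = user:password
check_via_proxy() {
    code=$(curl -k -s -o /dev/null --connect-timeout 10 \
        -x "$2" --proxy-basic -U "$3" \
        -w '%{http_connect}' "https://$1")
    echo "Proxy CONNECT: $(classify_connect_code "$code") (code $code)"
}

# Only runs when all three (hypothetical) variables are exported.
if [ -n "${HCX_REMOTE_FQDN:-}" ] && [ -n "${HCX_PROXY_URL:-}" ] && [ -n "${HCX_PROXY_CREDS:-}" ]; then
    check_via_proxy "$HCX_REMOTE_FQDN" "$HCX_PROXY_URL" "$HCX_PROXY_CREDS"
fi
```

A "proxy-auth-required" verdict would be consistent with a proxy that expects an authentication type other than Basic.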

  • Ensure there is no asymmetric routing by running "traceroute" from the HCX Connector or Cloud Manager via SSH. This command requires "root" access.
IMPORTANT: SSH access to the HCX Cloud Manager in an SDDC may be restricted by the Cloud Provider. In those cases, this test can only be run from the HCX Connector.
# su - root
# traceroute <HCX_Cloud_Manager_IP>
  • Try re-registering the Site Pairing using the HCX UI.
  • Try re-creating the Site Pairing using the HCX UI.
Refer to the HCX User Guide for Site Pairing configuration details.
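
The HTTPS and time checks above can be combined into one pass. The following is a minimal sketch under stated assumptions: the function names and the HCX_REMOTE_FQDN variable are hypothetical, GNU date is assumed for parsing the HTTP Date header, and the skew is only a rough indicator because the Date header may come from an intermediate proxy rather than from the remote HCX Manager.

```shell
#!/bin/sh
# Sketch only: helper and variable names are illustrative, not part of HCX.

# Map the HTTP status code reported by curl to a short verdict.
classify_http_code() {
    case "$1" in
        000)     echo "no-response" ;;  # TCP or TLS never completed
        2??|3??) echo "reachable" ;;    # e.g. the 302 redirect to the HCX UI
        502)     echo "proxy-error" ;;  # matches the "502 Proxy Error" symptom
        *)       echo "unexpected" ;;
    esac
}

# Absolute difference between two epoch timestamps, in seconds.
clock_skew_seconds() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Query the remote manager once; report reachability and approximate skew.
check_remote_manager() {
    # -k skips certificate validation, -D - prints the response headers.
    out=$(curl -k -s -o /dev/null -D - --connect-timeout 10 \
        -w 'HTTP_CODE=%{http_code}\n' "https://$1")
    code=$(printf '%s\n' "$out" | sed -n 's/^HTTP_CODE=//p')
    remote_date=$(printf '%s\n' "$out" | tr -d '\r' | sed -n 's/^[Dd]ate: //p' | head -n 1)
    echo "HTTPS check: $(classify_http_code "$code") (HTTP $code)"
    if [ -n "$remote_date" ]; then
        remote_epoch=$(date -u -d "$remote_date" +%s)  # GNU date assumed
        echo "Clock skew: $(clock_skew_seconds "$(date -u +%s)" "$remote_epoch") s"
    fi
}

# Only runs when the (hypothetical) variable is exported.
if [ -n "${HCX_REMOTE_FQDN:-}" ]; then
    check_remote_manager "$HCX_REMOTE_FQDN"
fi
```

A "no-response" verdict points toward routing, firewall, or DNS, while "proxy-error" points toward the proxy path described in the Issue section.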
 

Capture packets on both the HCX Connector and the HCX Cloud Manager while configuring the Site Pairing, then analyze the captures with a tool such as Wireshark to identify issues such as an MTU mismatch.

From the HCX Connector:

tcpdump -n -i eth0 host <HCX cloud IP> and host <HCX Connector IP> -w /tmp/HCX_connector_pkt_capture.pcap


From the HCX Cloud Manager:

tcpdump -n -i eth0 host <HCX cloud IP> and host <HCX Connector IP> -w /tmp/HCX_cloud_pkt_capture.pcap
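
If the captures suggest an MTU problem, the path can also be probed directly with non-fragmentable pings of decreasing size. This is a minimal sketch, assuming the iputils version of ping (for the `-M do` option) and that ICMP is permitted between the sites, which may not be the case. The function names, the HCX_REMOTE_IP variable, and the list of candidate MTU values are illustrative.

```shell
#!/bin/sh
# Sketch only: helper and variable names are illustrative, not part of HCX.

# ICMP payload that fills a given MTU exactly:
# MTU minus the 20-byte IPv4 header and the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Try a few candidate MTUs, largest first, with the Don't Fragment bit set.
probe_path_mtu() {
    for mtu in 1500 1400 1300 1200; do
        size=$(icmp_payload_for_mtu "$mtu")
        if ping -M do -c 2 -W 2 -s "$size" "$1" >/dev/null 2>&1; then
            echo "MTU $mtu passes unfragmented"
            return 0
        fi
        echo "MTU $mtu fails"
    done
    return 1
}

# Only runs when the (hypothetical) variable is exported.
if [ -n "${HCX_REMOTE_IP:-}" ]; then
    probe_path_mtu "$HCX_REMOTE_IP"
fi
```

If every size fails, ICMP is probably filtered and the result says nothing about the MTU.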


Note:
If there are no underlying issues, site pairing via the API should successfully re-establish the connection. However, it has been observed that stale entries left in the 'RemotingOutbox' collection while the Site Pairing was down can prevent site pairing via the API from helping. In such cases, it is necessary to check the status of the 'RemotingOutbox' collection in the HCX database and clear any outdated entries. Contact Broadcom Support for more information on this procedure.

Workaround:
There is no workaround to have full HCX services available without site pairing connectivity between data center sites.

Additional Information

HCX site pairing fails with error "NumberFormatException" (80210)
HCX - Site pairing disconnected with "Error queuing Job: Workflow ReplicationTransferJob" (81978)
HCX - Resync service mesh: "Error in communicating with remote side to find NSX types" (328952)


Impact/Risks:
If the Site Pairing is down, configuration workflows will fail and no migrations can be scheduled from the HCX Connector or the source Cloud Manager.
Existing Network Extension services remain active indefinitely, but no configuration changes can be made to them except "unstretch", which can be forced from the target HCX Cloud Manager's side.