This article provides troubleshooting steps to isolate common infrastructure issues that may impact HCX Site Pairing communication between the HCX Connector and Cloud Manager appliances.
The same steps apply to HCX Cloud-to-Cloud Site Pairings, although the Cloud Provider may impose access restrictions.
SocketTimeoutException Read timed out
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>502 Proxy Error</title> </head><body> <h1>Proxy Error</h1> <p>The proxy server received an invalid response from an upstream server.<br /> The proxy server could not handle the request<p>Reason: <strong>Error reading from remote server</strong></p></p> </body></html
VMware HCX
Site Pairing connectivity between HCX Managers depends entirely on the underlying network infrastructure, so problems with basic routing, firewall configuration, or proxy settings can disrupt that communication.
When Site Pairing is down or cannot be established for the first time, check the following:
curl -k -v https://<HCX_Manager_FQDN>
* Trying #.#.#.#...
* TCP_NODELAY set
* Connected to <HCX_Manager_FQDN> (#.#.#.#) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: C=US; ST=California; L=Palo Alto; O=VMware, Inc; OU=Hybridity; CN=<HCX_Manager_FQDN>
* start date: Jun 18 05:42:30 2021 GMT
* expire date: Jun 18 05:42:27 2022 GMT
* issuer: C=US; O=Entrust, Inc.; OU=See <Website link>/legal-terms; OU=(c) 2012 Entrust, Inc. - for authorized use only; CN=Entrust Certification Authority - L1K
* SSL certificate verify ok.
> GET / HTTP/1.1
> Host: <HCX_Manager_FQDN>
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 302
< Date: Wed, 20 Apr 2022 18:13:26 GMT
< Server: Apache
< Location: https://<HCX_Manager_FQDN>/hybridity/ui/hcx-client/index.html
< Content-Length: 0
< Content-Security-Policy: style-src 'self' 'unsafe-inline'; font-src 'self' data:; img-src 'self' data:
<
* Connection #0 to host <HCX_Manager_FQDN> left intact
* Closing connection 0
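The HTTPS check above can be reduced to a single status code when it needs to be repeated from several hosts. The following is a minimal sketch, not an official HCX tool. The function names `hcx_https_status` and `classify_http_status` are hypothetical, and the interpretations rely only on the 302 redirect shown in the sample output and the 502 Proxy Error listed among the symptoms.

```shell
# Hypothetical helpers; names and messages are illustrative only.

# Print the HTTP status code returned by an HCX Manager, or 000 when
# curl received no HTTP response at all (curl's value for %{http_code}).
hcx_https_status() {
    # -k skips certificate validation, as in the manual check above.
    curl -k -s -o /dev/null --connect-timeout 10 --max-time 30 \
         -w '%{http_code}' "https://$1"
}

# Map a status code to a short hint about where to look next.
classify_http_status() {
    case "$1" in
        000)     echo "no HTTP response: check DNS, routing, and firewall rules for TCP 443" ;;
        2??|3??) echo "HCX Manager reachable over HTTPS" ;;
        502)     echo "proxy error: review proxy settings and exclusion list" ;;
        *)       echo "unexpected HTTP status $1" ;;
    esac
}
```

A possible invocation is `classify_http_status "$(hcx_https_status <HCX_Manager_FQDN>)"`. The timeouts of 10 and 30 seconds are arbitrary choices for the sketch.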
IMPORTANT: By default, when a Proxy server is configured, the Connector or Cloud Manager uses it for all HTTPS connections (including communication to the local vCenter Server, ESXi, NSX, and HCX IX and NE appliances over the Management Network). Therefore, the exclusion list must include every local network resource that requires direct access. In addition, the app and web engines must be restarted for any change to the proxy configuration to take effect.
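When reviewing the exclusion list, it can help to check each local resource against it mechanically. The sketch below is a hypothetical helper, `host_excluded`, that tests a hostname against a comma-separated list. The matching rules (exact match, or suffix match for entries written as `*.domain` or `.domain`) are assumptions made for illustration and are not taken from HCX. IP addresses and subnets are compared only as literal strings.

```shell
# Return 0 if host $1 is covered by the comma-separated list $2, else 1.
# Assumed rules (not HCX's documented behavior): exact match, or suffix
# match for entries of the form "*.example.com" or ".example.com".
host_excluded() {
    host=$1
    # A here-document feeds the loop so that "return" exits the function
    # rather than a pipeline subshell.
    while IFS= read -r entry; do
        case "$entry" in
            "")   continue ;;
            \*.*) suffix=${entry#\*} ;;   # "*.corp.local" -> ".corp.local"
            .*)   suffix=$entry ;;
            *)    [ "$host" = "$entry" ] && return 0
                  continue ;;
        esac
        case "$host" in
            *"$suffix") return 0 ;;
        esac
    done <<EOF
$(printf '%s' "$2" | tr -d ' ' | tr ',' '\n')
EOF
    return 1
}
```

For example, `host_excluded <vCenter_FQDN> "<exclusion_list>" || echo "not in exclusion list"` flags a local resource that would still be sent through the proxy under these assumed rules.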
Run "traceroute" from the HCX Connector or Cloud Manager via SSH. This command requires "root" access for execution.
# su - root
# traceroute <HCX_Cloud_Manager_IP>
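On long paths it can be easier to have the traceroute output summarized. The sketch below defines a hypothetical helper, `first_silent_hop`, that reads traceroute output on standard input and prints the number of the first hop where all three default probes timed out. A silent hop does not prove a fault, because some routers simply do not answer probes, so the result is a starting point for investigation.

```shell
# Read traceroute output on stdin and print the number of the first hop
# whose probes all timed out ("* * *"). Prints nothing if every hop replied.
# Assumes the default of three probes per hop.
first_silent_hop() {
    awk '$1 ~ /^[0-9]+$/ && $2 == "*" && $3 == "*" && $4 == "*" { print $1; exit }'
}
```

A possible invocation, run as root like the command above, is `traceroute <HCX_Cloud_Manager_IP> | first_silent_hop`.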
Capture packets on both the HCX Connector and the HCX Cloud Manager while attempting the site pairing, then analyze the captures with a tool such as Wireshark to look for issues such as MTU mismatches.
From the HCX connector appliance:
tcpdump -n -i eth0 -w /tmp/HCX_connector_pkt_capture.pcap host <HCX cloud IP> and host <HCX Connector IP>
From the HCX Cloud appliance:
tcpdump -n -i eth0 -w /tmp/HCX_cloud_pkt_capture.pcap host <HCX cloud IP> and host <HCX Connector IP>
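Since MTU is one of the issues the captures are meant to reveal, a quick probe with the Don't Fragment bit set can complement them. The sketch below assumes IPv4, a Linux `ping` from iputils (which provides `-M do`), and that ICMP echo is permitted between the sites. The function names are hypothetical. The payload size is the MTU minus 28 bytes: 20 for the IPv4 header without options and 8 for the ICMP header.

```shell
# Largest ICMP echo payload that fits in one IPv4 packet of the given MTU:
# MTU - 20 bytes (IPv4 header, no options) - 8 bytes (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Send one echo request to host $1 sized for MTU $2 with fragmentation
# prohibited (-M do). A non-zero exit status means no reply was received.
probe_path_mtu() {
    ping -M do -c 1 -W 2 -s "$(icmp_payload_for_mtu "$2")" "$1"
}
```

For example, `probe_path_mtu <HCX cloud IP> 1500` tests a standard Ethernet MTU. A failure is only meaningful if a smaller probe to the same address succeeds, since a host that never answers ICMP fails at every size.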
Note:
If there are no underlying issues, site pairing via the API should successfully re-establish the connection. However, while the site pairing is down, stale entries can build up in the 'RemotingOutbox' collection, and in that case site pairing via the API may not help. It is then necessary to check the status of 'RemotingOutbox' in the HCX database and clear any outdated entries. Contact Broadcom Support for more information.
Workaround:
There is no workaround to have full HCX services available without site pairing connectivity between data center sites.
HCX Site pairing fails with error "NumberFormatException" (80210)
Resync HCX service mesh: "Error in communicating with remote side to find NSX types" (328952)
Impact/Risks:
If the Site Pairing is down, configuration workflows will fail and no migrations can be scheduled from the HCX Connector or the source Cloud Manager.
Existing Network Extension services will remain active indefinitely, but no configuration changes can be made to them except "unstretch", which can be forced from the target HCX Cloud Manager.