"/api/v1/messaging/cluster-connection/status": { { "address": "ssl://##.##.##.##:1236", "conn_status": "Disconnected", "node_id": "9d####d3-####-433f-####-0####0####3c", "node_type": "APPLIANCE_PROXY_HUB" },{ "address": "ssl://##.##.##.##:1236", "conn_status": "Disconnected", "node_id": "dc####0c-####-4e83-####-3####4####15", "node_type": "APPLIANCE_PROXY_HUB" },{ "address": "ssl://##.##.##.##:1236", "conn_status": "Disconnected", "node_id": "b4####56-####-472a-####-f####1####a3", "node_type": "APPLIANCE_PROXY_HUB" },In /var/log/vmware/appl-proxy-rpc.log we see the following snippets:
YYYY-MM-DDTHH:MM:SS.MSZ #####n1.corp.####.org NSX 2469837 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="2469841" level="WARNING"] StreamConnection[5513751 Connecting to ssl://10.##.##.22:1236 sid:5513751] Couldn't connect to 'ssl://10.##.##.22:1236' (error: 335544539-short read)
YYYY-MM-DDTHH:MM:SS.MSZ ######n1.corp.####.org NSX 2469837 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="2469841" level="INFO"] StreamSocket[5513749 Open f:46 i:-237289773 ? -> ssl://10.##.##.23:1236] on_connect 335544539-short read
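To confirm the same symptom on a given node, the log can be scanned for the "short read" connection failure shown above. The following is a minimal sketch only: the function name short_read_peers and the regular expression are illustrative assumptions based on the WARNING line quoted in this article, not an NSX tool.

```python
import re

# Pattern modelled on the WARNING line quoted above (an assumption about
# the log format; adjust it if your log lines differ).
SHORT_READ = re.compile(
    r"Couldn't connect to '(?P<peer>ssl://[^']+)' \(error: (?P<code>\d+)-short read\)"
)


def short_read_peers(lines):
    """Return the unique peer addresses that failed with a short read, in order."""
    peers = []
    for line in lines:
        match = SHORT_READ.search(line)
        if match and match.group("peer") not in peers:
            peers.append(match.group("peer"))
    return peers
```

It could be used as, for example, short_read_peers(open("/var/log/vmware/appl-proxy-rpc.log")) on a manager node.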
VMware NSX
The issue occurs when certificates are imported with extra characters between the end of one certificate and the beginning of the next certificate. This typically results in improperly formatted certificate chains, which leads to parsing errors.
BEGIN CERTIFICATE-----\n<redacted>-----END CERTIFICATE-----\n"#012 }#012}#012conn_cfg {#012 uuid {#012 left: ######17936196579#012 right: #######203377324339#012 }#012 node_type: APPLIANCE_PROXY_HUB#012 address {#012 addr {#012 ip_addresses {#012 ipv4: #####8705#012 }#012 }#012 port: 1236#012 }#012 certificate {#012 certificate: "-----BEGIN CERTIFICATE-----
This formatting issue prevents proper recognition and validation of certificates by the system, causing connection failures and errors.
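Before importing a certificate chain, it can be checked for stray text between certificate blocks. The following is a minimal sketch, not an NSX utility: the function name find_stray_text is hypothetical, and it assumes that plain whitespace between blocks is harmless and that only other characters need to be reported.

```python
import re

# One PEM certificate block: header line, base64 body, footer.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\r?\n[A-Za-z0-9+/=\r\n]+?-----END CERTIFICATE-----"
)


def find_stray_text(chain: str) -> list:
    """Return non-whitespace text found outside the PEM certificate blocks."""
    stray = []
    pos = 0
    for match in PEM_BLOCK.finditer(chain):
        gap = chain[pos:match.start()].strip()
        if gap:
            stray.append(gap)  # text before or between certificates
        pos = match.end()
    tail = chain[pos:].strip()
    if tail:
        stray.append(tail)  # text after the last certificate
    return stray
```

An empty list means nothing other than whitespace was found around the certificate blocks.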
This issue is fixed in NSX 4.2.0, which removes such extra characters from imported certificates.
Workaround:
1. Run the command below on any of the Local Manager nodes. It generates a certificate and private key for replacing the APH-AR certificates:
openssl req -new -newkey rsa:2048 -days 3650 -nodes -x509 -keyout /tmp/test-key1.pem -out /tmp/test-cert1.pem -config /etc/vmware/nsx-appl-proxy/openssl-appl-proxy.cnf
2. Log in to the manager UI and go to System > Certificates > Import > Certificate.
a. Name the certificate
b. Disable the Service Certificate toggle
c. Copy the contents of test-cert1.pem into "Certificate Contents"
d. Copy the contents of test-key1.pem into "Private key"
e. Click Save
3. Obtain the certificate ID of the newly imported certificate from the UI. The ID field contains the UUID.
4. Run the API below with the certificate ID and the node ID of one of the LM nodes:
POST https://<nsx-mgr>/api/v1/trust-management/certificates/<cert-id>?action=apply_certificate&service_type=APH&node_id=<node-id>
5. After the replacement, validate in the UI. Ensure that the "Where Used" field shows "1" for the newly imported certificate.
6. Repeat steps 1-5 for the remaining two LM nodes.
7. Re-onboard the site using the API below:
POST https://<Active GM node IP>/api/v1/sites?action=onboard_site
{
  "address": "<LM node IP>",
  "username": "admin",
  "password": "<password>",
  "thumbprint": "<LM node thumbprint>",
  "site_name": "<site name>"
}
8. Ensure that the LM Sync status is now "Connected."
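The two API calls in steps 4 and 7 above can also be prepared from a script. The following is a rough sketch using only the Python standard library; the function names are hypothetical, and it only builds the requests. Authentication and TLS verification are deliberately left out because this article does not specify them, so the caller must add whatever the environment requires before sending.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def apply_certificate_request(nsx_mgr: str, cert_id: str, node_id: str) -> Request:
    """Build the step 4 request that applies a certificate to the APH service."""
    query = urlencode(
        {"action": "apply_certificate", "service_type": "APH", "node_id": node_id}
    )
    url = f"https://{nsx_mgr}/api/v1/trust-management/certificates/{cert_id}?{query}"
    return Request(url, method="POST")


def onboard_site_request(gm_ip: str, lm_ip: str, password: str,
                         thumbprint: str, site_name: str,
                         username: str = "admin") -> Request:
    """Build the step 7 request that re-onboards the site from the active GM."""
    body = {
        "address": lm_ip,
        "username": username,
        "password": password,
        "thumbprint": thumbprint,
        "site_name": site_name,
    }
    return Request(
        f"https://{gm_ip}/api/v1/sites?action=onboard_site",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Once credentials have been added, each request can be sent with urllib.request.urlopen, once per LM node for step 4 and once for step 7.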