1. A new Edge Transport Node is deployed from NSX Manager, which in turn creates an Edge VM in vCenter.
2. When the new Edge Transport Node tries to register, it fails with a "Registration Timeout" error in the NSX UI, even though communication between the Edge and the NSX Managers is working (ports 1234, 1235, and 443).
3. Even when the Edge is joined manually to the NSX Manager (join management-plane <Manager-IP> thumbprint <Manager-thumbprint> username admin), the 'Registration Timeout' error still appears in the UI.
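The port check mentioned in the symptoms above can be sketched as follows. This is a minimal illustration, not a VMware tool: the function name `reachable_ports` and the 3-second timeout are assumptions, and a successful TCP connect only shows that the port is open. It does not validate the TLS certificate, which is the actual fault described in this article.

```python
import socket

# Ports named in this article for Edge-to-Manager communication.
NSX_PORTS = (1234, 1235, 443)


def reachable_ports(host, ports=NSX_PORTS, timeout=3.0):
    """Return {port: True/False} for a plain TCP connect to each port.

    Hypothetical helper: it proves only that something is listening,
    not that the certificate presented on the port is valid.
    """
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results
```

If every port reports `True` and registration still times out, that matches the symptom described here and points toward the certificates rather than the network.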
VMware NSX
VMware NSX-T Data Center
- This is caused by expired or expiring (within 30 days) certificates on the NSX Managers, specifically the API certificate (Tomcat cert) and the VIP (cluster) certificate.
- In the NSX manager logs:
/var/log/syslog.log
We can see errors: "Certificate Validation failed"
and also: "Accept on endpoint 'ssl://0.0.0.0:1234 failed with error 167772294 certificate verify failed (SSL routines) from remote endpoint 'ssl tcp://x.x.x.x:45926"
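To look for these two messages in a copy of the log, a small filter like the one below may help. It is a sketch only: the function name `find_certificate_errors` is hypothetical, and the marker strings are taken from the error text quoted above.

```python
# Substrings taken from the two error messages quoted in this article.
ERROR_MARKERS = (
    "Certificate Validation failed",
    "certificate verify failed",
)


def find_certificate_errors(lines):
    """Return the log lines that contain either certificate error marker."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in ERROR_MARKERS)
    ]


# Example usage against the log path named in this article:
#     with open("/var/log/syslog.log", errors="replace") as log:
#         for hit in find_certificate_errors(log):
#             print(hit)
```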
To resolve this issue, replace any certificates that have expired or will expire within 30 days.
Use the following APIs to replace the API (Tomcat) and management cluster certificates:
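The 30-day rule can be expressed as a small check. This is a sketch under stated assumptions: the function name `classify_certificate` and its return labels are illustrative, and the expiry timestamp is assumed to have been read from the certificate beforehand (for example from the certificate details in the UI).

```python
from datetime import datetime, timedelta, timezone

# Replacement window stated in this article.
WARNING_WINDOW = timedelta(days=30)


def classify_certificate(not_after, now=None):
    """Classify a certificate by its expiry time (timezone-aware datetimes).

    Returns "expired", "expiring" (30 days or less remaining), or "ok".
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if not_after <= now:
        return "expired"
    if not_after - now <= WARNING_WINDOW:
        return "expiring"
    return "ok"
```

Certificates classified as "expired" or "expiring" are the ones this article says to replace.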
--> Cluster API (mp-cluster) certificate is one cert for the whole NSX cluster
--> API (tomcat) certificate is one per manager node
To replace API certificate:
1. Create a self-signed certificate: https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-9BBF8A54-DFBD-4B24-B7A1-492CB42DD0D5.html
2. Validate the certificate: GET https://<nsx-mgr>/api/v1/trust-management/certificates/<cert-id>?action=validate
3. To replace the certificate of manager node (tomcat) use the following API call: POST /api/v1/trust-management/certificates/<cert-id>?action=apply_certificate&service_type=API&node_id=<node-id>
(Repeat the three steps above for each of the other two Manager nodes.)
To replace Cluster API certificate:
1. Create a self-signed certificate: https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-9BBF8A54-DFBD-4B24-B7A1-492CB42DD0D5.html
2. Validate the certificate: GET https://<nsx-mgr>/api/v1/trust-management/certificates/<cert-id>?action=validate
3. To replace the cluster (mp-cluster) certificate, use the following API call: POST /api/v1/trust-management/certificates/<cert-id>?action=apply_certificate&service_type=MGMT_CLUSTER
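When scripting the calls above, it can help to build the request targets in one place so the per-node and cluster variants stay consistent. The sketch below only constructs the strings and does not send any request. The helper names `validate_url` and `apply_path` are hypothetical; the paths and query parameters are copied from the steps above.

```python
from urllib.parse import urlencode

# Base path used by both API calls quoted in this article.
BASE_PATH = "/api/v1/trust-management/certificates"


def validate_url(manager, cert_id):
    """URL for the GET request that validates a certificate."""
    return f"https://{manager}{BASE_PATH}/{cert_id}?action=validate"


def apply_path(cert_id, service_type, node_id=None):
    """Path for the POST request that applies a certificate.

    Use service_type="API" with a node_id for a single Manager node,
    or service_type="MGMT_CLUSTER" without a node_id for the cluster.
    """
    params = {"action": "apply_certificate", "service_type": service_type}
    if node_id is not None:
        params["node_id"] = node_id
    return f"{BASE_PATH}/{cert_id}?{urlencode(params)}"
```

For a three-node cluster this yields three `apply_path(cert_id, "API", node_id)` calls, one per Manager node and each with that node's own certificate, plus a single `apply_path(cert_id, "MGMT_CLUSTER")` call.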
From NSX 4.2.0 onwards, certificates can alternatively be replaced directly in the UI by selecting the certificate you want to replace and choosing the newly created certificate:
System -> Certificates -> Actions -> Replace certificates
1. Select the old certificate
2. Click on Replace certificates
3. Select the newly created certificate from the drop-down
4. Click Save. This replaces the old certificate with the new one.
Once the expired or expiring (within 30 days) certificates have been replaced, a new Edge Transport Node can be deployed. The deployment should complete successfully and show the 'Success' state.
Reference: Types of Certificates
API (previously known as tomcat) | This is an API certificate used for external communication with individual NSX Manager nodes through UI or API.
Cluster (previously known as mp-cluster) | This is an API certificate used for external communication with the NSX Manager cluster using the cluster VIP, through UI or API.