After running CARR 1.21, connectivity to the edges and hosts is lost

Products

VMware NSX

Issue/Introduction

After running CARR 1.21, connectivity to the transport nodes is lost. Errors similar to the following may appear for the hosts and edges in the NSX UI:

HOST

EDGE

Environment

VMware NSX

Cause

When the CARR 1.21 script was run in fix mode, it was unable to properly replace the NSX Manager APH_TN certificates. As a result, the NSX transport nodes could no longer connect to the Managers.

Resolution

A fix will be added in a newer release of CARR.

Workaround:

Run the CARR script to fix the certificates on the NSX Manager:

Using Certificate Analyzer, Results and Recovery (CARR) Script to fix certificate related issues in NSX

If the hosts still show as disconnected after the certificates on the NSX Manager have been renewed, follow these steps:

Open an SSH (PuTTY) session to an NSX Manager as root and run the following command:

get certificate api thumbprint

Once you have the Manager's thumbprint, open an SSH (PuTTY) session to the impacted ESXi host and run the following command:

nsxcli

From the nsxcli prompt, gather the following information for the next commands:

Manager FQDN
Manager API thumbprint (gathered in the step above with get certificate api thumbprint)
User (this should be admin)
Password

Commands to run on the impacted host:

push host-certificate <manager-IP-FQDN> username <username> thumbprint <cert-api-thumbprint-of-manager> password <password>
sync-aph-certificates <manager-IP-FQDN> username <username> thumbprint <cert-api-thumbprint-of-manager> password <password>
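
Because both commands take the same four values, it can help to assemble them once and paste the result into nsxcli, which avoids retyping a long thumbprint. The following is a minimal sketch, not part of CARR or NSX: the function name build_nsxcli_commands is hypothetical, and the check that the thumbprint is 64 hexadecimal characters assumes the API thumbprint is a SHA-256 digest. Relax that check if your output differs.

```python
import re

# Assumption: the API thumbprint is a SHA-256 digest shown as 64 hex characters.
THUMBPRINT_RE = re.compile(r"[0-9a-fA-F]{64}")


def build_nsxcli_commands(manager, username, thumbprint, password):
    """Return the two nsxcli command strings from this article, filled in.

    This helper (a hypothetical name) only formats text; it does not
    connect to the NSX Manager or to the transport node.
    """
    thumbprint = thumbprint.strip()
    if not THUMBPRINT_RE.fullmatch(thumbprint):
        raise ValueError("thumbprint does not look like 64 hex characters")
    for name, value in (("manager", manager), ("username", username), ("password", password)):
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{name} must be non-empty and contain no whitespace")
    args = f"{manager} username {username} thumbprint {thumbprint} password {password}"
    return [
        f"push host-certificate {args}",
        f"sync-aph-certificates {args}",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for line in build_nsxcli_commands("nsxmgr.example.com", "admin", "ab" * 32, "<password>"):
        print(line)
```

The generated lines contain the password in clear text, so avoid saving them to a file or shell history.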

Once both commands have been run, type exit to leave nsxcli, then run the following commands:

/etc/init.d/nsx-proxy restart
/etc/init.d/nsx-opsagent restart

After these steps, you may need to click Resolve in the NSX UI under the Transport Nodes status to clear the error.

For the Edge

Follow these steps:

Open an SSH (PuTTY) session to an NSX Manager as root and run the following command:
- get certificate api thumbprint
Once you have the Manager's thumbprint, open an SSH (PuTTY) session to the edge and log in as admin.
Gather the following information for the next commands:
- Manager FQDN
- Manager API thumbprint (gathered in the step above with get certificate api thumbprint)
- User (this should be admin)
- Password
Commands to run on the impacted edge:
- push host-certificate <manager-IP-FQDN> username <username> thumbprint <cert-api-thumbprint-of-manager> password <password>
- sync-aph-certificates <manager-IP-FQDN> username <username> thumbprint <cert-api-thumbprint-of-manager> password <password>
Once both commands have been run, type st en to switch to root, then run the following commands:
- /etc/init.d/nsx-proxy restart
- /etc/init.d/nsx-opsagent-appliance restart

Allow a couple of minutes for the edges to reconnect; they should then come back and show as green in the NSX UI.

Additional Information

Using Certificate Analyzer, Results and Recovery (CARR) Script to fix certificate related issues in NSX