"Control Channel To Transport Node Down Long" or "Management Channel to Transport Node Down Long" alarms for removed Transport Nodes

Products

VMware NSX

Issue/Introduction

When accessing the NSX UI, you may observe multiple alarms labeled "Control Channel To Transport Node Down Long" and/or "Management Channel to Transport Node Down Long" in the Alarms section.
The affected hosts, or the cluster they belonged to, have been removed from vCenter.
Searching for the Transport Node name referenced in the alarm may still return it as a Transport Node (TN); however, clicking the name may display "No clusters available".

Upon investigation, the UUID referenced in these alarms no longer corresponds to any active nodes in your environment even though it may show as a TN when searched using the universal search. Additionally, when navigating to System > Fabric > Hosts, both the Other Nodes and Standalone tabs may show no associated hosts (see images below).

Standalone:

Other Nodes:

Environment

VMware NSX-T Data Center
VMware NSX

Cause

This behavior is caused by stale entries that remain in the Corfu database for hosts that were removed. NSX still treats these entries as active configurations, so it raises alarms when it cannot reach the corresponding hosts.

This issue commonly occurs when NSX is not properly removed from hosts before they are deleted or decommissioned.

Resolution

To resolve these alarms and remove the stale hosts from the UI, follow the steps below:

Resolution Steps via GUI:

SSH into all three NSX Manager nodes.
From the admin user, run the following command on each node:
```
start search resync all
```
Log out of the NSX UI and then log back in.
Navigate to System > Fabric > Hosts, then check the Other Nodes and Standalone tabs. The stale transport nodes should now be visible.
For each stale host:
- Select the host.
- Choose Delete NSX > Force Delete.
Repeat this process one host at a time until all stale entries are removed.
Once all stale hosts have been deleted, run the "start search resync all" command again on all three NSX Manager nodes to ensure the search index is updated.
Confirm that no hosts are listed under the Other Nodes or Standalone sections in System > Fabric > Hosts.
The associated alarms should automatically clear. If any alarms persist, manually select them and click Actions > Resolve to clear them.
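
The resync step above has to be repeated on every NSX Manager node, so it can be convenient to script it. The following is a minimal sketch, not a supported procedure: it assumes the admin account accepts a single NSX CLI command passed over SSH, and the function name resync_all_managers, the SSH_CMD override, and the hostnames are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run the search resync on each NSX Manager node given
# as an argument. Assumes "ssh admin@<manager> <command>" executes the NSX
# CLI command non-interactively (verify this in your environment first).
# SSH_CMD can be overridden, for example with "echo", to perform a dry run.
resync_all_managers() {
    for mgr in "$@"; do
        echo "Resyncing search index on ${mgr}" >&2
        ${SSH_CMD:-ssh} "admin@${mgr}" "start search resync all" || return 1
    done
}

# Example (placeholder hostnames):
#   resync_all_managers nsx-mgr-01 nsx-mgr-02 nsx-mgr-03
# Dry run that only prints what would be executed:
#   SSH_CMD=echo resync_all_managers nsx-mgr-01 nsx-mgr-02 nsx-mgr-03
```

The function stops at the first node where the command fails, so a partial resync is noticed rather than silently skipped.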

If the host still does not appear in the NSX UI after completing the GUI steps above, and therefore cannot be deleted there, use the following API steps to remove the transport node.

Resolution Steps via API:

1. Run the following API call:
GET https://<NSX Mgr IP>/api/v1/transport-nodes/<UUID>/state

Note: Replace <UUID> with the Transport Node UUID, which can be found by searching for the transport node name (shown in the alarm) using the NSX universal search.
Replace <NSX Mgr IP> with the IP address or FQDN of an NSX Manager node.
2. If the state value in the API response is not "Object Not found", proceed to step 3.
Note: The state value is "Object Not found" once the host has been successfully removed.
3. For NSX-T 3.2.x and 4.x, run the following API call:
DELETE https://<NSX Mgr IP>/api/v1/transport-nodes/<UUID>?force=true&unprepare_host=false
Note: Replace <UUID> and <NSX Mgr IP> as described in step 1.
If using curl to run this API call, the full URL must be enclosed in double quotes.
4. Wait five minutes, then run the GET transport node state call from step 1 again. Repeat periodically until "Object Not found" is returned.
5. Once the GET call returns "Object Not found", the host has been successfully removed and the alarms should clear automatically.
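
The API steps above can be sketched with curl as follows. This is an illustrative outline under stated assumptions rather than an official script: NSX_USER, NSX_PASS, POLL_SECONDS, and all function names are hypothetical, basic authentication is assumed, -k is used on the assumption that the manager presents a self-signed certificate, and removal is detected by looking for the text "not found" in the response body, based on the "Object Not found" value described in this article.

```shell
#!/bin/sh
# Sketch of the API steps. NSX_USER and NSX_PASS are hypothetical
# environment variables holding NSX Manager credentials.

# Build the URLs for the state check and the force delete.
tn_state_url() {
    printf 'https://%s/api/v1/transport-nodes/%s/state' "$1" "$2"
}
tn_delete_url() {
    printf 'https://%s/api/v1/transport-nodes/%s?force=true&unprepare_host=false' "$1" "$2"
}

# State check: args are <NSX Mgr IP or FQDN> <UUID>.
# The URL is passed in double quotes, as the article requires for curl.
get_tn_state() {
    curl -k -s -u "${NSX_USER}:${NSX_PASS}" -X GET "$(tn_state_url "$1" "$2")"
}

# Force delete of the stale transport node without unpreparing the host.
force_delete_tn() {
    curl -k -s -u "${NSX_USER}:${NSX_PASS}" -X DELETE "$(tn_delete_url "$1" "$2")"
}

# Assumption: a removed node yields a response containing "not found".
# Reads the response body on stdin.
is_removed() {
    grep -qi 'not found'
}

# Poll the state every POLL_SECONDS (default 300, i.e. five minutes)
# until the node is reported as removed.
wait_until_removed() {
    until get_tn_state "$1" "$2" | is_removed; do
        echo "Transport node $2 still present; checking again later" >&2
        sleep "${POLL_SECONDS:-300}"
    done
    echo "Transport node $2 removed"
}

# Example (placeholder values):
#   export NSX_USER=admin NSX_PASS='...'
#   get_tn_state nsx-mgr-01.example.com <UUID> | is_removed || \
#       force_delete_tn nsx-mgr-01.example.com <UUID>
#   wait_until_removed nsx-mgr-01.example.com <UUID>
```

Keeping the URL construction in separate functions makes it easy to print and review the exact request before any DELETE is sent.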

****************

This alarm can also trigger when stale NSX VIBs remain on ESXi hosts despite being decommissioned as Transport Nodes in the NSX UI.

Note: If the ESXi hosts are still present in vCenter and have not yet been decommissioned, check them for existing NSX VIBs (esxcli software vib list | grep nsx).

Follow the steps in Uninstall NSX from a vSphere Cluster to remove the NSX VIBs.
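
The VIB check above can be wrapped in a small helper that reports clearly whether any NSX VIBs remain. This is a minimal sketch: the function names are hypothetical, and the match is case-insensitive, which is slightly broader than the grep shown in the note.

```shell
#!/bin/sh
# Filter "esxcli software vib list" output (read from stdin) for NSX VIBs.
list_nsx_vibs() {
    grep -i 'nsx'
}

# Print any remaining NSX VIBs and return non-zero if at least one is found.
report_nsx_vibs() {
    vibs=$(list_nsx_vibs || true)
    if [ -n "$vibs" ]; then
        echo "NSX VIBs still installed:"
        echo "$vibs"
        return 1
    fi
    echo "No NSX VIBs found"
}

# Usage on the ESXi host:
#   esxcli software vib list | report_nsx_vibs
```

A non-zero return value indicates that the uninstall procedure referenced above still needs to be carried out on that host.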

****************

If none of the options above resolve the issue, collect the information outlined in the Additional Information section below and open a technical support case with Broadcom Support for further investigation, referencing this KB article.

For more information, see Creating and managing Broadcom support cases.

Additional Information

This issue may also manifest as an NSX installation or upgrade on an ESXi host failing with a report that the node already exists, if these Transport Nodes were temporarily removed from vCenter and re-added.