NSX-T DFW Policy/Rule status is Unknown due to duplicate host entry

Article ID: 322554

Products

VMware NSX

Issue/Introduction

Symptoms:
  • In the NSX-T Manager's UI, under Security - Distributed Firewall, the status of a published policy or rule is 'Unknown'.
  • Clicking the 'Unknown' status shows the status of the policy or rule on each relevant host. One host is listed twice: one entry has a status of 'Success' and the other has a status of 'Unknown'.
  • Running the following command as admin on the NSX-T Manager lists all transport nodes. The same host appears twice, with different UUIDs:

NSXMANAGER01> get nodes
UUID                    Type  Display Name
<Transport-Node-UUID-1>   esx   esx-01.corp.local
<Transport-Node-UUID-2>   esx   esx-01.corp.local
  • Running the following API call as root on the NSX-T Manager, once for each UUID found by the previous command, returns a status of UP for one UUID and UNKNOWN for the other:
NSXMANAGER01# curl -k -u 'admin' -H "Content-Type: application/xml" -X GET https://localhost/api/v1/transport-nodes/<Transport-Node-UUID>/status
  • Result for Transport-Node-UUID-1:
"node_uuid" : "<Transport-Node-UUID-1>",
"node_display_name" : "esx-01.corp.local",
"status" : "UP",
...
"status" : "UP",
...
  • Result for Transport-Node-UUID-2:
"node_uuid" : "<Transport-Node-UUID-2>",
"node_display_name" : "esx-01.corp.local",
"status" : "UNKNOWN",
...
"status" : "UNKNOWN",
...
  • Attempting, as root, to delete the host with a status of 'Unknown' through the API returns the following response:
NSXMANAGER01# curl -k -u 'admin' -H "Content-Type: application/xml" -X DELETE https://localhost/api/v1/transport-nodes/<Transport-Node-UUID>
"httpStatus" : "BAD_REQUEST",
"error_code" : 9411,
"module_name" : "NsxSwitching service",
"error_message" : "Cannot delete a transport node <Transport-Node-UUID> which is part of Auto-TN compute collection f9412a40-5b8b-4d37-9ddd-d63d21c9d0e0:domain-c93. Please use the object name or UUID in Global Search to find all linked objects."
  • Log entries similar to the below may be encountered in the NSX Manager syslogs:
2023-03-09T10:52:16.138+01:00 nsxmanager01 NSX 6075 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" reqId="16a63b47-6c1f-4be4-9aed-28938ab21484" subcomp="manager" username="[email protected]"] MPA on <Inactive-Transport-Node-UUID> is not connected.
...
2023-03-09T10:52:16.143+01:00 nsxmanager01 NSX 6075 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" reqId="16a63b47-6c1f-4be4-9aed-28938ab21484" subcomp="manager" username="[email protected]"] Got deployment status HOST_DISCONNECTED for node <Inactive-Transport-Node-UUID>
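
The duplicate check described above can also be done mechanically when the node list is long. The following is a minimal sketch, not a supported tool. It assumes the `get nodes` output has been saved to a text file and that each row has three whitespace-separated columns (UUID, Type, Display Name), as in the sample output above. The function name `find_duplicate_nodes` and the file name in the usage note are hypothetical.

```shell
# Print each display name that appears more than once in saved `get nodes`
# output, followed by the UUIDs registered under that name.
# Assumption: rows have exactly three whitespace-separated columns
# (UUID, Type, Display Name), as in the sample output in this article.
find_duplicate_nodes() {
  awk '
    $1 == "UUID" || NF != 3 { next }     # skip the header and non-table lines
    {
      count[$3]++                        # occurrences per display name
      uuids[$3] = uuids[$3] " " $1       # every UUID seen for that name
    }
    END {
      for (name in count)
        if (count[name] > 1) print name ":" uuids[name]
    }
  ' "$@"
}
```

For example, `find_duplicate_nodes nodes.txt` prints one line per duplicated host, listing the display name and its UUIDs. A host that appears only once produces no output.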


Environment

VMware NSX-T
VMware NSX-T Data Center

Cause

This issue occurs when a host that was previously removed was not cleanly uninstalled, leaving stale entries for it in NSX-T.

Resolution

This is a known issue impacting NSX-T Data Center.


Workaround:
IMPORTANT
The workaround below does not apply if the cluster containing the 'duplicate host' is prepared using vSphere Lifecycle Manager (vLCM), has service insertion deployed, or is an NSX-T Security Only cluster.
It is not possible to detach the transport node profile (TNP) from such clusters.

If your NSX configuration is not compatible with the workaround below, please contact VMware Technical Support and reference this KB article (91869).


1. Detach the transport node profile from the cluster containing the 'duplicate host':
  • In the NSX-T Manager UI, under System - Fabric - Hosts, click the 'Clusters' tab.
  • Select the checkbox for the cluster from which the TNP is to be detached.
  • Click Actions - Detach Transport Node Profile.

2. As root, use the following Delete API call to remove the Unknown/Duplicate host:
NSXMANAGER01# curl -k -u 'admin' -H "Content-Type: application/xml" -X DELETE https://localhost/api/v1/transport-nodes/<Transport-Node-UUID>

3. Re-apply the TNP to the cluster:
  • In the NSX-T Manager UI, under System - Fabric - Hosts, click the 'Clusters' tab.
  • Select the checkbox next to the cluster.
  • Click Configure NSX and select the transport node profile that was detached in Step 1 above.
Note: If a different TNP is selected, the host may be reconfigured, which could impact the dataplane. To prevent host reconfiguration, ensure that the same TNP that was detached in Step 1 is selected.

4. Confirm in the DFW that the status of the published policy or rule now shows 'Success' and that only a single host entry exists for the 'duplicate host':
  • In the NSX-T Manager UI, go to Security - Distributed Firewall.
  • Click 'Success' and confirm that the 'duplicate host' now has only a single entry and that the status of this entry is 'Success'.
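
Before running the DELETE call in Step 2, it is worth double-checking which UUID is the stale one, since deleting the wrong transport node would remove the healthy entry. The following is a minimal sketch under stated assumptions: each `/status` response has been saved to its own file, the response is pretty-printed with one key per line (as in the results shown under Symptoms), and the first "status" key in the file is the node-level status. The function names `node_status` and `stale_candidates` and the file names are hypothetical. The output is a hint only and does not replace reviewing the responses.

```shell
# Print "<node_uuid> <status>" for one saved /status response.
# Assumption: one JSON key per line; the first "status" key is node-level.
node_status() {
  uuid=$(sed -n 's/.*"node_uuid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1)
  status=$(sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1)
  printf '%s %s\n' "$uuid" "$status"
}

# Print the UUID of every saved response whose first status is UNKNOWN.
# These are the candidates for the DELETE call in Step 2.
stale_candidates() {
  for f in "$@"; do
    node_status "$f"
  done | awk '$2 == "UNKNOWN" { print $1 }'
}
```

For example, `stale_candidates uuid1-status.json uuid2-status.json` prints only the UUID reported as UNKNOWN. If it prints nothing, or prints more than one UUID for the same host, stop and review the responses manually before deleting anything.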