NSX Edge Nodes Display "Unknown" Status in UI Due to APH Certificate Validation Failure

Article ID: 432939

Products

VMware NSX

Issue/Introduction

  • Multiple Edge nodes associated with Tier-0 and Tier-1 Gateways display an "Unknown" status in the NSX Manager UI.

  • Despite this visual error in the management plane, there is no production impact, and data plane traffic continues to flow normally. Standard troubleshooting steps yield the following observations:  
    • Data/Control Plane Check: API calls (GET /api/v1/transport-nodes/<node-id>/node/status) and Edge CLI commands (get managers, get controllers) confirm the Edge nodes are healthy, "Green," and connected to the NSX Manager cluster.
    • Service Restart: Restarting the nsx-proxy service on the affected Edge nodes does not resolve the UI status.
    • Reboot Test: A rolling reboot of the NSX Managers temporarily resolves the UI issue for a few seconds, after which the Edges revert to the "Unknown" state.
  • The logs under /var/log/proton/ on the NSX Manager show warnings similar to the following:

WARN com.vmware.nsx.management.policy.policyframework.realization.StatusTracker RealizationRpcClient 5059 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] Cannot send realization status request 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'. Stub is unavailable. Node id = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', APH Endpoint = 'nsx://xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'.
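
The checks above can be scripted. The following is a minimal Python sketch, not an official tool: `find_stub_warnings` scans log lines for the warning quoted above and returns the affected node IDs and APH endpoints, and `build_status_request` prepares the transport node status call listed under the Data/Control Plane Check. The function names, the use of HTTP basic authentication, and the host name in the usage note are assumptions for illustration only.

```python
import base64
import re
import urllib.request

# Matches the warning quoted above; the IDs are captured as opaque strings.
STUB_WARNING = re.compile(
    r"Cannot send realization status request '(?P<request>[^']+)'\. "
    r"Stub is unavailable\. Node id = '(?P<node>[^']+)', "
    r"APH Endpoint = '(?P<endpoint>[^']+)'"
)


def find_stub_warnings(lines):
    """Return (node_id, aph_endpoint) pairs for each matching warning line."""
    hits = []
    for line in lines:
        match = STUB_WARNING.search(line)
        if match:
            hits.append((match.group("node"), match.group("endpoint")))
    return hits


def build_status_request(manager_host, node_id, username, password):
    """Build (but do not send) the transport node status request.

    Assumption: the NSX Manager accepts HTTP basic authentication.
    """
    url = f"https://{manager_host}/api/v1/transport-nodes/{node_id}/node/status"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )
```

To use it, pass an open log file from /var/log/proton/ to `find_stub_warnings`, then send the request returned by `build_status_request("nsx-mgr.example.com", node_id, user, password)` with `urllib.request.urlopen` for a given Edge transport node ID. A healthy response from that call together with matching warnings in the log is consistent with the symptoms described here.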

Environment

VMware NSX 4.1.x

VMware NSX 4.2.x

Cause

  • The consolidated "Unknown" status in the UI is derived from an "Unknown" publish status returned by the Realization Tracker Framework (RTF). During this issue, the realization status of the Logical Router shows as "Unknown" because the realization status request cannot be sent to one of the Central Control Plane (CCP) nodes: its stub is unavailable. The stub is unavailable because APH (Appliance Proxy Hub) certificate validation fails on the Manager node, as a required unique identifier is missing from the certificate.

 

Resolution

To resolve this issue, regenerate the APH-TN certificates on all the NSX Manager nodes so that proper validation can occur between them.

Refer to: https://knowledge.broadcom.com/external/article?articleNumber=373270