Local NSX-T Manager nodes are showing in an "Unknown State" in the Global Manager UI after being replaced.

Article ID: 312602

Products

VMware NSX

Issue/Introduction

Symptoms:

  • You have recently tried to deactivate/detach the NSX Manager cluster.
  • You have recently replaced one or more of your Local Manager nodes.
  • Local Managers (LM) show "Unknown State" in the Global Manager UI.
  • All services on the Global Manager show as stable.
  • There is no functional impact on the Local Managers.
  • Entries similar to the following appear in the Local NSX Manager logs because the cluster deactivation failed:
/var/log/nsx-audit.log
2023-03-31T11:57:33.744Z nsxmgr-01 NSX 17233 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="INFO" audit="true"] CMD: detach node f0ae0142-3f4a-ba76-b101-ba0abb2d7ad3 (duration: 238.572s), Operation status: CMD_EXECUTED_WITH_ERROR_RESULT

/var/log/nvpapi/api_access.log
2023-03-31T12:39:21.727Z INFO admin 'POST /api/v1/cluster-manager?action=deactivate_cluster' 500 414 "" "" 380.904066

/var/log/nvpapi/api_server.log
2023-03-31T12:33:00.824Z napi.root.node.cluster INFO DeactivateCluster called.
2023-03-31T12:39:21.726Z napi.root.cluster.cbm_response ERROR Error getting status for DeactivateCluster request, error_code 0, additional_info: [CBM238] Failed to deregister MPA with MP. Please make sure MP is stable, before retrying.
  • In the Local NSX Manager support bundle, desired_state_manager.json may contain entries indicating that the control cluster status is UNKNOWN:
 
          "node_status": {
            "control_cluster_status": {
              "control_cluster_status": "UNKNOWN",
              "mgmt_connection_status": {
                "connectivity_status": "CONNECTED"
  <SNIP for readability>
            "host_msg_client_info": {
              "account_name": "cvn-ccp-e040b066-xxxx-xxxx-xxxx-900a6bfadabc" 
            },
            "mpa_msg_client_info": {
              "account_name": "cvn-mp-mpa-9ca4a62b-xxxx-xxxx-xxxx-f02bdfe441f6"
            },
            "type": "ControllerClusterRoleConfig"
          }
        },
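
The check on desired_state_manager.json described above can be sketched in Python. This is a minimal illustration, not a supported tool: the function names are hypothetical, and because the excerpt above is snipped, the code assumes nothing about the file layout beyond a "control_cluster_status" key whose value is the string "UNKNOWN". It walks the whole document and reports every matching location.

```python
import json


def find_unknown_status(node, path=""):
    """Return the JSON paths where control_cluster_status equals "UNKNOWN".

    The walk is recursive because the exact nesting of the file is not
    assumed; only the key name and value come from the excerpt above.
    """
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if key == "control_cluster_status" and value == "UNKNOWN":
                hits.append(child)
            else:
                hits.extend(find_unknown_status(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_unknown_status(item, f"{path}[{index}]"))
    return hits


def scan_file(json_path):
    """Load a desired_state_manager.json file and scan it (path is caller-supplied)."""
    with open(json_path, encoding="utf-8") as handle:
        return find_unknown_status(json.load(handle))
```

Calling scan_file() with the location of desired_state_manager.json inside an extracted support bundle returns a list of paths; an empty list means no UNKNOWN control cluster status was found in that file.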

Environment

VMware NSX-T Data Center

Cause

This issue occurs when the Management Plane is not stable at the time the Deactivate Cluster operation is initiated. The operation fails partway through and leaves stale entries on the Local NSX Managers.

Resolution

This is a known issue impacting NSX-T Data Center.

Workaround:
Collect and inspect the API output for stale NSX Manager nodes:

GET /api/v1/cluster/nodes/deployments

Delete the stale nodes:

DELETE /api/v1/cluster/nodes/<node-UUID>
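
The two API calls above could be scripted along the following lines. This is a hedged sketch using only the Python standard library: the helper names, the manager address, and the use of HTTP basic authentication are assumptions for illustration, and only the two endpoint paths come from this article. The script deliberately does not decide which nodes are stale; review the GET output manually and pass the node UUID explicitly.

```python
import base64
import json
import urllib.request


def build_request(manager, path, method, username, password):
    """Build an authenticated request for an NSX Manager API path.

    `manager` is a hostname or IP supplied by the caller (hypothetical here).
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(f"https://{manager}{path}", method=method)
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def list_deployments_request(manager, username, password):
    """Request for GET /api/v1/cluster/nodes/deployments."""
    return build_request(
        manager, "/api/v1/cluster/nodes/deployments", "GET", username, password
    )


def delete_node_request(manager, node_uuid, username, password):
    """Request for DELETE /api/v1/cluster/nodes/<node-UUID>."""
    return build_request(
        manager, f"/api/v1/cluster/nodes/{node_uuid}", "DELETE", username, password
    )


def send(request, context=None):
    """Send a request and return (status, parsed JSON body or None).

    Pass an ssl.SSLContext as `context` if the manager certificate needs
    custom trust settings.
    """
    with urllib.request.urlopen(request, context=context) as response:
        body = response.read()
        return response.status, (json.loads(body) if body else None)
```

A typical run would call send(list_deployments_request(...)), inspect the returned entries, and only then call send(delete_node_request(...)) for each UUID confirmed as stale.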