NSX Manager Node Replacement Using Manual Deployment When Compute Manager Is Not Connected

Article ID: 405482

Products

VMware NSX

Issue/Introduction

An NSX Manager node in a cluster must be replaced because it has failed or is in a bad state. The vCenter Server where the NSX Managers are deployed is not configured as a compute manager, so a replacement manager cannot be deployed automatically through the UI.

Manual deployment and cluster join procedures are required to restore the NSX Manager cluster to a healthy state with three nodes.

Environment

VMware NSX-T Data Center
VMware NSX
VMware vCenter Server

Cause

A failed NSX Manager node must be replaced manually when the vCenter Server that hosts the NSX Manager VMs is not configured as a compute manager. Without that configuration, the automated deployment option in the NSX Manager UI is unavailable, so the replacement node has to be deployed from the OVA and joined to the cluster manually.

Resolution

To replace a failed NSX Manager node:

1. Identify and detach the failed NSX Manager node
- SSH to a healthy NSX Manager node
- Run `get cluster status` to identify the failed node and its ID
- Run `detach node <node-id>` to remove the failed manager from the cluster
- Delete the failed NSX Manager VM from vCenter
2. Manually deploy a replacement NSX Manager node
- Deploy a new NSX Manager OVA following the documentation: Deploy NSX Managers
- Use the same NSX version as the existing cluster members
- Configure appropriate network settings for your environment
- Ensure the new node can communicate with existing cluster members
3. Obtain cluster information from existing nodes
- On a healthy NSX Manager node, retrieve the required information:
```
get certificate api thumbprint
get cluster config
```
- Note the cluster ID and certificate thumbprint values
4. Join the new NSX Manager to the existing cluster
- SSH to the newly deployed NSX Manager
- Run the join command with the values collected in step 3. The IP address, username, password, and thumbprint are those of the healthy node that is already in the cluster, not of the new node:
```
join <Manager-IP> cluster-id <cluster-id> username <Manager-username> password <Manager-password> thumbprint <Manager-thumbprint>
```
- Allow 10-15 minutes for the cluster to stabilize
- Monitor progress with `get cluster status`
- Verify that all cluster services show status "UP" before making any other changes
5. Verify cluster health
- Run `get cluster status` on every node and confirm that the output is consistent
- Verify management plane connectivity
- Confirm that all three managers are operational in the UI
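
The join and monitoring steps above can be sketched in Python as shown below. This is only an illustration and is not part of any NSX tooling. `build_join_command` assembles the CLI string in the argument order shown in step 4, and `wait_for_cluster` polls until every service reports "UP". The helper names, the default timeout and interval, and the `fetch_statuses` callable are hypothetical. How the per-service status is collected (for example, by reading the output of get cluster status) is left to the caller.

```python
import time
from typing import Callable, Dict


def build_join_command(manager_ip: str, cluster_id: str, username: str,
                       password: str, thumbprint: str) -> str:
    """Assemble the NSX CLI join command in the argument order used in step 4.

    All values describe the healthy node that is already in the cluster.
    """
    return (
        f"join {manager_ip} cluster-id {cluster_id} "
        f"username {username} password {password} thumbprint {thumbprint}"
    )


def all_services_up(statuses: Dict[str, str]) -> bool:
    """Return True only if at least one service is listed and all are UP."""
    return bool(statuses) and all(s.upper() == "UP" for s in statuses.values())


def wait_for_cluster(fetch_statuses: Callable[[], Dict[str, str]],
                     timeout_s: float = 20 * 60,   # assumption: margin above 10-15 min
                     interval_s: float = 30,       # assumption: polling interval
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll fetch_statuses() until every service is UP or the timeout elapses.

    fetch_statuses is a hypothetical callable supplied by the caller; it must
    return a mapping of service name to status string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if all_services_up(fetch_statuses()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

For example, `build_join_command("192.0.2.10", "<cluster-id>", "admin", "<password>", "<thumbprint>")` produces the string to paste into the CLI of the new node. The function only formats the command and does not run it.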

For detailed cluster operations and replacement scenarios, refer to:

Important considerations:

- Manual deployment requires additional time compared to automated deployment
- Ensure network connectivity between all cluster members before joining
- Plan for manual intervention when compute manager configuration is not available
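
A basic reachability check before the join could look like the following sketch. It only attempts a TCP connection to each manager address. The helper names are hypothetical, and the ports in the usage note (443 and 22) are illustrative assumptions, not the complete list of ports that NSX Managers use to communicate with each other. A successful connection from an administrator workstation also does not prove that the nodes can reach one another, so treat the result as a first indication only.

```python
import socket
from typing import Dict, Iterable, Tuple


def tcp_reachable(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_members(hosts: Iterable[str],
                  ports: Iterable[int]) -> Dict[Tuple[str, int], bool]:
    """Check every host/port combination and report the result per pair."""
    port_list = list(ports)
    return {(h, p): tcp_reachable(h, p) for h in hosts for p in port_list}
```

As a usage example, `check_members(["192.0.2.10", "192.0.2.11", "192.0.2.12"], [443, 22])` returns a dictionary keyed by `(host, port)`. Any `False` entry is worth investigating before running the join command. The addresses shown are placeholders.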

If the issue persists after following these steps, contact Broadcom Support for further assistance.

Provide the following information when opening a support request with Broadcom for this issue:

- NSX Manager cluster status output
- Manager node logs from all cluster members
- Details of the failed node and replacement attempts
