NSX manager cluster status degraded or unavailable



Article ID: 394089



Products

VMware NSX

Issue/Introduction

The NSX Manager cluster status may become degraded or unavailable when services on one or more NSX Manager nodes fail to start.

Rebooting the affected NSX Manager node usually fixes the issue. In some cases, however, multiple services still fail to start after the reboot, including the "Cluster_Boot_Manager" service.

An example is:

When /etc/init.d/nsx-cluster-boot-manager prestart is run on the affected NSX Manager as the root user, the output may be similar to the following:

root@nsx-manager1 # /etc/init.d/nsx-cluster-boot-manager prestart
mkdir: cannot create directory '/config/cluster-manager/cluster-manager': File exists
 * Trust directory already exists or being generated.
chmod: cannot access '/config/cluster-manager/cluster-manager/private/keystore.jks': No such file or directory
Error occurred on line number: 152, in function: modify_user_and_permission.
mkdir: cannot create directory '/config/cluster-manager/api': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/ccp': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/policy': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/csm': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/mp': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/gm': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/ar': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/vmc': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/monitoring': File exists
 * Trust directory already exists or being generated.
chmod: cannot access '/config/cluster-manager/monitoring/private/keystore.jks': No such file or directory
Error occurred on line number: 152, in function: modify_user_and_permission.
mkdir: cannot create directory '/config/cluster-manager/idps-reporting': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/cm-inventory': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/messaging-manager': File exists
 * Trust directory already exists or being generated.
mkdir: cannot create directory '/config/cluster-manager/upgrade-coordinator': File exists
 * Trust directory already exists or being generated.
 * Checking the Appliance ModeType to populate the right entities into cbm.json
/etc/init.d/nsx-cluster-boot-manager: line 59: [: =: unary operator expected
/etc/init.d/nsx-cluster-boot-manager: line 59: [: =: unary operator expected
 * Appliance ModeType is not cloud native, retaining default cbm.json file

Look for the message "chmod: cannot access '/config/cluster-manager/*/private/keystore.jks': No such file or directory". The keystore.jks file may be missing from any of the following directories:

/config/cluster-manager/ar/private/
/config/cluster-manager/ccp/private/
/config/cluster-manager/cluster-manager/private/
/config/cluster-manager/cm-inventory/private/
/config/cluster-manager/idps-reporting/private/
/config/cluster-manager/messaging-manager/private/
/config/cluster-manager/monitoring/private/
/config/cluster-manager/mp/private/
/config/cluster-manager/site-manager/private/
/config/cluster-manager/upgrade-coordinator/private/
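As a complement to reading the prestart output, the check can be sketched as a small shell function that walks the private/ directories and prints those without a keystore.jks file. The function name find_missing_keystores is hypothetical and not part of NSX, and the sketch assumes that every private/ directory under /config/cluster-manager is expected to hold a keystore.jks, which the list above suggests but does not guarantee. Treat any directory it prints as a candidate to confirm against the prestart messages.

```shell
# Sketch only: find_missing_keystores is a hypothetical helper, not an NSX tool.
# Prints each <base>/*/private directory that does not contain keystore.jks.
find_missing_keystores() {
    # Default base path taken from the article; override for testing.
    base="${1:-/config/cluster-manager}"
    for dir in "$base"/*/private; do
        # Skip when the glob matched nothing or the entry is not a directory.
        [ -d "$dir" ] || continue
        if [ ! -f "$dir/keystore.jks" ]; then
            echo "$dir"
        fi
    done
}
```

Run as root on the affected node, `find_missing_keystores` with no argument checks /config/cluster-manager. The prestart output itself can also be narrowed with `/etc/init.d/nsx-cluster-boot-manager prestart 2>&1 | grep "keystore.jks"`.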

 

Environment

VMware NSX-T Data Center

VMware NSX

Cause

A missing keystore.jks file prevents NSX_Cluster_Boot_Manager from starting, and that service is a prerequisite for the other NSX services to start.

The exact reason the keystore.jks file goes missing is unknown. Certificate replacement activity may be a cause.

Resolution

  • Run /etc/init.d/nsx-cluster-boot-manager prestart on the affected NSX Manager as the root user and locate the message indicating a missing keystore.jks file, such as:
    • chmod: cannot access '/config/cluster-manager/monitoring/private/keystore.jks': No such file or directory
  • Change directory to the referenced folder, such as /config/cluster-manager/monitoring/private/
  • Check if there is a keystore.jks_backup file
  • If the keystore.jks_backup file exists
    • cp keystore.jks_backup keystore.jks
    • Re-run /etc/init.d/nsx-cluster-boot-manager prestart
    • If there are no "No such file or directory" error messages, start the Cluster Boot Manager service as the root user
    • /etc/init.d/nsx-cluster-boot-manager start
    • Reboot the NSX manager node
    • All services should be up after a few minutes, and the cluster status should return to "stable"
  • If the keystore.jks_backup file does not exist
    • Contact Broadcom Support and upload all NSX Manager logs to the case (if the UI is still available).
    • Alternatively, if at least one NSX Manager node is still up and running and its UI is accessible, you may delete the problem node and redeploy it.
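
The restore step above can be sketched as a shell function that copies keystore.jks_backup to keystore.jks only when the keystore is missing and a backup is present. The function name restore_keystore and its messages are hypothetical. The sketch uses the same plain cp as the steps above, does not run prestart or start any service, and returns a non-zero status when no backup exists so that the support or redeploy path is followed instead.

```shell
# Sketch only: restore_keystore is a hypothetical helper, not an NSX tool.
# Usage: restore_keystore /config/cluster-manager/monitoring/private
restore_keystore() {
    dir="$1"
    # Nothing to do when the keystore is already in place.
    if [ -f "$dir/keystore.jks" ]; then
        echo "keystore.jks already present in $dir"
        return 0
    fi
    # Restore from the backup copy, as in the manual steps.
    if [ -f "$dir/keystore.jks_backup" ]; then
        cp "$dir/keystore.jks_backup" "$dir/keystore.jks" || return 1
        echo "restored keystore.jks in $dir"
        return 0
    fi
    # No backup: the manual restore path does not apply.
    echo "no keystore.jks_backup in $dir" >&2
    return 1
}
```

After a successful restore, re-run /etc/init.d/nsx-cluster-boot-manager prestart and continue with the remaining steps by hand, so that each result can be checked before the service is started and the node is rebooted.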