VMware NSX Cluster status shows as UNAVAILABLE in the UI, but not in the CLI

Article ID: 322435

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • You recently upgraded from NSX-T 3.2.x to VMware NSX 4.1.x and the Cluster status is showing UNAVAILABLE under System - Appliances.
  • However, when you run get cluster status in the CLI as the admin user, the cluster and all components are reported as up and stable.
  • Rebooting the VMware NSX Manager nodes, clearing the browser cache, or switching to a different browser does not resolve the issue.
  • As the root user on the VMware NSX Manager, the log /var/log/proton/nsxapi.log contains entries similar to the following:

2023-05-01T08:37:22.725Z ERROR http-nio-127.0.0.1-7440-exec-154 ClusterManagerUtil 4912 - [nsx@6876 comp="nsx-manager" errorCode="MP2101" level="ERROR" reqId="xxxxxxxxxx" subcomp="manager" username="xxxxxx"] Request GET http://localhost:7989/api/v1/cluster-manager/status HTTP/1.1 failed, return code is 400
2023-05-01T08:37:22.725Z ERROR http-nio-127.0.0.1-7440-exec-154 ClusterManagerUtil 4912 - [nsx@6876 comp="nsx-manager" errorCode="MP2121" level="ERROR" reqId="xxxxxxxxxxxxxxx" subcomp="manager" username="xxxxx"] Cluster status retrieved from cluster manager is empty

  • As the root user on the VMware NSX Manager, the log /var/log/cbm/cbm.log contains entries similar to the following:

127.0.0.1 - - [01/May/2023:08:37:23 +0000] "GET /api/v1/cluster-manager/status HTTP/1.1" 400 435
127.0.0.1 - - [01/May/2023:08:38:44 +0000] "GET /api/v1/cluster-manager/status HTTP/1.1" 200 25670

  • As the root user on the VMware NSX Manager, the log /var/log/cbm/tanuki.log contains entries similar to the following:

INFO  | jvm 1  | 2023/03/11 12:50:50 | INFO: Error parsing HTTP request header
INFO  | jvm 1  | 2023/03/11 12:50:50 | Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
INFO  | jvm 1  | 2023/03/11 12:50:50 | java.lang.IllegalArgumentException: Request header is too large
INFO  | jvm 1  |
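
The three log signatures above can be checked in one pass. The following is a minimal sketch, assuming the log paths named in this article; the function name check_cbm_header_issue is hypothetical and is not part of NSX.

```shell
#!/bin/sh
# Hypothetical helper (not an NSX tool): reports which of the log
# signatures described in this article are present on this manager node.
# Arguments are optional and default to the log paths named above.
check_cbm_header_issue() {
    nsxapi_log="${1:-/var/log/proton/nsxapi.log}"
    cbm_log="${2:-/var/log/cbm/cbm.log}"
    tanuki_log="${3:-/var/log/cbm/tanuki.log}"
    found=0

    # MP2121: "Cluster status retrieved from cluster manager is empty"
    if grep -qF 'errorCode="MP2121"' "$nsxapi_log" 2>/dev/null; then
        echo "nsxapi.log: MP2121 found"
        found=$((found + 1))
    fi

    # CBM answered the cluster status request with HTTP 400
    if grep -qF 'GET /api/v1/cluster-manager/status HTTP/1.1" 400' "$cbm_log" 2>/dev/null; then
        echo "cbm.log: HTTP 400 on cluster-manager status found"
        found=$((found + 1))
    fi

    # The CBM JVM rejected the request header as too large
    if grep -qF 'Request header is too large' "$tanuki_log" 2>/dev/null; then
        echo "tanuki.log: 'Request header is too large' found"
        found=$((found + 1))
    fi

    echo "signatures matched: $found of 3"
    # Succeed only when all three signatures are present
    [ "$found" -eq 3 ]
}
```

Run as root, calling check_cbm_header_issue with no arguments uses the default paths. A result of 3 of 3 suggests the node matches the symptoms described here; rotated or compressed logs are not searched by this sketch.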


Environment

VMware NSX 4.1.0

Cause

This issue occurs when the HTTP request header is too large and the Cluster Boot Manager (CBM) is unable to process it. CBM rejects the status request with HTTP 400, so the UI receives an empty cluster status.

Resolution

This is a known issue impacting VMware NSX.

Workaround:

Please make sure you have a backup in place before proceeding.

Perform steps 1 to 4 below on each manager node, one node at a time, as the root user unless a step states otherwise.
1.
  • Copy the original file, as a backup, before editing:

mkdir /root/jarFileBackup

cp /opt/vmware/cbm/cbm-app/libweb-server-cbm.jar /root/jarFileBackup/

  • Confirm that libweb-server-cbm.jar was copied into the jarFileBackup folder:
ls -l /root/jarFileBackup/
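
Listing the directory confirms that the file exists, but not that the copy is intact. As an optional extra check, the sketch below compares the backup with the original byte for byte using cmp; the function name verify_backup is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the backup exists and is
# byte-identical to the original file.
verify_backup() {
    original="$1"
    backup="$2"
    if cmp -s "$original" "$backup"; then
        echo "backup OK: $backup"
    else
        echo "backup MISSING or DIFFERENT: $backup" >&2
        return 1
    fi
}
```

With the paths from step 1, the call would be: verify_backup /opt/vmware/cbm/cbm-app/libweb-server-cbm.jar /root/jarFileBackup/libweb-server-cbm.jar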

2.

  • To edit the file, copy it to a second working directory and unzip it there:

mkdir /root/jarFileUpdate

cp /opt/vmware/cbm/cbm-app/libweb-server-cbm.jar /root/jarFileUpdate/

cd /root/jarFileUpdate

unzip libweb-server-cbm.jar

  • Check the contents of application.properties and make sure it does not already contain a server.max-http-header-size entry:
cat application.properties
  • Append the line server.max-http-header-size=32KB to application.properties with the following echo command:
echo "server.max-http-header-size=32KB" >> application.properties
  • Verify that the value has been added to the end of the file:
cat application.properties
  • Remove the old jar without header size info:
rm libweb-server-cbm.jar
  • Create new jar file with edited file and other original files:
zip -r libweb-server-cbm.jar *
  • Copy the new jar file to original location:
cp libweb-server-cbm.jar /opt/vmware/cbm/cbm-app/
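
Running the echo command from this step a second time would add the property twice, so the check-then-append part can be made safe to rerun. The following is a minimal sketch of that idea; ensure_header_size is a hypothetical name, and the property and value are the ones given in this step.

```shell
#!/bin/sh
# Hypothetical helper: appends the header-size property only if the key
# is not already present in the properties file.
ensure_header_size() {
    props="${1:-application.properties}"
    key="server.max-http-header-size"
    value="32KB"

    if [ ! -f "$props" ]; then
        echo "file not found: $props" >&2
        return 1
    fi

    # Look for the key at the start of a line, so commented-out lines are ignored
    if grep -q "^${key}=" "$props"; then
        echo "already set: $(grep "^${key}=" "$props")"
    else
        # If the last byte is not a newline, add one so the new entry
        # starts on its own line
        if [ -n "$(tail -c 1 "$props")" ]; then
            echo >> "$props"
        fi
        echo "${key}=${value}" >> "$props"
        echo "appended: ${key}=${value}"
    fi
}
```

Called from /root/jarFileUpdate after the unzip, ensure_header_size with no arguments edits application.properties in the current directory. The remaining commands in this step (removing the old jar, zipping, and copying) are unchanged.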

3.

  • Restart the CBM service so that it loads the updated jar with the increased header limit:
/etc/init.d/nsx-cluster-boot-manager restart

NOTE: This step may cause some cluster downtime while the CBM service restarts. The downtime lasts until the service is back up and the cluster is healthy again.

4.

  • As the admin user, check that the cluster is healthy and all services are running. Once the cluster is up and healthy again, repeat steps 1 to 4 on the remaining manager nodes:

get cluster status

5.  The VMware NSX UI should now show the correct cluster status.