NSX Global Manager UI is generating error "Server is overloaded (Error code: 98)"

Products

VMware NSX

Issue/Introduction

The NSX Global Manager UI stops working and displays the following error:

HTTP failure with Error: Server is overloaded (Error code: 98)","error_code":"98","module_name":"common-service","error_data":{"status":503}}}.

/var/log/gmanager/gmanager-ui.log shows entries similar to the following:

INFO http-nio-127.0.0.1-64440-exec-42 UI_LOG #### - [nsx@6876 comp="global-manager" level="INFO" reqId="####-####-####-####-####" subcomp="global-manager" username="####@####"] {"user":"####@####","message":"Api Errors->","messageData":{"headers":{"normalizedNames":{},"lazyUpdate":null},"status":503,"statusText":"Service Unavailable","url":"https://<nsx-global-manager>/api/v1/upgrade/available-releases?source=notification","ok":false,"name":"HttpErrorResponse","message":"Http failure response for https://<nsx-global-manager>/api/v1/upgrade/available-releases?source=notification: 503 Service Unavailable","error":{"error_code":"98","module_name":"common-service","error_message":"Error: Server is overloaded (Error code: 98)","error_data":{"status":503}}},"level":"Error","browser":"####","time":"####","location":""}

/var/log/proxy/envoy.log shows that the concurrency limit was reached:

[info][filter] [nsx_envoy_filters/source/http/peruserpernodeapilimit/per_user_per_node_api_limit.cc:41] concurrency limit reached 40 for username_or_ip Qj####==
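To see which client is hitting the limit most often, entries like the one above can be tallied per username_or_ip token. The following is a minimal sketch, assuming the log format matches the sample line shown here; the function name count_limit_hits is hypothetical, and other NSX versions may log a different message.

```python
import re
from collections import Counter

# Pattern derived from the envoy.log sample in this article (an assumption:
# other NSX versions may format this message differently).
LIMIT_RE = re.compile(r"concurrency limit reached (\d+) for username_or_ip (\S+)")


def count_limit_hits(lines):
    """Count 'concurrency limit reached' entries per logged username_or_ip token."""
    hits = Counter()
    for line in lines:
        match = LIMIT_RE.search(line)
        if match:
            # group(2) is the token exactly as it appears in the log
            hits[match.group(2)] += 1
    return hits
```

For example, passing an open file handle for /var/log/proxy/envoy.log to count_limit_hits and calling most_common() on the result lists the tokens with the most rejected requests first.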

Environment

VMware NSX

Cause

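The per-client API concurrency limit (per_user_per_node_api_limit in the proxy) was reached, so further requests from that client were rejected with a 503 response.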

Resolution

- Identify the client exceeding the concurrency limit and take appropriate steps to limit API calls from that client.

- Check the configured client API concurrency limit using the following CLI command or API call:

CLI :
nsx-mngr> get service http

Service name: http
Service state: running
Logging level: info
Session timeout: 1800
Connection timeout: 30
Client API rate limit: 100 requests/sec
Client API concurrency limit: 40 connections
Global API concurrency limit: 199 connections
Redirect host: (not configured)
Basic authentication: enabled
Cookie-based authentication: enabled

API :

GET https://<nsx-mgr>/api/v1/cluster/api-service
or,
curl -k -u admin -X GET "https://<nsx-mgr>/api/v1/cluster/api-service"

The client API concurrency limit is the maximum number of outstanding requests that a client can have. For example, a client can open multiple connections to NSX and submit operations on each connection. When this limit is exceeded, the server returns a 503 Service Unavailable error to the client. By default, this limit is 40 concurrent requests.
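On the client side, one way to stay under this limit is to cap the number of requests in flight. The sketch below is illustrative only: run_with_cap and max_in_flight are hypothetical names, and each callable stands in for one NSX API request made with whatever HTTP library the client already uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_cap(calls, max_in_flight=10):
    """Run zero-argument callables with at most max_in_flight executing at once.

    max_in_flight should be chosen below the configured client API
    concurrency limit (40 by default, per this article).
    """
    # The pool never runs more than max_workers tasks simultaneously,
    # so it acts as the concurrency cap.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [pool.submit(call) for call in calls]
        # Results are returned in submission order; an exception raised
        # by any call is re-raised here.
        return [future.result() for future in futures]
```

Keeping the cap well below the server-side limit leaves headroom for other sessions that authenticate as the same user, such as an open UI session.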

If more connections are required, update the configured value to the desired limit.

Note : Do not modify or increase the default limits unless a specific use case justifies the change. Increasing this limit can mask underlying issues. Instead, it is recommended to take appropriate steps to limit API calls from the offending client.

Steps to update the client API concurrency limit:

CLI :
nsx-mngr-01> set service http client-api-concurrency-limit
<http-client-api-concurrency-limit> HTTP API per-client concurrency limit value in the range of 0 - 9223372036854775807

API :

https://developer.broadcom.com/xapis/nsx-t-data-center-rest-api/4.2.1/method_UpdateApiServiceConfig.html

Set client_api_concurrency_limit in the request body to the required value:

PUT https://<nsx-mgr>/api/v1/cluster/api-service
{
"global_api_concurrency_limit": 199,
"client_api_rate_limit": 100,
"client_api_concurrency_limit": 40,
"connection_timeout": 30,
"redirect_host": "",
"cipher_suites": [
{"enabled": true, "name": "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"},
{"enabled": true, "name": "TLS_RSA_WITH_AES_256_GCM_SHA384"},
{"enabled": true, "name": "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"},
{"enabled": true, "name": "TLS_RSA_WITH_AES_128_GCM_SHA256"},
{"enabled": true, "name": "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384"},
{"enabled": true, "name": "TLS_RSA_WITH_AES_256_CBC_SHA256"},
{"enabled": true, "name": "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA"},
{"enabled": true, "name": "TLS_RSA_WITH_AES_256_CBC_SHA"},
{"enabled": true, "name": "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256"},
{"enabled": true, "name": "TLS_RSA_WITH_AES_128_CBC_SHA256"},
{"enabled": false, "name": "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA"},
{"enabled": false, "name": "TLS_RSA_WITH_AES_128_CBC_SHA"}
],
"protocol_versions": [
{"enabled": true, "name": "TLSv1.1"},
{"enabled": false, "name": "TLSv1.2"}
]
}

Or, if you are using curl, use the following command:
curl -k -u admin -H "Content-Type: application/json" -X PUT "https://<nsx-mgr>/api/v1/cluster/api-service" -d @api-service.json

Here, api-service.json is the name of the file containing the request body.
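Because the PUT body carries the whole API service configuration, it can be safer to start from the current GET response and change only the one field. The following is a minimal sketch of that idea; build_update_body is a hypothetical helper, and it assumes the GET response has already been parsed into a dictionary.

```python
import copy

# Upper bound taken from the CLI help text shown in this article.
MAX_LIMIT = 9223372036854775807


def build_update_body(current_config, new_limit):
    """Return a copy of current_config with only client_api_concurrency_limit changed."""
    if isinstance(new_limit, bool) or not isinstance(new_limit, int):
        raise TypeError("new_limit must be an integer")
    if not 0 <= new_limit <= MAX_LIMIT:
        raise ValueError("new_limit is outside the range accepted by the CLI")
    # Deep copy so the original GET response is left untouched; every
    # other field the GET returned is preserved as-is.
    body = copy.deepcopy(current_config)
    body["client_api_concurrency_limit"] = new_limit
    return body
```

The returned dictionary can be written to api-service.json with json.dump and then sent with the curl command above.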

Additional Information

NSX-T API configuration for HTTP on NSX-T Manager