The Gateway's inbound HTTP(S) listeners are handled by Apache Tomcat and its components. Each Tomcat instance is configured with a core concurrency and a maximum concurrency. These values represent, respectively, the number of HTTP connections that can be handled at startup and the maximum number of HTTP connections that are allowed.
Core concurrency (set as io.httpCoreConcurrency) specifies how many HTTP listeners are created when the Gateway starts. Having an adequate number of HTTP listeners running at initialization time is good for performance, because starting new HTTP listeners takes time and resources. It is best to set the core concurrency to match the number of concurrent connections expected for the system.
Maximum concurrency (set as io.httpMaxConcurrency) specifies the maximum number of HTTP listeners that will be created. The Gateway will not create HTTP listeners beyond this limit; when all existing listeners are busy, additional requests are queued. Managing existing HTTP listeners and handling queued connections takes additional time and resources, and CPU and RAM utilization tends to increase as more connections are queued.
It is not advisable to raise these values to very large numbers, because each HTTP listener requires CPU and RAM to manage and keep open. Allocating more RAM and more (or more powerful) CPUs to a Gateway appliance allows the Gateway application to keep more HTTP listeners open, but those resources are finite.
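The relationship between the two settings can be illustrated with a small Python model. This is a simplified sketch, not the Gateway's or Tomcat's actual scheduling logic: the function name listener_usage and the sample values are hypothetical, and the model assumes every concurrent request occupies exactly one listener.

```python
def listener_usage(concurrent_requests: int, core: int, maximum: int) -> dict:
    """Simplified model of how a burst of concurrent requests is absorbed.

    core    -> stands in for io.httpCoreConcurrency (listeners created at startup)
    maximum -> stands in for io.httpMaxConcurrency (upper limit on listeners)
    This is an illustration only; real listener management is more involved.
    """
    if not 0 < core <= maximum:
        raise ValueError("expected 0 < core <= maximum")
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must not be negative")

    # Requests served by listeners that already exist at startup.
    prestarted = min(concurrent_requests, core)
    # Requests that force new listeners to be created, up to the maximum.
    created_on_demand = min(max(concurrent_requests - core, 0), maximum - core)
    # Requests beyond the maximum have to wait in the queue.
    queued = max(concurrent_requests - maximum, 0)

    return {
        "prestarted": prestarted,
        "created_on_demand": created_on_demand,
        "queued": queued,
    }


if __name__ == "__main__":
    # Hypothetical values chosen for illustration, not recommended settings.
    for load in (100, 250, 400):
        print(load, listener_usage(load, core=200, maximum=300))
```

In this model, a load at or below the core value costs nothing extra, a load between core and maximum incurs listener start-up cost, and a load above the maximum produces queued requests.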
All supported versions of the API Gateway
The correct values for the Gateway's concurrency must be determined experimentally. For reference, the factory default concurrency values are set to avoid inundating development environments with too many concurrent requests. As such, it is not unexpected or unreasonable for those values to need changing during non-production load tests or prior to a production deployment. Limit each increase to 50% of the current values per load test: change the cluster-wide properties, perform a load test, and then, if needed, raise the values by an additional 50% of their current value. Performance should gradually increase, but more system resources will be used as the concurrency increases. Monitor the Gateway's resources (specifically RAM and CPU) during each load test to determine the best possible values.
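The 50% stepping rule can be worked out ahead of time so that each load test has a planned pair of values. The following Python sketch only does the arithmetic; the function name tuning_schedule and the starting values are hypothetical and are not the Gateway's actual defaults. Rounding up to a whole number is also an assumption.

```python
import math


def tuning_schedule(core: int, maximum: int, rounds: int, step: float = 0.5):
    """Return (core, maximum) pairs for the initial state and each load-test round.

    Each round raises both values by `step` (50% by default) of their
    current value, rounded up to a whole number of listeners.
    """
    if core <= 0 or maximum < core:
        raise ValueError("expected 0 < core <= maximum")

    schedule = [(core, maximum)]
    for _ in range(rounds):
        core = math.ceil(core * (1 + step))
        maximum = math.ceil(maximum * (1 + step))
        schedule.append((core, maximum))
    return schedule


if __name__ == "__main__":
    # Hypothetical starting values; substitute the values currently set for
    # io.httpCoreConcurrency and io.httpMaxConcurrency.
    print(tuning_schedule(core=200, maximum=400, rounds=3))
    # [(200, 400), (300, 600), (450, 900), (675, 1350)]
```

The schedule is only a plan: stop stepping up as soon as monitoring shows CPU or RAM becoming the limiting factor, and keep the last pair of values that performed well.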