API Gateway: How the rate limit assertion works
Article ID: 190066

Products

CA API Gateway
API SECURITY
CA API Gateway Precision API Monitoring Module for API Gateway (Layer 7)
CA API Gateway Enterprise Service Manager (Layer 7)
STARTER PACK-7
CA Microgateway

Issue/Introduction

How does the rate limit assertion work on the API Gateway?

Environment

Release : 9.3, 9.4

Component : API GATEWAY

Resolution

The rate limit assertion uses a token bucket algorithm. To allow a request through, the counter must spend a token from the bucket. Tokens accumulate in the bucket while the counter is idle, up to a maximum; the number of tokens added per unit of idle time (tracked in nanoseconds) is determined by the configured rate limit.

When burst mode is not enabled, the bucket can hold at most 1.5 tokens. If a request is sent through an idle (full) counter, the bucket drops to 0.5 tokens, so the counter cannot admit a second request until at least half of one rate-limit interval has elapsed (e.g., for a limit of 10/sec, the interval is 1/10 sec, so the counter must be idle for 1/20 sec). Over time this admits messages at a rate no greater than the limit.
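The non-burst behavior can be sketched as follows. This is a minimal illustration of the algorithm as described above, not Gateway source code; the names (TokenBucket, try_spend) are invented for the example.

```python
# Sketch of the non-burst token bucket: capacity 1.5 tokens,
# refilled at the configured rate while the counter is idle.

class TokenBucket:
    def __init__(self, rate_per_sec, max_tokens=1.5):
        self.rate = rate_per_sec      # tokens added per second of idle time
        self.max_tokens = max_tokens  # 1.5 when burst mode is off
        self.tokens = max_tokens      # an idle counter starts with a full bucket
        self.last = 0.0               # time of the last update, in seconds

    def try_spend(self, now):
        # Refill for the elapsed idle time, capped at the bucket maximum.
        self.tokens = min(self.max_tokens,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # request allowed
        return False      # request rate-limited

bucket = TokenBucket(rate_per_sec=10)  # 10 requests/sec limit
print(bucket.try_spend(0.00))  # True  - idle bucket held 1.5 tokens
print(bucket.try_spend(0.00))  # False - only 0.5 tokens remain
print(bucket.try_spend(0.05))  # True  - 1/20 sec refills the missing half token
```

Note how the 1.5-token cap reproduces the behavior described above: one request passes immediately, and the next must wait half an interval (1/20 sec at 10/sec) for the bucket to reach a whole token again.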

With burst mode enabled, the bucket is allowed to hold more tokens. Its capacity depends on the rate limit and the "spread limit over" setting: spreading the limit over 5 seconds allows up to 5 seconds' worth of tokens to accumulate while the counter is idle. For the 10/sec limit above, that is 50 tokens. The counter can therefore admit a burst of 50 requests arriving all at once, but the bucket is then empty, and if traffic continues to arrive the counter behaves exactly as it would with burst mode disabled.
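The burst-mode arithmetic can be checked with a short sketch. The constants mirror the example above (10/sec limit, "spread limit over" 5 sec); the variable names are illustrative only.

```python
# Sketch of burst mode: bucket capacity = rate * spread seconds.

RATE = 10.0                 # configured limit: 10 requests/sec
SPREAD = 5.0                # "spread limit over" setting, in seconds
MAX_TOKENS = RATE * SPREAD  # burst-mode capacity: 50 tokens

tokens = MAX_TOKENS  # the counter has been idle long enough to fill the bucket
allowed = 0
for _ in range(60):  # 60 requests arrive at the same instant (no refill occurs)
    if tokens >= 1.0:
        tokens -= 1.0
        allowed += 1
print(allowed)  # 50: the burst is admitted, the remaining 10 are limited
```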

For concurrency, a per-counter limit can be applied using the concurrency limit setting. The intent of the global maxQueuedThreads setting is to prevent all Gateway transport pool threads from being delayed inside rate limit counters at the same time. If that is not a concern, the limit can be effectively disabled by setting it to a very high value; however, the Gateway could then run out of available threads and stop responding to new requests.
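A per-counter concurrency limit can be sketched with a bounded semaphore, assuming each request holds a slot while it is in flight and requests over the limit are rejected rather than queued. The class and method names here are invented for illustration and are not Gateway internals.

```python
# Sketch of a per-counter concurrency limit: at most `limit`
# requests may be inside the counter at the same time.
import threading

class ConcurrencyCounter:
    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def try_enter(self):
        # Non-blocking acquire: returns False when the limit is reached.
        return self._slots.acquire(blocking=False)

    def leave(self):
        self._slots.release()

counter = ConcurrencyCounter(limit=2)
print(counter.try_enter())  # True  - first concurrent request
print(counter.try_enter())  # True  - second concurrent request
print(counter.try_enter())  # False - third is rejected, limit reached
counter.leave()             # one request finishes
print(counter.try_enter())  # True  - a slot is free again
```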