Digital APIs are all the rage these days. But, at the end of the day, an API is nothing but a bunch of servers running code. And if there’s one thing that servers are known to do, it’s crash.
An API platform needs rate limiting to protect itself against this common issue. Rate limiting is a technique for controlling how many requests can be made in a given period of time, in order to prevent overloading and possible crashes. It's also necessary for open API platforms where anyone can build on top of your data sources.
Here are some tips on designing successful rate-limiting implementations for your API platform.

What is Rate Limiting?
API rate limiting is a technique that prevents an API from being overused. It's important for API providers to have some form of rate limiting in place because it helps ensure that regular users are not crowded out by heavy traffic from programs or scripts.
With the increase in API use, an important consideration is how to make sure you're not overloading your servers. For example, with a limit of 100 requests per hour, a client that sends 101 requests in 45 minutes would have the 101st request rejected or delayed, and would have to wait for the limit to reset before sending more. This is an important concept in API development because it helps keep the API available to all of its users.
There are several reasons why organisations implement rate limiting in an open API platform. Key among them are:
- Managing quotas and policies: Open API platforms provide services to multiple users at the same time. Rate limiting is used to make sure that each user gets a reasonable and fair share of the service.
- Preventing resource starvation: Resource starvation is not only caused by attacks on the API platform; it can also result from a lack of proper rate limiting. Rate limiting improves availability by making sure that no single client can exhaust the resources that others depend on.
- Limiting expenditure: There might exist a resource in the API platform whose continued usage or usage beyond a certain limit can get costly. Rate limiting is used to ensure that this limit is not exceeded.
- Controlling flow: Rate limiting can be used to control the flow of work in an API platform. For instance, it can control the flow of work between different workers to make sure that it is evenly distributed.
Rate Limiting Techniques
#1 Token Bucket
The token bucket technique keeps a rolling, accumulating budget of API usage as a balance of tokens. It recognises that inputs to a service do not always correspond one-to-one with requests, so rather than counting requests directly it adds tokens to the bucket at a set rate.
When a request is made to the API, the service tries to withdraw a token, decrementing the count of available tokens by one, in order to fulfil the request. If no tokens are left, the API has reached its limit and the request is rejected or delayed until more tokens are added.
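The refill-and-withdraw behaviour described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the class name `TokenBucket`, the `capacity` and `rate` parameters, and the choice to pass the current time in as `now` (so the example is deterministic) are all assumptions made for this sketch.

```python
class TokenBucket:
    """Hypothetical token bucket: refills `rate` tokens per second, up to `capacity`."""

    def __init__(self, capacity, rate, now=0.0):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)  # assumption: the bucket starts full
        self.last = now                # time of the last refill

    def allow(self, now):
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        # Withdraw one token per request if one is available.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # no tokens left: the limit has been reached


bucket = TokenBucket(capacity=2, rate=1.0)
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

In this run the first two requests use up the starting tokens, the third is refused, and the request one second later succeeds because one token has been added in the meantime. In a real service `now` would typically come from a monotonic clock such as `time.monotonic()`.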
#2 Fixed Window
The fixed window technique is one of the easiest to use, but it can cause complications if you are not careful. The limit only caps the total number of calls in each window, so traffic can still spike at certain points depending on when the calls are made.
For instance, let us assume that you have set a fixed window limit of ten thousand requests per hour. The calls can be made at any time during the hour. If all of them were made within a single minute, say the tenth minute of the hour, the service could be overwhelmed even though the limit was never exceeded.
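A fixed window counter along these lines might look like the following sketch. The class name `FixedWindowLimiter`, its parameters, and the explicit `now` argument (in seconds) are assumptions for illustration; the small limit of 3 stands in for the ten thousand used in the text.

```python
class FixedWindowLimiter:
    """Hypothetical fixed window counter: at most `limit` calls per window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_window = None  # index of the window being counted
        self.count = 0

    def allow(self, now):
        # Windows are aligned to fixed boundaries: 0-3600s, 3600-7200s, ...
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new window has started, so the counter resets.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowLimiter(limit=3, window_seconds=3600)
# Three calls within the tenth minute (600s to 660s) are all accepted at once.
burst = [limiter.allow(600), limiter.allow(601), limiter.allow(602), limiter.allow(603)]
after_reset = limiter.allow(3600)
```

Note that the limiter accepts the whole quota within a couple of seconds, which is exactly the spike the text warns about; the fourth call is refused, and calls succeed again once the next hour begins.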
#3 Leaky Bucket
The leaky bucket technique works in a similar way to the token bucket technique. However, the rate is limited by the amount that can leak, or drip, out of the bucket. It recognises that a system has a finite capacity to hold requests until the service can act on them; once the bucket is full, any extra requests spill over and are discarded.
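One way to picture this is as a counter that fills by one with each request and drains at a constant rate. The sketch below models the bucket as a level rather than an actual queue of requests; the names `LeakyBucket`, `capacity`, and `leak_rate`, and the explicit `now` argument, are assumptions for illustration.

```python
class LeakyBucket:
    """Hypothetical leaky bucket: holds up to `capacity` requests, drains at `leak_rate` per second."""

    def __init__(self, capacity, leak_rate, now=0.0):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0  # how many requests are currently held
        self.last = now   # time of the last drain calculation

    def allow(self, now):
        # Drain whatever has leaked out since the last call.
        leaked = (now - self.last) * self.leak_rate
        self.level = max(0.0, self.level - leaked)
        self.last = now
        # Accept the request only if it still fits in the bucket.
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False  # bucket is full: the request spills over and is discarded


bucket = LeakyBucket(capacity=2, leak_rate=1.0)
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

Here the bucket fills after two requests, the third is discarded, and a request one second later is accepted because one held request has leaked out by then.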
#4 Sliding Window
The sliding window technique works in a similar way to the fixed window technique. However, the window moves along with time instead of resetting at fixed boundaries, so the limit applies to any stretch of time as long as the window. This smooths out the bursts that a fixed window allows. For instance, with a limit of ten calls every ten minutes, a client making a single call every minute always stays within the limit, while an eleventh call inside any ten-minute span is refused.
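A simple way to implement a sliding window is to keep a log of recent request times and count only those that fall inside the window. The sketch below takes that approach; the class name `SlidingWindowLimiter`, its parameters, and the explicit `now` argument (in seconds) are assumptions for illustration, and other variants (such as weighted counters) exist.

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical sliding window log: at most `limit` calls in any `window_seconds` span."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()  # times of accepted calls, oldest first

    def allow(self, now):
        # Forget calls that have slid out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


# Two calls allowed in any ten-minute (600s) span.
limiter = SlidingWindowLimiter(limit=2, window_seconds=600)
results = [limiter.allow(0), limiter.allow(100), limiter.allow(500), limiter.allow(600)]
```

The third call is refused because two calls already fall within the preceding ten minutes; the call at 600 seconds succeeds because the first call has just slid out of the window. The trade-off of this variant is memory: it stores one timestamp per accepted call.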
Eleggible’s Final Words
In conclusion, organisations need to understand how their API platforms work so that they can implement the rate limiting technique that best meets their requirements. This is important in keeping the platforms available for consumption.