Let’s implement rate-limiting protection for your Spring Boot server without the need for any additional dependencies beyond those included in the Spring Boot Starter package.
Rate limiting is an architectural tactic by which a server limits access to an API. It helps to:
protect the server against overload caused by clients that call it too often within a short time frame
increase the fairness with which clients use server resources
enable pricing schemes based on the number of requests
Spring Boot 3.0 does not provide rate limiting out of the box. In most cases, you as an application developer may not need to take care of rate limiting yourself, because your infrastructure can handle it, e.g. through a reverse proxy such as HAProxy. The infrastructure can enforce a global rate limit, whereas rate limiting at the application level enforces only a local one. So, if you have n instances and each applies its own rate limit, the effective limit can be up to n times higher. You may check this blog post if you have multiple instances of your server.
However, adding rate limit protection to your application gives developers fine-grained control per HTTP endpoint, even if the infrastructure is maintained by someone else. If you are looking for a small solution without additional dependencies for a single-instance application, the solution suggested in this blog post might be an option.
We define an annotation that you can add to any HTTP endpoint that should be protected by a rate limit. We define an aspect for methods with that annotation that counts HTTP requests per sender IP address. If the rate limit is exceeded, we throw an exception. In the exception handling, we return HTTP status code 429 (Too Many Requests). The rate limit is configured through properties in the Spring configuration file.
Let’s assume you have a Spring Boot server, you have added the spring-boot-starter-aop dependency (2.x or higher), and you have an endpoint similar to this one:
Let’s assume MyResponse and MyRequest are simple JavaBeans. Spring Boot will use Jackson’s ObjectMapper to translate the incoming JSON into an instance of MyRequest. After the call, MyResponse will be translated back to JSON. You can omit the @Valid annotation if you are not using Bean Validation.
We start by writing an annotation, @WithRateLimitProtection, that marks HTTP endpoints, such as processRequest above, that should be protected by a rate limit. We define the annotation like this:
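A minimal sketch of such a marker annotation, assuming it carries no attributes and is only placed on methods. It uses nothing but the JDK, so it compiles without Spring on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marker annotation for endpoint methods that should be rate limited.
// RUNTIME retention keeps the annotation visible at run time, which an
// annotation-based pointcut needs in order to match the method.
// Declare it public if it is used from other packages.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface WithRateLimitProtection {
}
```

Restricting the target to methods makes the compiler reject the annotation on classes or fields, where the aspect would not act on it anyway.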
Now we can add this annotation to the controller method that should be protected by a rate limit:
If the rate limit is exceeded at the endpoint, a RateLimitException should be thrown that we define like this:
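A minimal sketch of such an exception, assuming it is unchecked and only carries a message. The version used in a real project may hold more details about the rejected request.

```java
// Thrown when a caller exceeds the configured rate limit.
// Unchecked, so annotated endpoint methods need no throws clause.
class RateLimitException extends RuntimeException {

    RateLimitException(String message) {
        super(message);
    }
}
```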
The error details are carried by an ApiErrorMessage, which is translated to a JSON body in the response, so that our JSON API also answers with JSON in case of an error and not with the Spring default, i.e. an HTML page:
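One possible shape for such an error body, as a sketch. The field names (status, error, message, path, timestamp) and the factory method tooManyRequests are assumptions made for illustration, not a fixed contract. The class exposes plain getters so that a JSON mapper such as Jackson can serialize it.

```java
import java.time.Instant;

// Immutable error body for JSON responses. All field names are illustrative.
final class ApiErrorMessage {
    private final int status;        // HTTP status code, e.g. 429
    private final String error;      // reason phrase, e.g. "Too Many Requests"
    private final String message;    // human-readable detail
    private final String path;       // request path that was rejected
    private final String timestamp;  // ISO-8601 instant of creation

    ApiErrorMessage(int status, String error, String message, String path) {
        this.status = status;
        this.error = error;
        this.message = message;
        this.path = path;
        this.timestamp = Instant.now().toString();
    }

    // Hypothetical convenience factory for the rate limit case.
    static ApiErrorMessage tooManyRequests(String message, String path) {
        return new ApiErrorMessage(429, "Too Many Requests", message, path);
    }

    public int getStatus() { return status; }
    public String getError() { return error; }
    public String getMessage() { return message; }
    public String getPath() { return path; }
    public String getTimestamp() { return timestamp; }
}
```

Storing the timestamp as an ISO-8601 string keeps the sketch independent of any date/time serialization settings.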
Let’s define an aspect that implements the rate limiting using Spring AOP. The aspect is called before the marked endpoint method is called:
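The counting logic at the heart of such an aspect can be sketched in plain Java, without the Spring wiring. The class name SlidingWindowRateLimiter and the method check are hypothetical. In a Spring aspect, the @Before advice would obtain the remote address from the current request and pass it in together with System.currentTimeMillis(). IllegalStateException stands in for the RateLimitException here to keep the sketch self-contained.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Core counting logic only; an aspect would call check(...) from its advice.
class SlidingWindowRateLimiter {
    private final int rateLimit;        // max. calls per window, e.g. 200
    private final long rateDurationMs;  // window length in ms, e.g. 60_000

    // One list of request timestamps per remote address, kept in RAM.
    private final ConcurrentHashMap<String, List<Long>> requestCounts =
            new ConcurrentHashMap<>();

    SlidingWindowRateLimiter(int rateLimit, long rateDurationMs) {
        this.rateLimit = rateLimit;
        this.rateDurationMs = rateDurationMs;
    }

    // Records one call at time nowMs and throws if the address is over the limit.
    void check(String remoteAddress, long nowMs) {
        List<Long> timestamps = requestCounts.computeIfAbsent(
                remoteAddress, key -> new CopyOnWriteArrayList<>());
        timestamps.add(nowMs);
        cleanUpRequestCounts(nowMs);
        if (timestamps.size() > rateLimit) {
            // Stand-in for the RateLimitException described in the text.
            throw new IllegalStateException(
                    "Too many requests from " + remoteAddress);
        }
    }

    // Drops timestamps older than the window. Empty lists stay in the map,
    // so memory grows with the number of distinct addresses.
    private void cleanUpRequestCounts(long nowMs) {
        requestCounts.values().forEach(
                list -> list.removeIf(t -> nowMs - t > rateDurationMs));
    }
}
```

Note that in this sketch rejected calls are recorded as well, so a client that keeps sending requests stays blocked until it slows down. Passing the current time as a parameter makes the logic easy to test without waiting for real time to pass.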
The @Before annotation declares an advice: it tells Spring AOP to call this method before a method annotated with @WithRateLimitProtection is executed. We use Spring’s RequestContextHolder to fetch information about the request that called the annotated endpoint. We are especially interested in the remote address, and we obtain the current time in milliseconds from the system. Per remote address, we maintain a list of request timestamps, which we store in a ConcurrentHashMap in RAM. In my use case, there was only a limited group of calling remote addresses, and timestamps older than rateDuration are deleted by cleanUpRequestCounts on each call, so we do not run out of memory here. If requestCounts contains more entries for the current remote address than rateLimit, we throw a RateLimitException with information about who has violated the rate limit.
To configure the rate limit, add the following to your application.properties or application.yml:
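As a sketch in application.properties form: the key app.rate.limit is inferred from the environment variable APP_RATE_LIMIT used in this post, while the key for the duration is an assumption and has to match whatever name the aspect actually reads.

```properties
# Maximum number of calls per remote address within one window
app.rate.limit=200
# Window length in milliseconds (one minute); this key name is an assumption
app.rate.durationinms=60000
```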
So we allow up to 200 calls per minute from the same remote address. As usual with Spring Boot, you can also override the values using environment variables, e.g. export APP_RATE_LIMIT=100. Moreover, we have hard-coded defaults in the Java code in case those properties are not specified.
So far, with the code above, a RateLimitException is thrown if there are too many requests. We still need exception handling in Spring that translates the thrown RateLimitException into an HTTP 429 response with some information about the error in its response body. For that, we write an exception handler:
This blog post has shown how to limit requests per remote address for a Spring Boot server using just Spring AOP. If your requirements do not allow you to use the small, simple solution above and you can add dependencies, you may check out Resilience4j as an alternative. If you have multiple instances of your server, it might also be interesting to read this blog post, which describes a solution with Bucket4j for distributed rate limiting.