A production-grade API rate limiting service demonstrating real-world backend infrastructure design: JWT-authenticated endpoints, subscription-based request quotas, Redis-backed distributed caching, and Docker-containerized deployment.
This is not a tutorial project — it solves a real infrastructure problem that every scalable API platform must address: preventing abuse while giving paying users fair, reliable access.
Key engineering decisions:
- Token Bucket over fixed-window — chosen for burst tolerance and smoother UX compared to sliding-window or leaky-bucket approaches
- Redis over in-memory — enables horizontal scaling across multiple service instances with a single source of truth for rate counters
- Filter-chain placement — rate limiting sits after JWT authentication, ensuring anonymous traffic is rejected before consuming quota resources
- Custom annotations — `@BypassRateLimit` and `@PublicEndpoint` decouple rate-limit logic from business logic cleanly
| Layer | Technology |
|---|---|
| Language | Java 17 |
| Framework | Spring Boot 3 |
| Security | Spring Security 6 + JWT |
| Rate Limiting | Bucket4j (Token Bucket Algorithm) |
| Cache | Redis 7 |
| Database | MySQL 8 |
| DB Migrations | Flyway |
| Testing | Testcontainers |
| Containerization | Docker + Docker Compose |
Scaling model: Multiple API instances share a single Redis cache. Rate counters are centralized — adding instances does not inflate per-user limits.
Rate limits are seeded at startup via Flyway migration scripts, keeping configuration version-controlled and reproducible.
| Plan | Requests / Hour |
|---|---|
| FREE | 20 |
| BUSINESS | 40 |
| PROFESSIONAL | 100 |
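These plan limits could be seeded by a Flyway migration along these lines — a sketch only; the file name, table, and column names are assumptions, not taken from the repo:

```
-- V2__seed_rate_limit_plans.sql (hypothetical file name)
-- Table/column names are illustrative; the real schema may differ.
INSERT INTO rate_limit_plans (name, requests_per_hour)
VALUES ('FREE', 20),
       ('BUSINESS', 40),
       ('PROFESSIONAL', 100);
```

Because the migration is versioned alongside the code, every environment gets identical plan limits without manual setup.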
Every API response includes rate limit metadata, allowing clients to build smart retry logic:
```
X-Rate-Limit-Remaining: 5
X-Rate-Limit-Retry-After-Seconds: 3600
```

Each authenticated user is assigned a Redis-backed token bucket scoped to their plan.
On each request:
- `tokens_remaining > 0` → consume 1 token, allow request
- `tokens_remaining = 0` → reject with HTTP 429
Bucket refills at the start of each time window (hourly).
Benefits over simpler counters:
- Handles bursts gracefully
- No thundering-herd on window reset
- State is atomic and distributed
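The consume/refill behavior described above can be sketched in plain Java. This standalone version is only an illustration — the project itself delegates bucket state to Bucket4j backed by Redis, which is what makes it atomic and distributed:

```java
import java.time.Duration;

// Minimal fixed-refill token bucket: the full quota is restored once per
// window, mirroring the hourly refill described above. Not thread-safe or
// distributed — the real service uses Bucket4j over Redis for that.
class SimpleTokenBucket {
    private final long capacity;
    private final long windowNanos;
    private long tokens;
    private long windowStart;

    SimpleTokenBucket(long capacity, Duration window, long nowNanos) {
        this.capacity = capacity;
        this.windowNanos = window.toNanos();
        this.tokens = capacity;
        this.windowStart = nowNanos;
    }

    // Returns true (and consumes one token) if the request is allowed,
    // false if the caller should respond with HTTP 429.
    boolean tryConsume(long nowNanos) {
        if (nowNanos - windowStart >= windowNanos) {
            tokens = capacity;          // refill at the start of each window
            windowStart = nowNanos;
        }
        if (tokens > 0) {
            tokens--;
            return true;
        }
        return false;
    }

    // Value a filter could surface as X-Rate-Limit-Remaining.
    long remaining() {
        return tokens;
    }
}
```

Passing the clock in as a parameter instead of calling `System.nanoTime()` internally keeps the sketch deterministic and easy to test.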
Clean separation of concerns — rate limiting policy is declared at the endpoint level, not hardcoded in filters.
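Both annotations can be plain marker annotations that the filters detect via reflection. A minimal sketch of what `@BypassRateLimit` might look like — the shape is assumed here, the actual source may differ:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marker annotation: carries no data, it only tags handler methods so the
// rate-limit filter can skip them. RUNTIME retention is required so the
// filter can see the annotation via reflection at request time.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface BypassRateLimit {}

// Hypothetical controller used only to demonstrate detection.
class PlanController {
    @BypassRateLimit
    void upgradePlan() { }

    void listJokes() { }
}

class AnnotationDemo {
    // The kind of check a filter could run against the resolved handler method.
    static boolean hasBypass(Class<?> controller, String methodName) {
        try {
            return controller.getDeclaredMethod(methodName)
                             .isAnnotationPresent(BypassRateLimit.class);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

In the real application a Spring `HandlerInterceptor` or filter would resolve the handler method for the incoming request and run a check like `hasBypass` before consuming a token.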
```java
// Allows plan upgrades even when quota is exhausted
@BypassRateLimit
@PutMapping("/api/v1/plan")
public ResponseEntity<?> upgradePlan(...) { }
```
```java
// Skips both authentication and rate limiting (e.g. health checks, login)
@PublicEndpoint
@GetMapping("/plan")
public ResponseEntity<?> getPlans(...) { }
```

```
rate-limiting-api-spring-boot/
│
├── src/main/java/com/ratelimiter/
│   ├── config/
│   │   └── RedisConfiguration.java        # Redis connection + Bucket4j setup
│   ├── security/
│   │   ├── SecurityConfiguration.java     # Spring Security filter chain config
│   │   └── JwtAuthenticationFilter.java   # Token extraction + user resolution
│   ├── filter/
│   │   └── RateLimitFilter.java           # Per-request quota enforcement
│   ├── service/
│   │   └── RateLimitingService.java       # Plan resolution + bucket management
│   ├── annotations/
│   │   ├── BypassRateLimit.java           # Skip rate limiting on endpoint
│   │   └── PublicEndpoint.java            # Skip auth + rate limiting
│   └── controllers/
│
├── resources/
│   ├── application.yml
│   └── db/migration/                      # Flyway versioned SQL scripts
│
├── Dockerfile
├── docker-compose.yml
└── README.md
```
Prerequisites: Docker, Java 17, Maven
```shell
mvn clean package
docker-compose up --build
```

This starts:
- Spring Boot API → `localhost:8080`
- Redis → `localhost:6379`
- MySQL → `localhost:3306`
API documentation (Swagger UI): `http://localhost:8080/swagger-ui.html`
Integration tests use Testcontainers — no mocking of Redis or MySQL, tests run against real containerized instances.
```shell
# Unit tests
mvn test

# Integration tests (spins up Redis + MySQL containers automatically)
mvn integration-test
```

Key test classes:
- `RateLimitingServiceIT` — validates bucket creation, token consumption, and plan enforcement
- `JokeControllerIT` — end-to-end HTTP tests including 429 response behavior
```
                 ┌─────────────┐
                 │    Redis    │  ← Shared rate-limit state
                 └──────┬──────┘
       ┌────────────────┼────────────────┐
       ▼                ▼                ▼
[API Instance 1] [API Instance 2] [API Instance 3]
```
Because all instances share Redis, you can scale horizontally without duplicating or splitting quota. Each user's bucket is globally consistent.
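This property can be illustrated with a toy in-process model: several "instances" drawing from one shared counter still enforce a single global limit. Here an `AtomicLong` stands in for the Redis-backed bucket — a teaching sketch, not how the service is built:

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the scaling property above: N "API instances" share one
// counter, so allowed requests stay capped at the plan limit no matter how
// many instances serve traffic. Redis plays this role in the real system.
class SharedQuota {
    private final AtomicLong tokens;

    SharedQuota(long limit) {
        this.tokens = new AtomicLong(limit);
    }

    // Atomically take one token; false means the request gets HTTP 429.
    boolean tryConsume() {
        while (true) {
            long current = tokens.get();
            if (current == 0) return false;
            if (tokens.compareAndSet(current, current - 1)) return true;
        }
    }
}
```

If each instance instead kept its own in-memory counter, three instances would silently triple every user's quota — exactly the failure mode the shared Redis state prevents.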
- API Gateway integration (Kong / AWS API Gateway)
- Per-endpoint dynamic rate limits
- Prometheus + Grafana monitoring dashboard
- Distributed tracing (OpenTelemetry)
- Circuit breaker pattern (Resilience4j)
This project demonstrates proficiency in:
- Distributed systems design — shared Redis state, horizontal scalability
- Spring internals — custom filter chain ordering, annotation-driven configuration
- Security best practices — JWT auth decoupled from business logic
- Infrastructure as code — Flyway migrations, Docker Compose, Testcontainers
- Algorithm selection — reasoned choice of Token Bucket with documented tradeoffs