Rate Limiting
RateLimitLayer caps how often a single principal can hit the router. The
shipped algorithm is a per-key token bucket with configurable burst size
and refill rate. Banks use it to dampen abuse on customer-facing channels
without writing per-route guards.
Wiring
RateLimitConfig carries:
- burst: maximum tokens in the bucket (the largest peak the layer accepts)
- refill_per_second: tokens added back per wall-clock second
A config of (60, 1.0), meaning burst 60 and refill_per_second 1.0, lets a
caller burst 60 requests and then sustain 1 request per second.
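The burst and refill arithmetic can be illustrated with a single bucket. This is a minimal sketch, not the layer's actual code: TokenBucket, its fields, and try_consume are hypothetical names, and the current time is passed in as a parameter (an assumption made so the behaviour is deterministic).

```rust
use std::time::Instant;

/// Hypothetical single bucket; names are illustrative, not the layer's API.
struct TokenBucket {
    burst: f64,
    refill_per_second: f64,
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// A new bucket starts full, so the first `burst` requests pass at once.
    fn new(burst: u32, refill_per_second: f64, now: Instant) -> Self {
        TokenBucket {
            burst: burst as f64,
            refill_per_second,
            tokens: burst as f64,
            last_refill: now,
        }
    }

    /// Refills for the elapsed time (capped at `burst`), then tries to take
    /// one token. Returns Ok(whole tokens remaining) or Err(seconds to wait).
    fn try_consume(&mut self, now: Instant) -> Result<u32, u64> {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_second).min(self.burst);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(self.tokens as u32)
        } else {
            let wait = (1.0 - self.tokens) / self.refill_per_second;
            Err(wait.ceil() as u64)
        }
    }
}
```

With burst 60 and refill 1.0, sixty calls at the same instant succeed, the sixty-first is rejected with a one-second wait, and a call one second later succeeds again.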
Request flow
For every request the layer:
- derives a key from the request (default: SHA-256 fingerprint of the Authorization header)
- asks the store to consume one token
- either forwards the request and adds X-RateLimit-Limit and X-RateLimit-Remaining headers to the response
- or returns 429 Too Many Requests with Retry-After: <seconds> and an explanation body
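The last two steps amount to mapping the store's decision onto a response. The sketch below shows one way that mapping could look. Outcome and apply_decision are hypothetical names, the field types are guesses, and reporting the burst size in X-RateLimit-Limit is an assumption rather than something this page states.

```rust
/// Mirrors the two outcomes this page names for RateLimitDecision;
/// the field types are assumptions.
enum RateLimitDecision {
    Allowed { remaining: u32 },
    Throttled { retry_after_secs: u64 },
}

/// Hypothetical response summary: status code plus rate-limit headers.
#[derive(Debug, PartialEq)]
struct Outcome {
    status: u16,
    headers: Vec<(&'static str, String)>,
}

/// `upstream_status` stands in for whatever the inner handler returned.
fn apply_decision(decision: RateLimitDecision, burst: u32, upstream_status: u16) -> Outcome {
    match decision {
        // Forwarded: keep the handler's status and annotate the response.
        RateLimitDecision::Allowed { remaining } => Outcome {
            status: upstream_status,
            headers: vec![
                ("X-RateLimit-Limit", burst.to_string()),
                ("X-RateLimit-Remaining", remaining.to_string()),
            ],
        },
        // Rejected: the handler is never called; tell the caller when to retry.
        RateLimitDecision::Throttled { retry_after_secs } => Outcome {
            status: 429,
            headers: vec![("Retry-After", retry_after_secs.to_string())],
        },
    }
}
```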
Key function
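A key function maps each request to the string that identifies its bucket. As a rough illustration of a tenant-scoped variant, the sketch below keys every principal of a tenant to one shared bucket. RequestParts, its fields, and tenant_scoped_key are hypothetical names, not the layer's actual API.

```rust
/// Hypothetical view of the request fields a key function might read.
struct RequestParts<'a> {
    tenant_id: Option<&'a str>,
    principal_fingerprint: &'a str,
}

/// Tenant-scoped key: all principals of one tenant share one bucket.
/// Falls back to the per-principal fingerprint when no tenant is known.
fn tenant_scoped_key(req: &RequestParts<'_>) -> String {
    match req.tenant_id {
        Some(tenant) => format!("tenant:{tenant}"),
        None => format!("principal:{}", req.principal_fingerprint),
    }
}
```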
The default fingerprint matches the idempotency layer’s. Banks running tenant-scoped budgeting override it.
Stores
The shipped implementation is InMemoryRateLimitStore, a HashMap of
buckets behind a Mutex. It is appropriate for:
- single-replica deployments
- development and testing
- per-pod fairness in deployments where the upstream load balancer already shards by principal
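To make the "HashMap of buckets behind a Mutex" shape concrete, here is a minimal synchronous sketch. It is not the shipped InMemoryRateLimitStore: InMemoryStore, Bucket, and try_consume are illustrative names, and the time is passed in explicitly so the example is deterministic.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::Instant;

/// Per-key state: tokens left and when they were last topped up.
struct Bucket {
    tokens: f64,
    last_refill: Instant,
}

/// Illustrative store: one lock guards the whole key map.
struct InMemoryStore {
    burst: f64,
    refill_per_second: f64,
    buckets: Mutex<HashMap<String, Bucket>>,
}

impl InMemoryStore {
    fn new(burst: u32, refill_per_second: f64) -> Self {
        InMemoryStore {
            burst: burst as f64,
            refill_per_second,
            buckets: Mutex::new(HashMap::new()),
        }
    }

    /// Returns true if `key` may proceed at time `now`.
    fn try_consume(&self, key: &str, now: Instant) -> bool {
        let mut buckets = self.buckets.lock().unwrap();
        // Unknown keys get a fresh, full bucket. Entries are never removed.
        let bucket = buckets.entry(key.to_owned()).or_insert(Bucket {
            tokens: self.burst,
            last_refill: now,
        });
        let elapsed = now.saturating_duration_since(bucket.last_refill).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * self.refill_per_second).min(self.burst);
        bucket.last_refill = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Because the map lives in process memory, each replica counts only the traffic it sees itself, which is why this shape suits the cases listed above.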
The RateLimitStore trait is async and dyn-compatible, so a shared backend
can be swapped in for multi-replica deployments. Redis-backed
implementations are the typical choice.
RateLimitDecision is either Allowed { remaining } or
Throttled { retry_after_secs }.
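One way an async, dyn-compatible store trait can be expressed with only the standard library is to return a boxed future. The sketch below is a guess at the general shape, not the real definition: the consume method name, the field types, FixedBudgetStore, and the busy-polling block_on helper are all assumptions made for illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Debug, PartialEq)]
enum RateLimitDecision {
    Allowed { remaining: u32 },
    Throttled { retry_after_secs: u64 },
}

/// Returning a boxed future keeps the trait usable as `dyn RateLimitStore`.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical trait shape; the method name is an assumption.
trait RateLimitStore: Send + Sync {
    fn consume<'a>(&'a self, key: &'a str) -> BoxFuture<'a, RateLimitDecision>;
}

/// Toy store with one shared budget and no refill, standing in for a
/// network-backed store such as Redis.
struct FixedBudgetStore {
    remaining: AtomicU32,
}

impl RateLimitStore for FixedBudgetStore {
    fn consume<'a>(&'a self, _key: &'a str) -> BoxFuture<'a, RateLimitDecision> {
        Box::pin(async move {
            // Atomically take one token if any are left.
            let taken = self
                .remaining
                .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| n.checked_sub(1));
            match taken {
                Ok(previous) => RateLimitDecision::Allowed { remaining: previous - 1 },
                Err(_) => RateLimitDecision::Throttled { retry_after_secs: 1 },
            }
        })
    }
}

/// Waker that does nothing; enough for futures that never actually wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Demo-only executor that busy-polls. Use a real runtime in practice.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Holding the store as Arc<dyn RateLimitStore> lets a layer switch between an in-memory and a shared backend without changing its own type.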
Choosing parameters
Practical starting points:
- customer-facing read endpoints: burst 30, refill 2.0 (accommodates page-load bursts)
- mutating endpoints: burst 10, refill 0.5 (the same caller can do meaningful work but cannot script floods)
- operator/back-office endpoints: burst 600, refill 10.0 (humans behind a workstation, not bots)
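Two derived numbers help when comparing candidate settings: how long an empty bucket takes to refill, and the most requests a caller can get through in a given window. The helpers below are a small sketch of that arithmetic. The function names are made up for this example.

```rust
/// Seconds for an empty bucket to refill completely.
fn seconds_to_full(burst: u32, refill_per_second: f64) -> f64 {
    burst as f64 / refill_per_second
}

/// Upper bound on requests accepted in a window that starts with a full
/// bucket: the initial burst plus everything refilled during the window.
fn max_requests_in_window(burst: u32, refill_per_second: f64, window_secs: f64) -> f64 {
    burst as f64 + refill_per_second * window_secs
}
```

For the starting points above, the read profile refills in 15 seconds and admits at most 150 requests in a minute, the mutating profile refills in 20 seconds and admits at most 40, and the operator profile refills in 60 seconds and admits at most 1200.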
Caveats
- InMemoryRateLimitStore does not bound the key map. Long-running processes facing a high-cardinality key space (per-IP, per-session) should swap to a TTL-aware store.
- The token bucket is wall-clock-driven; a process pause longer than one bucket-fill window grants a fresh burst on resume.
- The shipped store does not persist across restarts. That is the right choice for per-pod fairness and the wrong choice for global enforcement.
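For the unbounded key map, one possible mitigation is a periodic sweep that drops idle entries. The sketch below assumes the store tracks a last-seen timestamp per key. evict_idle and the map layout are hypothetical and not part of the shipped store.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Drops entries idle for longer than `ttl`; returns how many were removed.
fn evict_idle(last_seen: &mut HashMap<String, Instant>, now: Instant, ttl: Duration) -> usize {
    let before = last_seen.len();
    last_seen.retain(|_, seen| now.saturating_duration_since(*seen) <= ttl);
    before - last_seen.len()
}
```

A TTL of at least burst / refill_per_second seconds is a natural choice: a bucket idle that long has refilled completely, so dropping it and recreating a full one on the next request changes nothing for the caller.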
Read Next
- Idempotency for the duplicate-execution protection that pairs naturally with rate limiting
- Auth provider for the principal model the key function reads from