Reflections
After this lesson, you should be able to explain why rate limiting in the example codebase was implemented the way it was.
Rate limiting is a complex topic with many trade-offs. The following sections outline some of the key design decisions made in the example codebase, along with the rationale behind them.
Why were certain things implemented the way they were?
Apply Rate Limiting at the Middleware Layer
Rate limiting should be applied as early as possible in the request lifecycle. In the example codebase, rate limiting is implemented as tRPC middleware:
```ts
const rateLimitMiddleware = t.middleware(async ({ ctx, next, meta }) => {
  const rateLimitOptions =
    meta?.rateLimitOptions === undefined ? {} : meta.rateLimitOptions;

  // Allow procedures to opt out of rate limiting
  if (rateLimitOptions === null) {
    return next();
  }

  // Skip rate limiting in tests
  if (env.NODE_ENV === "test") {
    return next();
  }

  await checkRateLimit({
    key: createRateLimitFingerprint({
      ipAddress: extractIpAddress(ctx.headers),
      userId: ctx.session.userId,
    }),
    options: rateLimitOptions,
  });

  return next();
});

// Apply to all procedures by default
const defaultProcedure = t.procedure
  .use(timingMiddleware)
  .use(rateLimitMiddleware);
```
Why Middleware?
- Defense in depth: Requests are rejected before reaching business logic
- Centralized configuration: One place to manage rate limiting rules
- Consistent enforcement: All procedures are protected by default
- Easy overrides: Individual procedures can customize or disable limits
Opt-in vs Opt-out
The example uses an opt-out approach — rate limiting is enabled by default:
```ts
interface Meta {
  // Rate limit options for this procedure. If null, rate limiting is disabled.
  // Defaults to an empty object, which applies the default rate limiting.
  rateLimitOptions?: RateLimiterConfig | null;
}
```
This is safer than opt-in because:
- New procedures are automatically protected
- Developers must explicitly disable protection
- Reduces the chance of forgetting to add rate limiting
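For the meta field to be type-checked, the Meta interface has to be registered when tRPC is initialized. A minimal sketch of that wiring, assuming the usual initTRPC setup with an app-defined Context type:

```ts
import { initTRPC } from "@trpc/server";

// Registering Meta makes .meta({ rateLimitOptions: ... }) type-safe
// on every procedure built from this instance
const t = initTRPC.context<Context>().meta<Meta>().create();
```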
Using Composite Keys for Identification
The rate limit key determines "who" is being rate limited. The example codebase uses a composite key strategy:
```ts
export const createRateLimitFingerprint = ({
  userId,
  ipAddress,
}: {
  userId: string | undefined;
  ipAddress: string | null;
}) => {
  // Prefer the user ID for authenticated users
  if (userId) {
    return `userId:${userId}`;
  }

  // Fall back to the IP for unauthenticated requests
  return `ip:${ipAddress?.replaceAll(":", "_") ?? "unknown"}`;
};
```
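In practice the fingerprint looks like this (values illustrative):

```ts
createRateLimitFingerprint({ userId: "abc123", ipAddress: "203.0.113.7" });
// => "userId:abc123" (the IP is ignored once the user is authenticated)

createRateLimitFingerprint({ userId: undefined, ipAddress: "203.0.113.7" });
// => "ip:203.0.113.7"
```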
Key Strategy Considerations
| Strategy | Pros | Cons | Use Case |
|---|---|---|---|
| IP Address Only | Works for all requests | Shared IPs (offices, NAT) affect all users | Public APIs, unauthenticated endpoints |
| User ID Only | Accurate per-user limiting | Doesn't work before authentication | Authenticated endpoints |
| Composite (IP + User) | Best of both worlds | Slightly more complex | General purpose (recommended) |
| API Key | Clean separation per consumer | Requires API key infrastructure | Third-party integrations |
Handling IPv6
Note the IPv6 handling in the example — colons are replaced with underscores:
```ts
return `ip:${ipAddress?.replaceAll(":", "_") ?? "unknown"}`;
```
This prevents issues with Redis key naming conventions that use colons as namespace separators.
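For example, an IPv6 address stays unique but no longer injects extra colon-delimited segments into the key (address illustrative):

```ts
createRateLimitFingerprint({ userId: undefined, ipAddress: "2001:db8::1" });
// => "ip:2001_db8__1"
// Full Redis key: rate-limit:app:ip:2001_db8__1
```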
Implement Bursty Rate Limiting
Real user behavior is bursty — a user might rapidly click through several pages, then pause to read. The example implements a dual-bucket approach:
```ts
/**
 * Defaults to 2 requests per 10 seconds,
 * and allows bursts of up to 5 additional requests per 10 seconds.
 *
 * This should be enough to allow normal usage.
 */
const defaultConfig: Required<RateLimiterConfig> = {
  points: 2,
  duration: 10, // Sustained rate: 2 requests per 10 seconds
  burstPoints: 5,
  burstDuration: 10, // Burst allowance: 5 requests per 10 seconds
  keyPrefix: "app",
};
```
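The RateLimiterConfig type itself is not shown in the example; given the Required&lt;RateLimiterConfig&gt; annotation and the fields above, it presumably looks roughly like this (an inferred sketch, not the codebase's actual definition):

```ts
interface RateLimiterConfig {
  points?: number;        // Requests allowed per duration
  duration?: number;      // Window length, in seconds
  burstPoints?: number;   // Extra requests the burst limiter can absorb
  burstDuration?: number; // Burst window length, in seconds
  keyPrefix?: string;     // Namespace for the underlying Redis keys
}
```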
How Bursty Limiting Works
```
Normal Rate Limiter:          Burst Rate Limiter:
┌──────────────────┐          ┌──────────────────┐
│  2 points / 10s  │          │  5 points / 10s  │
└────────┬─────────┘          └────────┬─────────┘
         │                             │
         └──────────────┬──────────────┘
                        ▼
                BurstyRateLimiter
```
A request consumes from the normal limiter first; once it has no points left, the overflow is absorbed by the burst limiter. Only when both are exhausted is the request rejected.
This allows:
- Normal usage: Up to 2 requests per 10 seconds sustained
- Quick bursts: Up to 5 additional requests absorbed in a 10-second window
- Protection: Sustained high-volume abuse is still blocked
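Concretely, with the default config, a burst of activity plays out roughly like this (timings illustrative):

```
t=0s   requests 1-2  → consumed from the normal limiter (now empty)
t=1s   requests 3-7  → overflow absorbed by the burst limiter (now empty)
t=2s   request 8     → both limiters exhausted → 429 Too Many Requests
t=10s  both windows expire and the counters reset
```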
When to Adjust Burst Settings
| Scenario | Recommended Adjustment |
|---|---|
| Form submissions | Lower burst (prevent double-submits) |
| Page navigation | Higher burst (users click around) |
| Search/autocomplete | Higher sustained rate (frequent requests) |
| File uploads | Lower rate, custom per-endpoint |
| Admin operations | Stricter limits (sensitive actions) |
Use Redis for Distributed Rate Limiting
In a multi-server deployment, in-memory rate limiting fails because each server has its own counter:
```
Without Redis:
┌─────────────┐     ┌─────────────┐
│  Server A   │     │  Server B   │
│ Count: 5/10 │     │ Count: 5/10 │
└─────────────┘     └─────────────┘
       ↑                   ↑
       └───── Same user ───┘
```
User makes 10 requests total, but neither server blocks them!
The example codebase uses Redis with an in-memory fallback:
```ts
import {
  BurstyRateLimiter,
  RateLimiterMemory,
  RateLimiterRedis,
} from "rate-limiter-flexible";

const createRateLimiter = (config: RateLimiterConfig): BurstyRateLimiter => {
  // Merge the caller's options with the defaults shown earlier
  const mergedConfig = { ...defaultConfig, ...config };

  // In-memory fallback limiter
  const memoryLimiter = new RateLimiterMemory({
    points: mergedConfig.points,
    duration: mergedConfig.duration,
  });

  // If no Redis, use memory-only limiters
  if (!redis) {
    const memoryBurstLimiter = new RateLimiterMemory({
      points: mergedConfig.burstPoints,
      duration: mergedConfig.burstDuration,
    });
    return new BurstyRateLimiter(memoryLimiter, memoryBurstLimiter);
  }

  // Redis-backed limiter with memory fallback
  return new BurstyRateLimiter(
    new RateLimiterRedis({
      storeClient: redis,
      rejectIfRedisNotReady: true,
      points: mergedConfig.points,
      duration: mergedConfig.duration,
      keyPrefix: `${RATE_LIMIT_NAMESPACE_KEY}${mergedConfig.keyPrefix}:`,
      insuranceLimiter: memoryLimiter, // Fallback if Redis fails
    })
    // ... burst limiter config
  );
};
```
Why Insurance Limiters?
The insuranceLimiter option provides resilience:
- Redis goes down: Memory limiter takes over, limiting still works
- Redis latency spike: Requests aren't blocked waiting for Redis
- Development mode: Can work without Redis infrastructure (though there is a local Docker Redis setup)
Insurance limiters are per-server, so protection is degraded (not coordinated) when Redis is unavailable. Monitor Redis health in production.
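The redis client itself is not shown above. A typical setup uses ioredis with the offline queue disabled, which rate-limiter-flexible's documentation recommends so that commands fail fast and the insurance limiter takes over promptly (a sketch; env.REDIS_URL is an assumption):

```ts
import Redis from "ioredis";

// Fail fast when Redis is unreachable instead of queueing commands,
// so RateLimiterRedis falls back to the insuranceLimiter quickly
const redis = env.REDIS_URL
  ? new Redis(env.REDIS_URL, { enableOfflineQueue: false })
  : undefined;
```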
Use Appropriate Error Responses
When a rate limit is exceeded, provide clear, actionable feedback:
```ts
import { TRPCError } from "@trpc/server";

export class TRPCRateLimitError extends TRPCError {
  retryAfterSeconds: number;

  constructor({
    message,
    retryAfterSeconds,
  }: {
    message?: string;
    retryAfterSeconds: number;
  }) {
    super({
      code: "TOO_MANY_REQUESTS",
      message:
        message ??
        `Rate limit exceeded. Please try again in ${retryAfterSeconds} seconds.`,
      cause: {
        retryAfterSeconds,
      },
    });
    this.retryAfterSeconds = retryAfterSeconds;
  }
}
```
Best Practices for Error Responses
- Use HTTP 429 (Too Many Requests): The standard status code for rate limiting
- Include a Retry-After header: Tell clients when they can retry (see the sketch below)
- Provide a human-readable message: Help users understand what happened
- Log rate limit events: Track abuse patterns for analysis
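When a TRPCRateLimitError bubbles up, its retryAfterSeconds value can be surfaced as a standard Retry-After header. A minimal sketch, assuming tRPC's responseMeta hook on the HTTP adapter (the adapter choice and router name here are illustrative):

```ts
import { createHTTPHandler } from "@trpc/server/adapters/standalone";

const handler = createHTTPHandler({
  router: appRouter,
  responseMeta({ errors }) {
    // Surface rate limit errors as 429 with a Retry-After header
    for (const error of errors) {
      if (error instanceof TRPCRateLimitError) {
        return {
          status: 429, // Too Many Requests
          headers: { "Retry-After": String(error.retryAfterSeconds) },
        };
      }
    }
    return {};
  },
});
```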
Frontend Handling
On the frontend, handle rate limit errors gracefully:
```ts
// Example: React Query error handling
const mutation = useMutation({
  mutationFn: submitForm,
  onError: (error) => {
    if (error.data?.code === "TOO_MANY_REQUESTS") {
      const retryAfter = error.cause?.retryAfterSeconds ?? 60;
      toast.error(`Too many attempts. Please wait ${retryAfter} seconds.`);
      return;
    }
    toast.error("Something went wrong. Please try again.");
  },
});
```
Allow for Per-Endpoint Customization
Different endpoints have different risk profiles. The example allows per-procedure configuration via tRPC meta:
```ts
// High-risk endpoint: stricter limits
login: publicProcedure
  .meta({
    rateLimitOptions: {
      points: 5,
      duration: 60, // 5 attempts per minute
      keyPrefix: "auth:login",
    },
  })
  .mutation(/* ... */);

// Low-risk read endpoint: more lenient
getPublicData: publicProcedure
  .meta({
    rateLimitOptions: {
      points: 100,
      duration: 60, // 100 requests per minute
      keyPrefix: "public:data",
    },
  })
  .query(/* ... */);

// Internal/trusted endpoint: disable rate limiting
healthCheck: publicProcedure.meta({ rateLimitOptions: null }).query(/* ... */);
```
Recommended Limits by Endpoint Type
| Endpoint Type | Suggested Limit | Rationale |
|---|---|---|
| Login/Auth | 5-10 / minute | Prevent brute force |
| OTP Request | 3-5 / 10 minutes | Prevent email spam |
| Form Submission | 10-20 / minute | Allow normal usage |
| API Read | 100-1000 / minute | Higher volume expected |
| File Upload | 5-10 / minute | Resource intensive |
| Admin Actions | 20-50 / minute | Sensitive, but trusted users |
Namespace Rate Limit Keys
Use prefixes to separate different rate limiters and enable easier debugging:
```ts
const defaultConfig: Required<RateLimiterConfig> = {
  // ...
  keyPrefix: "app",
};

// Results in Redis keys like:
//   rate-limit:app:userId:abc123
//   rate-limit:app:ip:192_168_1_1
//   rate-limit-burst:app:userId:abc123
```
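Because the key layout is predictable, you can inspect a live counter with the same Redis client; rate-limiter-flexible stores the consumed points as the key's value (a sketch; the key name is illustrative and the value format is the library's internal detail):

```ts
const key = "rate-limit:app:userId:abc123";
const [points, ttl] = await Promise.all([redis.get(key), redis.ttl(key)]);
console.log(`consumed=${points}, window resets in ${ttl}s`);
```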
Benefits of Namespacing
- Isolation: Different features don't interfere with each other
- Debugging: Easy to find and inspect keys in Redis
- Cleanup: Can delete all keys for a specific feature
- Monitoring: Track usage patterns per feature
Avoid namespacing too aggressively, though: if every endpoint gets its own counter, an attacker can spread requests across many endpoints while staying under each individual limit, adding pressure to your stack. Use logical groupings (e.g., auth, api, upload) to balance clarity and efficiency.
Example Namespace Structure
```
rate-limit:
├── app:
│   ├── userId:abc123
│   ├── ip:192_168_1_1
│   └── ip:192_168_1_2
├── auth:
│   ├── userId:abc123
│   └── userId:abc234
└── upload:
    └── userId:abc123
```
Consider Different Rate Limiting Scopes
Rate limiting can be applied at multiple levels:
1. Application Level (Default)
Rate limit all requests from a user/IP across the entire application.
2. Endpoint Level
Different limits for different endpoints (as shown above).
3. Resource Level
Limit operations on specific resources:
```ts
// Limit comments per thread
const threadRateLimitKey = `thread:${threadId}:comments:${userId}`;
await checkRateLimit({
  key: threadRateLimitKey,
  options: { points: 10, duration: 60 },
});
```
4. Global Level
Protect against distributed attacks:
```ts
// Global limit across all users
await checkRateLimit({
  key: "global:api",
  options: { points: 10000, duration: 60 },
});
```
Combining Scopes
For critical endpoints, combine multiple scopes:
```ts
// Example: OTP request endpoint
const sendOtp = async (email: string, ipAddress: string) => {
  // 1. Global limit (protect email infrastructure)
  await checkRateLimit({
    key: "global:otp",
    options: { points: 1000, duration: 60 },
  });

  // 2. Per-IP limit (prevent single-source abuse)
  await checkRateLimit({
    key: `otp:ip:${ipAddress}`,
    options: { points: 5, duration: 300 },
  });

  // 3. Per-email limit (prevent harassment)
  await checkRateLimit({
    key: `otp:email:${email}`,
    options: { points: 3, duration: 600 },
  });

  // All checks passed, send the OTP
  await sendOtpEmail(email);
};
```
Monitor and Alert on Rate Limiting
Rate limiting isn't just about blocking — it's also a signal:
What to Monitor
| Metric | Why It Matters |
|---|---|
| Rate limit hits | High counts may indicate an attack |
| Unique IPs blocked | Distributed attacks or shared networks |
| User IDs blocked | Compromised accounts or confused users |
| Block rate trends | Increasing blocks may indicate new attack patterns |
Logging Rate Limit Events
```ts
import { RateLimiterRes } from "rate-limiter-flexible";

export const checkRateLimit = async ({
  key,
  options,
}: {
  key: string;
  options: RateLimiterConfig;
}) => {
  // `limiter` is the BurstyRateLimiter built for these options
  // (cached so its counters persist across calls)
  try {
    return await limiter.consume(key);
  } catch (error) {
    if (error instanceof RateLimiterRes) {
      // Log for monitoring
      console.warn(
        `Rate limit exceeded: key=${key}, retryAfter=${error.msBeforeNext}ms`
      );
      // Could also send to a monitoring service:
      // metrics.increment('rate_limit.exceeded', { key_prefix: options.keyPrefix })
      throw new TRPCRateLimitError({
        retryAfterSeconds: Math.ceil(error.msBeforeNext / 1000),
      });
    }
    throw error;
  }
};
```
Alerting Thresholds
Set up alerts for:
- Spike in rate limit hits: Possible attack in progress
- Single IP hitting limits repeatedly: Targeted abuse
- Many users hitting limits: May indicate limits are too strict
- Redis errors: Rate limiting may be degraded
Test Rate Limiting Behavior
The example disables rate limiting in tests:
```ts
if (env.NODE_ENV === "test") {
  return next();
}
```
Testing Strategies
- Unit Tests: Test rate limit logic in isolation

```ts
it("should throw after exceeding limit", async () => {
  for (let i = 0; i < 5; i++) {
    await checkRateLimit({
      key: "test",
      options: { points: 5, duration: 60 },
    });
  }
  await expect(
    checkRateLimit({ key: "test", options: { points: 5, duration: 60 } })
  ).rejects.toThrow(TRPCRateLimitError);
});
```

- Integration Tests: Test with Redis (use a separate test database)

```ts
beforeEach(async () => {
  await redis.flushdb(); // Clear rate limit state
});
```

- Load Tests: Verify limits hold under stress

```sh
# Using k6 or similar
k6 run --vus 100 --duration 60s loadtest.js
```
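The loadtest.js referenced above can be as small as this (a sketch; the endpoint URL is a placeholder, and k6 scripts are plain ES modules):

```ts
import http from "k6/http";
import { check } from "k6";

export default function () {
  const res = http.get("https://app.example.com/api/trpc/getPublicData");
  // Under load, expect a mix of successes and rate-limited responses
  check(res, { "200 or 429": (r) => r.status === 200 || r.status === 429 });
}
```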
Common Pitfalls
1. Rate Limiting After Business Logic
❌ Wrong:

```ts
// BAD: Rate limit checked too late
const createThread = async (input) => {
  const thread = await db.thread.create({ data: input }); // Work already done!
  await checkRateLimit({ key: userId }); // Too late to prevent abuse
  return thread;
};
```

✅ Correct:

```ts
// GOOD: Rate limit checked first
const createThread = async (input) => {
  await checkRateLimit({ key: userId }); // Block abusive requests early
  const thread = await db.thread.create({ data: input });
  return thread;
};
```
2. Trusting Client-Provided Identifiers
❌ Wrong:

```ts
// BAD: Client can spoof their identifier
const key = req.body.userId; // Never trust client input for rate limiting!
```

✅ Correct: Deriving Identifiers Server-Side

```ts
// GOOD: Use server-verified identity
const key = ctx.session.userId ?? extractIpAddress(ctx.headers);
```
3. Single Point of Rate Limiting
❌ Wrong:

```ts
// BAD: Only rate limiting login attempts
// Attacker can still spam OTP requests, password resets, etc.
```

✅ Correct: Defense in Depth

```ts
// GOOD: Rate limit at multiple levels:
// - Global application limit
// - Per-endpoint limits for sensitive operations
// - Per-resource limits where appropriate
```
Summary
Rate limiting is essential for building secure, reliable applications. Key takeaways:
| Best Practice | Why It Matters |
|---|---|
| Apply at middleware layer | Reject abusive requests before business logic |
| Use composite keys | Identify users accurately across auth states |
| Implement bursty limiting | Allow natural usage patterns while blocking abuse |
| Use Redis for distributed systems | Ensure consistent limits across servers |
| Provide clear error responses | Help users understand and recover |
| Customize per endpoint | Match limits to risk profile |
| Namespace keys | Enable debugging and monitoring |
| Apply multiple scopes | Defense in depth |
| Monitor and alert | Detect and respond to attacks |
| Test thoroughly | Verify limits work as intended |
Remember: Rate limiting is just one layer of defense. Combine it with authentication, authorization, input validation, and other security measures for comprehensive protection.