Reflections
After this lesson, you should be able to explain why rate limiting in the example codebase was implemented the way it was.
Rate limiting is a complex topic with many trade-offs. The following sections outline some of the key design decisions made in the example codebase, along with the rationale behind them.
Why were certain things implemented the way they were?
Apply Rate Limiting at the Middleware Layer
Rate limiting should be applied as early as possible in the request lifecycle. In the example codebase, rate limiting is implemented as tRPC middleware:
```ts
const rateLimitMiddleware = t.middleware(async ({ ctx, next, meta }) => {
  const rateLimitOptions =
    meta?.rateLimitOptions === undefined ? {} : meta.rateLimitOptions;

  // Allow procedures to opt out of rate limiting
  if (rateLimitOptions === null) {
    return next();
  }

  // Skip rate limiting in tests
  if (env.NODE_ENV === "test") {
    return next();
  }

  await checkRateLimit({
    key: createRateLimitFingerprint({
      ipAddress: extractIpAddress(ctx.headers),
      userId: ctx.session.userId,
    }),
    options: rateLimitOptions,
  });

  return next();
});

// Apply to all procedures by default
const defaultProcedure = t.procedure
  .use(timingMiddleware)
  .use(rateLimitMiddleware);
```
Why Middleware?
- Defense in depth: Requests are rejected before reaching business logic
- Centralized configuration: One place to manage rate limiting rules
- Consistent enforcement: All procedures are protected by default
- Easy overrides: Individual procedures can customize or disable limits
Opt-in vs Opt-out
The example uses an opt-out approach — rate limiting is enabled by default:
```ts
interface Meta {
  // Rate limit options for this procedure. If null, rate limiting is disabled.
  // Defaults to an empty object, which applies the default rate limiting.
  rateLimitOptions?: RateLimiterConfig | null;
}
```
This is safer than opt-in because:
- New procedures are automatically protected
- Developers must explicitly disable protection
- Reduces the chance of forgetting to add rate limiting
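For the meta field to be type-checked, the Meta interface has to be registered when tRPC is initialized. A minimal sketch of that wiring, assuming the usual initTRPC setup with an app-defined Context type:

```ts
import { initTRPC } from "@trpc/server";

// Registering Meta makes .meta({ rateLimitOptions: ... }) type-safe
// on every procedure built from this instance
const t = initTRPC.context<Context>().meta<Meta>().create();
```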
Using Composite Keys for Identification
The rate limit key determines "who" is being rate limited. The example codebase uses a composite key strategy:
```ts
export const createRateLimitFingerprint = ({
  userId,
  ipAddress,
}: {
  userId: string | undefined;
  ipAddress: string | null;
}) => {
  // Prefer the user ID for authenticated users
  if (userId) {
    return `userId:${userId}`;
  }

  // Fall back to the IP for unauthenticated requests
  return `ip:${ipAddress?.replaceAll(":", "_") ?? "unknown"}`;
};
```
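In practice the fingerprint looks like this (values illustrative):

```ts
createRateLimitFingerprint({ userId: "abc123", ipAddress: "203.0.113.7" });
// => "userId:abc123" (the IP is ignored once the user is authenticated)

createRateLimitFingerprint({ userId: undefined, ipAddress: "203.0.113.7" });
// => "ip:203.0.113.7"
```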
Key Strategy Considerations
| Strategy | Pros | Cons | Use Case |
|---|---|---|---|
| IP Address Only | Works for all requests | Shared IPs (offices, NAT) affect all users | Public APIs, unauthenticated endpoints |
| User ID Only | Accurate per-user limiting | Doesn't work before authentication | Authenticated endpoints |
| Composite (IP + User) | Best of both worlds | Slightly more complex | General purpose (recommended) |
| API Key | Clean separation per consumer | Requires API key infrastructure | Third-party integrations |
Handling IPv6
Note the IPv6 handling in the example — colons are replaced with underscores:
```ts
return `ip:${ipAddress?.replaceAll(":", "_") ?? "unknown"}`;
```
This prevents issues with Redis key naming conventions that use colons as namespace separators.
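For example, an IPv6 address stays unique but no longer injects extra colon-delimited segments into the key (address illustrative):

```ts
createRateLimitFingerprint({ userId: undefined, ipAddress: "2001:db8::1" });
// => "ip:2001_db8__1"
// Full Redis key: rate-limit:app:ip:2001_db8__1
```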
Implement Bursty Rate Limiting
Real user behavior is bursty — a user might rapidly click through several pages, then pause to read. The example implements a dual-bucket approach:
```ts
/**
 * Defaults to 2 requests per 10 seconds,
 * and allows bursts of up to 5 additional requests per 10 seconds.
 *
 * This should be enough to allow normal usage.
 */
const defaultConfig: Required<RateLimiterConfig> = {
  points: 2,
  duration: 10, // Sustained rate: 2 requests per 10 seconds
  burstPoints: 5,
  burstDuration: 10, // Burst allowance: 5 requests per 10 seconds
  keyPrefix: "app",
};
```
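The RateLimiterConfig type itself is not shown in the example; given the Required&lt;RateLimiterConfig&gt; annotation and the fields above, it presumably looks roughly like this (an inferred sketch, not the codebase's actual definition):

```ts
interface RateLimiterConfig {
  points?: number;        // Requests allowed per duration
  duration?: number;      // Window length, in seconds
  burstPoints?: number;   // Extra requests the burst limiter can absorb
  burstDuration?: number; // Burst window length, in seconds
  keyPrefix?: string;     // Namespace for the underlying Redis keys
}
```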
How Bursty Limiting Works
```
Normal Rate Limiter:          Burst Rate Limiter:
┌──────────────────┐          ┌──────────────────┐
│  2 points / 10s  │          │  5 points / 10s  │
└────────┬─────────┘          └────────┬─────────┘
         │                             │
         └──────────────┬──────────────┘
                        ▼
                BurstyRateLimiter
```
A request consumes from the normal limiter first; once it has no points left, the overflow is absorbed by the burst limiter. Only when both are exhausted is the request rejected.
This allows:
- Normal usage: Up to 2 requests per 10 seconds sustained
- Quick bursts: Up to 5 additional requests absorbed in a 10-second window
- Protection: Sustained high-volume abuse is still blocked
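Concretely, with the default config, a burst of activity plays out roughly like this (timings illustrative):

```
t=0s   requests 1-2  → consumed from the normal limiter (now empty)
t=1s   requests 3-7  → overflow absorbed by the burst limiter (now empty)
t=2s   request 8     → both limiters exhausted → 429 Too Many Requests
t=10s  both windows expire and the counters reset
```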
When to Adjust Burst Settings
| Scenario | Recommended Adjustment |
|---|---|
| Form submissions | Lower burst (prevent double-submits) |
| Page navigation | Higher burst (users click around) |
| Search/autocomplete | Higher sustained rate (frequent requests) |
| File uploads | Lower rate, custom per-endpoint |
| Admin operations | Stricter limits (sensitive actions) |
Use Redis for Distributed Rate Limiting
In a multi-server deployment, in-memory rate limiting fails because each server has its own counter:
```
Without Redis:
┌─────────────┐     ┌─────────────┐
│  Server A   │     │  Server B   │
│ Count: 5/10 │     │ Count: 5/10 │
└─────────────┘     └─────────────┘
       ↑                   ↑
       └───── Same user ───┘
```
User makes 10 requests total, but neither server blocks them!
The example codebase uses Redis with an in-memory fallback:
```ts
import {
  BurstyRateLimiter,
  RateLimiterMemory,
  RateLimiterRedis,
} from "rate-limiter-flexible";

const createRateLimiter = (config: RateLimiterConfig): BurstyRateLimiter => {
  // Merge the caller's options with the defaults shown earlier
  const mergedConfig = { ...defaultConfig, ...config };

  // In-memory fallback limiter
  const memoryLimiter = new RateLimiterMemory({
    points: mergedConfig.points,
    duration: mergedConfig.duration,
  });

  // If no Redis, use memory-only limiters
  if (!redis) {
    const memoryBurstLimiter = new RateLimiterMemory({
      points: mergedConfig.burstPoints,
      duration: mergedConfig.burstDuration,
    });
    return new BurstyRateLimiter(memoryLimiter, memoryBurstLimiter);
  }

  // Redis-backed limiter with memory fallback
  return new BurstyRateLimiter(
    new RateLimiterRedis({
      storeClient: redis,
      rejectIfRedisNotReady: true,
      points: mergedConfig.points,
      duration: mergedConfig.duration,
      keyPrefix: `${RATE_LIMIT_NAMESPACE_KEY}${mergedConfig.keyPrefix}:`,
      insuranceLimiter: memoryLimiter, // Fallback if Redis fails
    })
    // ... burst limiter config
  );
};
```
Why Insurance Limiters?
The insuranceLimiter option provides resilience:
- Redis goes down: Memory limiter takes over, limiting still works
- Redis latency spike: Requests aren't blocked waiting for Redis
- Development mode: Can work without Redis infrastructure (though there is a local Docker Redis setup)
Insurance limiters are per-server, so protection is degraded (not coordinated) when Redis is unavailable. Monitor Redis health in production.
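The redis client itself is not shown above. A typical setup uses ioredis with the offline queue disabled, which rate-limiter-flexible's documentation recommends so that commands fail fast and the insurance limiter takes over promptly (a sketch; env.REDIS_URL is an assumption):

```ts
import Redis from "ioredis";

// Fail fast when Redis is unreachable instead of queueing commands,
// so RateLimiterRedis falls back to the insuranceLimiter quickly
const redis = env.REDIS_URL
  ? new Redis(env.REDIS_URL, { enableOfflineQueue: false })
  : undefined;
```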
Use Appropriate Error Responses
When a rate limit is exceeded, provide clear, actionable feedback:
```ts
import { TRPCError } from "@trpc/server";

export class TRPCRateLimitError extends TRPCError {
  retryAfterSeconds: number;

  constructor({
    message,
    retryAfterSeconds,
  }: {
    message?: string;
    retryAfterSeconds: number;
  }) {
    super({
      code: "TOO_MANY_REQUESTS",
      message:
        message ??
        `Rate limit exceeded. Please try again in ${retryAfterSeconds} seconds.`,
      cause: {
        retryAfterSeconds,
      },
    });
    this.retryAfterSeconds = retryAfterSeconds;
  }
}
```
Best Practices for Error Responses
- Use HTTP 429 (Too Many Requests): The standard status code for rate limiting
- Include a Retry-After header: Tell clients when they can retry (see the sketch below)
- Provide a human-readable message: Help users understand what happened
- Log rate limit events: Track abuse patterns for analysis
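When a TRPCRateLimitError bubbles up, its retryAfterSeconds value can be surfaced as a standard Retry-After header. A minimal sketch, assuming tRPC's responseMeta hook on the HTTP adapter (the adapter choice and router name here are illustrative):

```ts
import { createHTTPHandler } from "@trpc/server/adapters/standalone";

const handler = createHTTPHandler({
  router: appRouter,
  responseMeta({ errors }) {
    // Surface rate limit errors as 429 with a Retry-After header
    for (const error of errors) {
      if (error instanceof TRPCRateLimitError) {
        return {
          status: 429, // Too Many Requests
          headers: { "Retry-After": String(error.retryAfterSeconds) },
        };
      }
    }
    return {};
  },
});
```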
Frontend Handling
On the frontend, handle rate limit errors gracefully:
```ts
// Example: React Query error handling
const mutation = useMutation({
  mutationFn: submitForm,
  onError: (error) => {
    if (error.data?.code === "TOO_MANY_REQUESTS") {
      const retryAfter = error.cause?.retryAfterSeconds ?? 60;
      toast.error(`Too many attempts. Please wait ${retryAfter} seconds.`);
      return;
    }
    toast.error("Something went wrong. Please try again.");
  },
});
```
Allow for Per-Endpoint Customization
Different endpoints have different risk profiles. The example allows per-procedure configuration via tRPC meta:
```ts
// High-risk endpoint: stricter limits
login: publicProcedure
  .meta({
    rateLimitOptions: {
      points: 5,
      duration: 60, // 5 attempts per minute
      keyPrefix: "auth:login",
    },
  })
  .mutation(/* ... */);

// Low-risk read endpoint: more lenient
getPublicData: publicProcedure
  .meta({
    rateLimitOptions: {
      points: 100,
      duration: 60, // 100 requests per minute
      keyPrefix: "public:data",
    },
  })
  .query(/* ... */);

// Internal/trusted endpoint: disable rate limiting
healthCheck: publicProcedure.meta({ rateLimitOptions: null }).query(/* ... */);
```
Recommended Limits by Endpoint Type
| Endpoint Type | Suggested Limit | Rationale |
|---|---|---|
| Login/Auth | 5-10 / minute | Prevent brute force |
| OTP Request | 3-5 / 10 minutes | Prevent email spam |
| Form Submission | 10-20 / minute | Allow normal usage |
| API Read | 100-1000 / minute | Higher volume expected |
| File Upload | 5-10 / minute | Resource intensive |
| Admin Actions | 20-50 / minute | Sensitive, but trusted users |
Namespace Rate Limit Keys
Use prefixes to separate different rate limiters and enable easier debugging:
```ts
const defaultConfig: Required<RateLimiterConfig> = {
  // ...
  keyPrefix: "app",
};

// Results in Redis keys like:
//   rate-limit:app:userId:abc123
//   rate-limit:app:ip:192_168_1_1
//   rate-limit-burst:app:userId:abc123
```
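Because the key layout is predictable, you can inspect a live counter with the same Redis client; rate-limiter-flexible stores the consumed points as the key's value (a sketch; the key name is illustrative and the value format is the library's internal detail):

```ts
const key = "rate-limit:app:userId:abc123";
const [points, ttl] = await Promise.all([redis.get(key), redis.ttl(key)]);
console.log(`consumed=${points}, window resets in ${ttl}s`);
```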
Benefits of Namespacing
- Isolation: Different features don't interfere with each other
- Debugging: Easy to find and inspect keys in Redis
- Cleanup: Can delete all keys for a specific feature
- Monitoring: Track usage patterns per feature
Avoid namespacing too aggressively, though: if every endpoint gets its own counter, an attacker can spread requests across many endpoints while staying under each individual limit, adding pressure to your stack. Use logical groupings (e.g., auth, api, upload) to balance clarity and efficiency.
Example Namespace Structure
```
rate-limit:
├── app:
│   ├── userId:abc123
│   ├── ip:192_168_1_1
│   └── ip:192_168_1_2
├── auth:
│   ├── userId:abc123
│   └── userId:abc234
└── upload:
    └── userId:abc123
```
Consider Different Rate Limiting Scopes
Rate limiting can be applied at multiple levels:
1. Application Level (Default)
Rate limit all requests from a user/IP across the entire application.
2. Endpoint Level
Different limits for different endpoints (as shown above).
3. Resource Level
Limit operations on specific resources:
```ts
// Limit comments per thread
const threadRateLimitKey = `thread:${threadId}:comments:${userId}`;
await checkRateLimit({
  key: threadRateLimitKey,
  options: { points: 10, duration: 60 },
});
```
4. Global Level
Protect against distributed attacks:
```ts
// Global limit across all users
await checkRateLimit({
  key: "global:api",
  options: { points: 10000, duration: 60 },
});
```
Combining Scopes
For critical endpoints, combine multiple scopes:
```ts
// Example: OTP request endpoint
const sendOtp = async (email: string, ipAddress: string) => {
  // 1. Global limit (protect email infrastructure)
  await checkRateLimit({
    key: "global:otp",
    options: { points: 1000, duration: 60 },
  });

  // 2. Per-IP limit (prevent single-source abuse)
  await checkRateLimit({
    key: `otp:ip:${ipAddress}`,
    options: { points: 5, duration: 300 },
  });

  // 3. Per-email limit (prevent harassment)
  await checkRateLimit({
    key: `otp:email:${email}`,
    options: { points: 3, duration: 600 },
  });

  // All checks passed, send the OTP
  await sendOtpEmail(email);
};
```
Monitor and Alert on Rate Limiting
Rate limiting isn't just about blocking — it's also a signal:
What to Monitor
| Metric | Why It Matters |
|---|---|
| Rate limit hits | High counts may indicate an attack |
| Unique IPs blocked | Distributed attacks or shared networks |
| User IDs blocked | Compromised accounts or confused users |
| Block rate trends | Increasing blocks may indicate new attack patterns |
Logging Rate Limit Events
```ts
import { RateLimiterRes } from "rate-limiter-flexible";

export const checkRateLimit = async ({
  key,
  options,
}: {
  key: string;
  options: RateLimiterConfig;
}) => {
  // `limiter` is the BurstyRateLimiter built for these options
  // (cached so its counters persist across calls)
  try {
    return await limiter.consume(key);
  } catch (error) {
    if (error instanceof RateLimiterRes) {
      // Log for monitoring
      console.warn(
        `Rate limit exceeded: key=${key}, retryAfter=${error.msBeforeNext}ms`
      );
      // Could also send to a monitoring service:
      // metrics.increment('rate_limit.exceeded', { key_prefix: options.keyPrefix })
      throw new TRPCRateLimitError({
        retryAfterSeconds: Math.ceil(error.msBeforeNext / 1000),
      });
    }
    throw error;
  }
};
```
Alerting Thresholds
Set up alerts for:
- Spike in rate limit hits: Possible attack in progress
- Single IP hitting limits repeatedly: Targeted abuse
- Many users hitting limits: May indicate limits are too strict
- Redis errors: Rate limiting may be degraded
Test Rate Limiting Behavior
The example disables rate limiting in tests:
```ts
if (env.NODE_ENV === "test") {
  return next();
}
```
Testing Strategies
- Unit Tests: Test rate limit logic in isolation

```ts
it("should throw after exceeding limit", async () => {
  for (let i = 0; i < 5; i++) {
    await checkRateLimit({
      key: "test",
      options: { points: 5, duration: 60 },
    });
  }
  await expect(
    checkRateLimit({ key: "test", options: { points: 5, duration: 60 } })
  ).rejects.toThrow(TRPCRateLimitError);
});
```

- Integration Tests: Test with Redis (use a separate test database)

```ts
beforeEach(async () => {
  await redis.flushdb(); // Clear rate limit state
});
```

- Load Tests: Verify limits hold under stress

```sh
# Using k6 or similar
k6 run --vus 100 --duration 60s loadtest.js
```
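The loadtest.js referenced above can be as small as this (a sketch; the endpoint URL is a placeholder, and k6 scripts are plain ES modules):

```ts
import http from "k6/http";
import { check } from "k6";

export default function () {
  const res = http.get("https://app.example.com/api/trpc/getPublicData");
  // Under load, expect a mix of successes and rate-limited responses
  check(res, { "200 or 429": (r) => r.status === 200 || r.status === 429 });
}
```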
Common Pitfalls
1. Rate Limiting After Business Logic
❌ Wrong:

```ts
// BAD: Rate limit checked too late
const createThread = async (input) => {
  const thread = await db.thread.create({ data: input }); // Work already done!
  await checkRateLimit({ key: userId }); // Too late to prevent abuse
  return thread;
};
```

✅ Correct:

```ts
// GOOD: Rate limit checked first
const createThread = async (input) => {
  await checkRateLimit({ key: userId }); // Block abusive requests early
  const thread = await db.thread.create({ data: input });
  return thread;
};
```
2. Trusting Client-Provided Identifiers
❌ Wrong:

```ts
// BAD: Client can spoof their identifier
const key = req.body.userId; // Never trust client input for rate limiting!
```

✅ Correct: Deriving Identifiers Server-Side

```ts
// GOOD: Use server-verified identity
const key = ctx.session.userId ?? extractIpAddress(ctx.headers);
```
3. Single Point of Rate Limiting
❌ Wrong:

```ts
// BAD: Only rate limiting login attempts
// Attacker can still spam OTP requests, password resets, etc.
```

✅ Correct: Defense in Depth

```ts
// GOOD: Rate limit at multiple levels:
// - Global application limit
// - Per-endpoint limits for sensitive operations
// - Per-resource limits where appropriate
```
Summary
Rate limiting is essential for building secure, reliable applications. Key takeaways:
| Best Practice | Why It Matters |
|---|---|
| Apply at middleware layer | Reject abusive requests before business logic |
| Use composite keys | Identify users accurately across auth states |
| Implement bursty limiting | Allow natural usage patterns while blocking abuse |
| Use Redis for distributed systems | Ensure consistent limits across servers |
| Provide clear error responses | Help users understand and recover |
| Customize per endpoint | Match limits to risk profile |
| Namespace keys | Enable debugging and monitoring |
| Apply multiple scopes | Defense in depth |
| Monitor and alert | Detect and respond to attacks |
| Test thoroughly | Verify limits work as intended |
Remember: Rate limiting is just one layer of defense. Combine it with authentication, authorization, input validation, and other security measures for comprehensive protection.