Project Requirements Document

Overview

This document outlines the requirements for implementing rate limiting across the application to protect against abuse, ensure fair usage, and maintain service reliability. Rate limiting controls how many requests a client can make within a given time period.

info

Implementation Status: Rate limiting is not yet implemented. This PRD describes the complete implementation required to protect all API endpoints.

warning

Security Priority: Rate limiting is a critical security feature. Without it, the application is vulnerable to brute force attacks, denial of service, credential stuffing, and email spam abuse.


Product Context

Purpose

Implement rate limiting to:

  • Protect against denial of service (DoS) attacks
  • Prevent brute force attacks on authentication endpoints
  • Stop credential stuffing attempts
  • Limit email spam from OTP/notification endpoints
  • Ensure fair resource usage across all users
  • Maintain service availability during traffic spikes

Target Threats

| Threat | Description | Without Rate Limiting |
| --- | --- | --- |
| DoS Attack | Flooding server with requests | Service becomes unavailable |
| Brute Force | Automated password/OTP guessing | Account compromise |
| Credential Stuffing | Using leaked credentials for mass logins | Unauthorized access |
| Email Spam | Triggering excessive OTP emails | Provider blacklisting, user harassment |
| API Abuse | Scraping or exhausting API resources | Increased costs, degraded performance |
| Resource Exhaustion | Overwhelming database/compute resources | Service degradation for legitimate users |

Success Metrics

  • All API endpoints are protected by rate limiting by default
  • Authentication endpoints have stricter, customized limits
  • Legitimate users never encounter rate limits during normal usage
  • Abusive traffic is blocked before consuming significant resources
  • Rate limit events are logged for security monitoring
    • (You will learn more about structured logging in a later lab. For now, you can log to the console using console.log().)
  • System gracefully degrades if Redis becomes unavailable

Implementation Status

| Component | Status | Priority |
| --- | --- | --- |
| Rate limit service | 🔲 TODO | High |
| Redis integration | 🔲 TODO | High |
| tRPC middleware | 🔲 TODO | High |
| Per-endpoint configuration | 🔲 TODO | High |
| Fingerprinting (user/IP) | 🔲 TODO | High |
| Custom error responses | 🔲 TODO | Medium |
| Memory fallback | 🔲 TODO | Medium |
| Monitoring/logging | 🔲 TODO | Medium |
| Frontend error handling | 🔲 TODO | Low |

Prereads

info

Strongly recommended reading before starting this lab: these cover the theory and rationale behind rate limiting, along with the implementation details and strategies needed for a robust solution.


User Stories

Security & Protection

| ID | As a... | I want... | So that... | Priority |
| --- | --- | --- | --- | --- |
| RL-1 | System operator | All endpoints rate limited by default | New endpoints are automatically protected | High |
| RL-2 | System operator | Authentication endpoints to have stricter limits | Brute force attacks are prevented | High |
| RL-3 | System operator | OTP endpoints to have strict per-email limits | Users aren't harassed with spam emails | High |
| RL-4 | System operator | Rate limiting to work across multiple servers | Attackers can't bypass limits via load balancing | High |
| RL-5 | System operator | Rate limiting to continue if Redis fails | Service remains protected during Redis outages | Medium |

User Experience

| ID | As a... | I want... | So that... | Priority |
| --- | --- | --- | --- | --- |
| RL-6 | Authenticated user | Normal usage to never trigger rate limits | My experience isn't disrupted | High |
| RL-7 | User | Clear error messages when rate limited | I understand why my request failed | Medium |
| RL-8 | User | To know when I can retry | I don't keep trying and getting blocked | Medium |
| RL-9 | User | Quick page navigation without hitting limits | I can browse normally | High |

Developer Experience

| ID | As a... | I want... | So that... | Priority |
| --- | --- | --- | --- | --- |
| RL-10 | Developer | To easily customize limits per endpoint | I can tune limits for specific use cases | Medium |
| RL-11 | Developer | To disable rate limiting for specific endpoints | Health checks and internal endpoints work freely | Medium |
| RL-12 | Developer | Rate limiting disabled in test environment | Tests run quickly without artificial delays | Medium |
| RL-13 | Developer | Clear logging of rate limit events | I can debug and monitor the system | Medium |

API Client (Optional, Future Consideration)

info

These user stories are relevant only for applications with API consumers integrating with our services. You do not have to implement these requirements in this lab; they will be covered in future labs on designing for API consumers.

| ID | As a... | I want... | So that... | Priority |
| --- | --- | --- | --- | --- |
| RL-14 | API client | To receive rate limit headers in responses | I can proactively manage my request rate | - |
| RL-16 | API client | Consistent retry-after values in 429 responses | I can implement reliable exponential backoff | - |
| RL-18 | API client | Different rate limits for different API keys | I can upgrade for higher limits if needed | - |

Functional Requirements

FR-1: Rate Limit Service

  • FR-1.1: System MUST implement a rate limit service that tracks request counts per key
  • FR-1.2: Service MUST support configurable limits (points, duration, burst points, burst duration)
  • FR-1.3: Service MUST implement bursty rate limiting to allow natural usage patterns
  • FR-1.4: Service MUST cache rate limiter instances to avoid recreation overhead
  • FR-1.5: Service MUST use the rate-limiter-flexible library for implementation
info

The rate-limiter-flexible library provides robust rate limiting features, including support for Redis storage and bursty limiting. It is generally not recommended to implement your own rate limiting mechanisms. Refer to its documentation for more details.

FR-2: Storage Backend

  • FR-2.1: System MUST use distributed storage (e.g. Redis, DB) as the primary storage for rate limit counters (Redis is preferred)
  • FR-2.2: System MUST implement in-memory fallback when the distributed storage is unavailable
  • FR-2.3: The rate limiter MUST use rate-limiter-flexible's insurance strategy (an in-memory insurance limiter backing the primary limiter) for resilience
  • FR-2.4: System MUST handle storage connection failures gracefully
  • FR-2.5: Rate limit keys MUST be namespaced to prevent collisions
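
One way to wire FR-2.2 through FR-2.5 together is the library's insuranceLimiter option. The following configuration sketch assumes ioredis and a REDIS_URL environment variable; the client setup and key prefix are illustrative.

```typescript
import Redis from "ioredis";
import { RateLimiterMemory, RateLimiterRedis } from "rate-limiter-flexible";

// Assumed connection setup; REDIS_URL and the fallback URL are illustrative.
const redisClient = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// In-memory limiter used only while Redis is unreachable (FR-2.2).
const insuranceLimiter = new RateLimiterMemory({ points: 2, duration: 1 });

// Primary Redis-backed limiter. If a Redis command fails, the library
// transparently serves the check from insuranceLimiter instead of throwing,
// so the application keeps degraded protection (FR-2.3, FR-2.4).
export const primaryLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: "rl:default", // namespaced keys (FR-2.5)
  points: 2,
  duration: 1,
  insuranceLimiter,
});
```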

FR-3: Request Fingerprinting

  • FR-3.1: System MUST identify requests using a composite key strategy
  • FR-3.2: Authenticated requests MUST be keyed by user ID
    • This is to ensure that logged-in users are rate limited based on their account, regardless of IP address
  • FR-3.3: Unauthenticated requests MUST be keyed by a unique fingerprint (usually the IP address)
  • FR-3.4: IPv6 addresses MUST be sanitized for Redis key compatibility
  • FR-3.5: Unknown identifiers MUST fall back to a safe default key
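
One possible shape for the fingerprint helper follows. The function name matches the API specification later in this document, but the exact key format and sanitization rules here are assumptions.

```typescript
export function createRateLimitFingerprint(params: {
  userId: string | undefined;
  ipAddress: string | null;
}): string {
  const { userId, ipAddress } = params;

  // FR-3.2: authenticated requests are keyed by user ID, regardless of IP.
  if (userId) return `user:${userId}`;

  // FR-3.3/FR-3.4: unauthenticated requests are keyed by IP; IPv6 colons are
  // replaced so the key stays unambiguous alongside `:`-delimited prefixes.
  if (ipAddress) return `ip:${ipAddress.replace(/:/g, "_")}`;

  // FR-3.5: safe default when no identifier is available.
  return "unknown";
}
```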

FR-4: Middleware Integration

  • FR-4.1: Rate limiting MUST be implemented as tRPC middleware
  • FR-4.2: Middleware MUST apply to all procedures by default (opt-out approach)
  • FR-4.3: Middleware MUST check rate limits BEFORE executing procedure logic
  • FR-4.4: Middleware MUST support per-procedure configuration via a metadata configuration
  • FR-4.5: Middleware MUST allow procedures to opt-out of rate limiting
  • FR-4.6: Middleware MUST be disabled in the test environment (i.e., when NODE_ENV === 'test')
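
The tri-state opt-out semantics (FR-4.2, FR-4.4–FR-4.6) can be isolated into a pure helper that the middleware consults before touching the limiter. The helper name is illustrative; the default values are taken from FR-5.

```typescript
// Mirrors the RateLimiterConfig interface from the API specification.
interface RateLimiterConfig {
  points?: number;
  duration?: number;
  burstPoints?: number;
  burstDuration?: number;
  keyPrefix?: string;
}

// Suggested defaults from FR-5.1.
const DEFAULT_CONFIG: RateLimiterConfig = {
  points: 2,
  duration: 1,
  burstPoints: 5,
  burstDuration: 10,
};

// Returns the config the middleware should enforce, or null to skip entirely.
// undefined -> defaults (FR-4.2), object -> custom (FR-4.4), null -> opt-out (FR-4.5).
export function resolveRateLimitConfig(
  rateLimitOptions: RateLimiterConfig | null | undefined,
  env: string | undefined = process.env.NODE_ENV,
): RateLimiterConfig | null {
  if (env === "test") return null;            // FR-4.6: disabled in tests
  if (rateLimitOptions === null) return null; // FR-4.5: explicit opt-out
  if (rateLimitOptions === undefined) return DEFAULT_CONFIG;
  return { ...DEFAULT_CONFIG, ...rateLimitOptions }; // per-procedure override
}
```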

FR-5: Default Configuration

  • FR-5.1: There should be a default rate limit configuration applied to all endpoints:
    • Suggested sustained limit: 2 requests per second
    • Suggested bursty limit: 5 requests per 10 seconds
  • FR-5.2: Defaults MUST be overridable per procedure

FR-6: Error Handling

  • FR-6.1: Rate limit exceeded MUST return TOO_MANY_REQUESTS error code
  • FR-6.2: Error response header MUST include Retry-After indicating when the client can retry
  • FR-6.3: System MUST log rate limit exceeded events for monitoring
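
rate-limiter-flexible reports the remaining wait as msBeforeNext on the rejected result, which maps onto the Retry-After header and error message roughly as follows; the helper names are illustrative.

```typescript
// Round up to whole seconds so clients never retry before the window
// resets (FR-6.2); clamp to at least 1 second.
export function retryAfterSeconds(msBeforeNext: number): number {
  return Math.max(1, Math.ceil(msBeforeNext / 1000));
}

// Message format mirrors the Error Response sample in the API specification.
export function rateLimitMessage(msBeforeNext: number): string {
  return `Rate limit exceeded. Please try again in ${retryAfterSeconds(msBeforeNext)} seconds.`;
}
```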

FR-7: Endpoint-Specific Limits

The following endpoints require custom rate limits:

| Endpoint Category | Limit | Duration | Rationale |
| --- | --- | --- | --- |
| Login/Auth | 5 | 60s | Prevent brute force & email spam |
| Thread Create | 10 | 60s | Prevent spam |
| Comment Create | 20 | 60s | Allow discussion |
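
Expressed as values of the RateLimiterConfig interface from the API specification, the table above might become a map like the following; the object keys and key prefixes are illustrative.

```typescript
// Mirrors the RateLimiterConfig interface from the API specification.
interface RateLimiterConfig {
  points?: number;
  duration?: number;
  burstPoints?: number;
  burstDuration?: number;
  keyPrefix?: string;
}

// Custom limits from the endpoint table; durations are in seconds.
export const endpointLimits: Record<string, RateLimiterConfig> = {
  login:         { points: 5,  duration: 60, keyPrefix: "rl:login" },
  threadCreate:  { points: 10, duration: 60, keyPrefix: "rl:thread-create" },
  commentCreate: { points: 20, duration: 60, keyPrefix: "rl:comment-create" },
};
```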

Non-Functional Requirements

NFR-1: Performance

  • NFR-1.1: Rate limit check MUST complete within 10ms under normal conditions
  • NFR-1.2: Rate limiter instances MUST be cached to avoid recreation
  • NFR-1.3: Rate limiter storage operations SHOULD use connection pooling
  • NFR-1.4: Memory fallback MUST NOT significantly impact response time

NFR-2: Reliability

  • NFR-2.1: System MUST continue functioning if distributed storage becomes unavailable
  • NFR-2.2: Insurance limiter MUST provide degraded protection during distributed storage outage
  • NFR-2.3: Rate limiting MUST NOT cause application crashes
  • NFR-2.4: System MUST handle malformed IP addresses gracefully

NFR-3: Security

  • NFR-3.1: Rate limit keys MUST be derived server-side, never from client input
  • NFR-3.2: Rate limiting MUST occur before business logic execution
  • NFR-3.3: System MUST NOT leak information about other users' rate limit status
  • NFR-3.4: Logs MUST NOT contain sensitive user data

NFR-4: Observability

  • NFR-4.1: All rate limit exceeded events SHOULD be logged
  • NFR-4.2: Logs SHOULD include key prefix, timestamp, and retry-after value
  • NFR-4.3: System SHOULD support metrics export for monitoring dashboards
  • NFR-4.4: Alerts SHOULD be configurable for rate limit spikes

NFR-5: Developer Experience

  • NFR-5.1: Configuration API MUST be type-safe
  • NFR-5.2: Rate limiting MUST be easily toggleable per endpoint
  • NFR-5.3: Test utilities SHOULD be provided for testing rate limit behavior

Technical Architecture

Component Diagram

┌─────────────────────────────────────────────────────────────────┐
│ tRPC Router │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌──────────────────┐ ┌───────────────┐ │
│ │ Request │───▶│ Rate Limit │───▶│ Procedure │ │
│ │ │ │ Middleware │ │ Handler │ │
│ └─────────────┘ └────────┬─────────┘ └───────────────┘ │
│ │ │
└──────────────────────────────┼──────────────────────────────────┘


┌────────────────────┐
│ Rate Limit │
│ Service │
├────────────────────┤
│ • checkRateLimit │
│ • createFingerprint│
│ • createLimiter │
└────────┬───────────┘

┌──────────────┴───────────────┐
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ Redis Limiter │ │ Memory Limiter │
│ (Primary) │ │ (Fallback) │
├──────────────────┤ ├──────────────────┤
│ • Distributed │ │ • Per-server │
│ • Persistent │ │ • Fast │
│ • Coordinated │ │ • No dependencies│
└──────────────────┘ └──────────────────┘

Possible Data Flow

File Structure

apps/web/src/server/
├── api/
│   └── trpc.ts                    # Rate limit middleware integration
└── modules/
    └── rate-limit/
        ├── index.ts               # Public exports
        ├── rate-limit.service.ts  # Core rate limiting logic
        ├── types.ts               # TypeScript interfaces
        └── errors.ts              # Custom error classes

packages/redis/
└── src/
    └── index.ts                   # Redis client singleton

API Specification

info

The files have been scaffolded for you in the lab repository. You need to implement the logic as per the requirements.

Rate Limiter Configuration Interface

interface RateLimiterConfig {
  /** Number of points (requests) allowed in the duration window */
  points?: number;
  /** Duration window in seconds for sustained rate */
  duration?: number;
  /** Number of points allowed for burst traffic */
  burstPoints?: number;
  /** Duration window in seconds for burst rate */
  burstDuration?: number;
  /** Prefix for Redis keys (for namespacing) */
  keyPrefix?: string;
}

tRPC Meta Interface

interface Meta {
  /**
   * Rate limit options for this procedure.
   * - undefined: Apply default rate limiting
   * - RateLimiterConfig: Apply custom rate limiting
   * - null: Disable rate limiting for this procedure
   */
  rateLimitOptions?: RateLimiterConfig | null;
}

Rate Limit Service Functions

// Check rate limit for a key, throws TRPCRateLimitError if exceeded
checkRateLimit(params: {
  key: string;
  options?: RateLimiterConfig;
}): Promise<void>;

// Create a fingerprint key from user/IP information
createRateLimitFingerprint(params: {
  userId: string | undefined;
  ipAddress: string | null;
}): string;

Error Response

// HTTP 429 Too Many Requests
{
error: {
code: "TOO_MANY_REQUESTS",
message: "Rate limit exceeded. Please try again in 30 seconds.",
}
}

Best Practices Requirements

The implementation SHOULD follow these best practices:

| Best Practice | Requirement | Validation |
| --- | --- | --- |
| BP-1: Middleware Layer | Rate limiting MUST be applied at the middleware layer before business logic | Code review |
| BP-2: Composite Keys | MUST use userId for authenticated, IP for unauthenticated requests | Unit tests |
| BP-3: Bursty Limiting | MUST implement dual-bucket (sustained + burst) rate limiting | Unit tests |
| BP-4: Redis Storage | SHOULD use Redis for distributed coordination with memory fallback | Integration tests |
| BP-5: Error Responses | MUST return 429 status with retryAfterSeconds | API tests |
| BP-6: Per-Endpoint Config | MUST support custom limits via tRPC metadata field | Code review |
| BP-7: Namespaced Keys | MUST use keyPrefix to namespace rate limit keys | Unit tests |
| BP-8: Multiple Scopes | SHOULD support per-procedure and global rate limits | Design review |
| BP-9: Monitoring | MUST log rate limit exceeded events | Log review |
| BP-10: Test Support | SHOULD skip rate limiting (or opt into rate limiting) in test environment | Test verification |

Common Pitfalls to Avoid

| Pitfall | Mitigation | Validation |
| --- | --- | --- |
| Rate limiting after business logic | Middleware checks BEFORE handler | Code review |
| Trusting client identifiers | Derive fingerprint from server context only | Security review |
| Single point of rate limiting | Apply default + custom limits | Design review |
| No fallback for Redis failure | Insurance limiter pattern | Chaos testing |
| Blocking legitimate users | Bursty limiting with appropriate defaults | User testing |

Acceptance Criteria

AC-1: Default Rate Limiting

  • All tRPC procedures are rate limited by default
  • Default sustained limit is 2 requests per second (per FR-5.1)
  • Default burst allowance is 5 requests per 10 seconds
  • Exceeding limit returns HTTP 429 with retryAfterSeconds

AC-2: Authentication Endpoints

  • Login endpoint has custom limit of 5 requests per 60 seconds
  • OTP request endpoint has limit of 3 requests per 10 minutes
  • Rate limit applies per-email for OTP requests

AC-3: Fingerprinting

  • Authenticated requests are keyed by userId
  • Unauthenticated requests are keyed by IP address
  • IPv6 addresses are properly sanitized
  • Fallback to "unknown" for missing identifiers

AC-4: Redis Integration

  • Rate limit counters are stored in Redis
  • Keys are properly namespaced with prefix
  • Memory fallback activates when Redis unavailable
  • No application crash on Redis connection failure

AC-5: Developer Experience

  • Procedures can opt-out with rateLimitOptions: null
  • Procedures can customize with rateLimitOptions: { ... }

AC-6: User Experience

  • Normal browsing (5 page loads in 10 seconds) doesn't trigger limits
  • Error message clearly explains rate limit and retry time
  • Frontend handles 429 errors gracefully with toast notification

AC-7: Observability

  • Rate limit exceeded events are logged with key and retry time
  • Logs do not contain sensitive user information
  • Key prefix is included in logs for filtering

Implementation Checklist

Phase 1: Core Infrastructure

  • Create packages/redis with Redis client singleton
  • Create rate-limit module directory structure
  • Implement RateLimiterConfig TypeScript interface
  • Implement TRPCRateLimitError custom error class

Phase 2: Rate Limit Service

  • Implement createRateLimiter with Redis + memory fallback
  • Implement BurstyRateLimiter with dual-bucket approach
  • Implement createRateLimitFingerprint function
  • Implement checkRateLimit function with error handling

Phase 3: Middleware Integration

  • Add Meta interface with rateLimitOptions
  • Implement rateLimitMiddleware in tRPC setup
  • Apply middleware to default procedure
  • Add test environment bypass

Phase 4: Endpoint Configuration

  • Configure authentication endpoints with custom limits
  • Configure OTP endpoints with per-email limits
  • Configure read endpoints with higher limits
  • Document all custom configurations

Phase 5: Frontend & Polish

  • Add frontend error handling for 429 responses
  • Add toast notifications for rate limit errors
  • Verify logging and monitoring
  • Update documentation

References