02 — System Design Interview Guide
Priority: HIGH — With 5+ years of experience, this is your strongest differentiator. You’ve built real systems at scale — now learn to articulate it in interview format.
Table of Contents
- The System Design Interview Framework
- Core Concepts You Must Know
- Back-of-the-Envelope Estimation
- Top 20 System Design Problems
- Deep Dives: 5 Detailed Designs
- Your Experience as System Design Ammo
- Common Trade-offs to Discuss
- Resources
The System Design Interview Framework
Use this RESHADED framework for every problem (45-60 min):
R — Requirements (5 min)
Functional: What should the system do?
Non-functional: Scale, latency, availability, consistency
E — Estimation (5 min)
Users, QPS, storage, bandwidth
Back-of-envelope math to size the system
S — Storage Schema (5 min)
Data model: tables, relationships, indexes
SQL vs NoSQL decision with justification
H — High-Level Design (10 min)
Draw the architecture: clients → LB → services → DB/cache
Identify major components and data flow
A — API Design (5 min)
Key endpoints: method, path, params, response
Authentication, rate limiting, pagination
D — Detailed Design (10 min)
Dive deep into 1-2 critical components
This is where you show depth and trade-off reasoning
E — Evaluate (5 min)
How does the design handle failures?
What are the bottlenecks? How would you scale further?
What would you change for 10x more traffic?
D — Distinctive Features (optional, 5 min)
Monitoring, alerting, deployment strategy
Nice-to-have features that show you think beyond the basics
Pro Tips for the Interview
- Always start by asking questions. Never jump into design.
- Think out loud. The interviewer wants to see your thought process.
- Draw diagrams. Use boxes, arrows, labels. Keep it clean.
- Mention trade-offs. “I chose X over Y because…”
- Use numbers. “With 10M DAU making 10 requests each per day, that’s 100M requests/day, roughly 1,200 QPS…”
- Don’t go too deep too early. Get the high-level right first.
Core Concepts You Must Know
1. Scalability
Vertical Scaling (Scale Up):
- Bigger machine: more CPU, RAM, storage
- Simpler but has hard limits
- Good for databases (initially)
Horizontal Scaling (Scale Out):
- More machines
- Need: load balancing, data partitioning, stateless services
- Better for web/app tier
2. Load Balancing
What: Distributes incoming traffic across multiple servers.
Algorithms:
- Round Robin: simple, equal distribution
- Weighted Round Robin: heavier instances get more traffic
- Least Connections: sends to least busy server
- IP Hash: consistent routing for same client
- Consistent Hashing: minimizes data movement on scale events
Layers:
- L4 (Transport): TCP/UDP level, faster, no content inspection
- L7 (Application): HTTP level, content-based routing, path/header routing
Tools: AWS ALB/NLB, Nginx, HAProxy, Envoy
3. Caching
Why: Reduce latency, reduce DB load, improve throughput.
Cache Strategies:
- Cache-aside (Lazy Loading):
App checks cache → miss → read from DB → write to cache
Pros: Only caches what's needed
Cons: Cache miss penalty, potential stale data
- Write-through:
App writes to cache and DB simultaneously
Pros: Cache always consistent
Cons: Write latency, caches unused data
- Write-behind (Write-back):
App writes to cache → cache asynchronously writes to DB
Pros: Fast writes
Cons: Data loss risk if cache crashes
- Read-through:
Cache sits between app and DB, loads on miss automatically
Cache Eviction Policies:
- LRU (Least Recently Used) — most common
- LFU (Least Frequently Used)
- TTL (Time To Live) — expiry-based
Where to cache:
- Client-side (browser cache)
- CDN (static assets, edge caching)
- Application cache (Redis, Memcached)
- Database cache (query cache, buffer pool)
Tools: Redis, Memcached, Varnish, CDN (CloudFront, Cloudflare)
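The cache-aside flow above is small enough to sketch from memory. A minimal illustration, with plain dicts standing in for Redis and the database (both hypothetical stand-ins); a real version would also set a TTL and handle serialization:

```python
class CacheAside:
    """Cache-aside sketch: check cache first, fall back to the DB on a miss,
    then populate the cache so the next read is fast."""

    def __init__(self, cache, db):
        self.cache = cache   # dict standing in for Redis
        self.db = db         # dict standing in for the database
        self.hits = 0
        self.misses = 0

    def get_user(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:            # cache hit
            self.hits += 1
            return self.cache[key]
        self.misses += 1                 # cache miss: read DB, then populate
        value = self.db.get(user_id)
        if value is not None:
            self.cache[key] = value      # real code would set a TTL here
        return value
```

Worth saying out loud in an interview: this is where the stale-data trade-off lives, since the DB can change after the key is cached.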
4. Database Design
SQL (Relational):
Strengths: ACID, joins, structured data, mature tooling
Use when: Strong consistency needed, complex queries, relationships matter
Examples: PostgreSQL, MySQL
NoSQL Types:
- Key-Value: Redis, DynamoDB (sessions, caching)
- Document: MongoDB (flexible schema, nested data)
- Column-family: Cassandra (wide columns, time-series)
- Graph: Neo4j (relationships, social networks)
Decision Framework:
Need ACID + complex queries? → SQL
Need horizontal scale + simple lookups? → Key-Value / Document
Need time-series at massive scale? → Column-family
Need relationship traversal? → Graph
5. Database Scaling
Read Replicas:
- Write to primary, read from replicas
- Replication lag = eventual consistency for reads
- Good for read-heavy workloads (80%+ reads)
Sharding (Horizontal Partitioning):
- Split data across multiple databases
- Shard key: determines which shard holds data
- Range-based: user_id 0-1M → shard 1 (risk: hot spots)
- Hash-based: hash(user_id) % N → shard index (better distribution)
- Challenges: cross-shard queries, rebalancing, shard key choice
Vertical Partitioning:
- Split tables by columns (e.g., user profile vs user activity)
- Different tables on different machines
Denormalization:
- Duplicate data to avoid joins
- Trade storage for read performance
- Keep in sync with triggers or application logic
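The hash-based routing rule above fits in a few lines. One caveat worth mentioning live: Python's built-in hash() is salted per process, so a stable hash is needed for routing to agree across app servers. Also note that plain % N is exactly what breaks when N changes, which is what consistent hashing (below) fixes:

```python
import hashlib

def shard_for(user_id: int, num_shards: int) -> int:
    """Hash-based shard routing: stable hash of the key, mod shard count.
    MD5 is used here only for its stable, uniform output, not for security."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % num_shards
```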
6. Message Queues & Async Processing
Why: Decouple services, handle spikes, enable async processing.
Components:
Producer → Queue → Consumer
Patterns:
- Point-to-point: one consumer per message
- Pub/Sub: multiple consumers (topics/subscriptions)
- Work queue: multiple workers process from same queue
Guarantees:
- At-most-once: no retry (fast, may lose messages)
- At-least-once: retry on failure (duplicates possible)
- Exactly-once: hardest (deduplication + transactions)
Tools: RabbitMQ, Kafka, AWS SQS/SNS, Redis Streams
Your experience: "At Intensel, I designed async workflows with queues,
workers, retries, and scheduling for high-volume background jobs."
→ You literally built this. Talk about RabbitMQ + Dask setup.
7. CAP Theorem
In a distributed system, you can only guarantee 2 of 3:
C — Consistency: every read gets the latest write
A — Availability: every request gets a response
P — Partition tolerance: system works despite network failures
Reality: P is non-negotiable (networks fail), so you choose between:
CP: Consistent + Partition-tolerant (sacrifice availability)
→ HBase, MongoDB (strong consistency mode), ZooKeeper
AP: Available + Partition-tolerant (sacrifice consistency)
→ Cassandra, DynamoDB, DNS
In practice:
- Most systems are "mostly consistent, mostly available"
- Different data can have different guarantees:
Financial transactions → strong consistency
Social media likes → eventual consistency
8. Consistent Hashing
Problem: Simple hash(key) % N breaks when N changes (add/remove server).
Solution: Consistent hashing maps both keys and servers to a ring.
- Only K/N keys need to move when a server is added/removed
- Virtual nodes improve distribution (each server → multiple points on ring)
Use cases:
- Distributed caches (Redis cluster)
- Load balancing (sticky sessions)
- Database sharding
- CDN routing
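A consistent-hash ring with virtual nodes fits in about 30 lines, and walking through one is a strong interview move. A sketch (MD5 for ring positions and bisect for lookup are illustrative choices, not the only ones):

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring sketch with virtual nodes. Servers and keys are
    hashed onto the same ring; a key belongs to the first server clockwise.
    Each server gets `vnodes` points on the ring for smoother distribution."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self.ring = []            # sorted list of (position, server)
        for s in servers:
            self.add(s)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server: str):
        for i in range(self.vnodes):
            self.ring.append((self._hash(f"{server}#{i}"), server))
        self.ring.sort()

    def remove(self, server: str):
        self.ring = [(pos, s) for pos, s in self.ring if s != server]

    def get(self, key: str) -> str:
        pos = self._hash(key)
        idx = bisect.bisect(self.ring, (pos,)) % len(self.ring)  # wrap around
        return self.ring[idx][1]
```

The property to point out: when a server is added, the only keys that move are the ones that now map to the new server; everything else stays put.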
9. API Gateway
What: Single entry point for all client requests.
Responsibilities:
- Routing: forward to correct service
- Authentication/Authorization
- Rate limiting
- Request/response transformation
- SSL termination
- Logging and monitoring
Tools: Kong, AWS API Gateway, Nginx, Envoy
10. CDN (Content Delivery Network)
What: Distributed network of edge servers that cache content close to users.
Types:
- Pull: CDN fetches from origin on first request, caches it
- Push: You upload content to CDN proactively
Use for: Static assets (images, CSS, JS), video streaming, API responses
Tools: CloudFront, Cloudflare, Akamai, Fastly
11. Rate Limiting
Why: Protect against abuse, ensure fair usage, prevent overload.
Algorithms:
- Token Bucket: tokens added at fixed rate, consumed per request
- Leaky Bucket: requests processed at fixed rate
- Fixed Window: count requests in fixed time windows
- Sliding Window Log: timestamps of recent requests
- Sliding Window Counter: hybrid of fixed window + sliding
Implementation:
- At API gateway level (global)
- Per-service (service-level)
- Using Redis (distributed rate limiting)
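Of the algorithms above, the token bucket is the one most worth being able to write on demand. A single-process sketch; a distributed version would keep the token state in Redis, typically updated atomically via a Lua script:

```python
import time

class TokenBucket:
    """Token-bucket sketch: `rate` tokens refill per second, bursting up to
    `capacity`. A request is allowed if at least one token is available."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The capacity parameter is what distinguishes it from a leaky bucket: it permits short bursts while still enforcing the long-run rate.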
12. Replication & Consensus
Replication:
- Single-leader: one primary, multiple replicas (PostgreSQL, MySQL)
- Multi-leader: multiple primaries (conflict resolution needed)
- Leaderless: any node accepts writes (Cassandra, DynamoDB)
Consensus Protocols:
- Raft: leader election, log replication (easier to understand)
- Paxos: classic but complex
- ZAB: ZooKeeper's protocol
When asked about this:
- Explain in terms of your PostgreSQL replication experience
- Relate to your work with distributed Dask workers
Back-of-the-Envelope Estimation
Key Numbers to Memorize
Storage:
1 KB = 1,000 bytes (a short text record)
1 MB = 1,000 KB (a photo)
1 GB = 1,000 MB (a movie)
1 TB = 1,000 GB (large database)
1 PB = 1,000 TB (enterprise data warehouse)
Time:
1 ns = L1 cache reference
10 ns = L2 cache reference
100 ns = RAM access
0.1 ms = SSD random read
0.5 ms = Round trip within same datacenter
10 ms = HDD seek
150 ms = Round trip across the world (e.g., CA → Europe → CA)
Traffic:
1 day = 86,400 seconds ≈ 100,000 seconds (for estimation)
1 million requests/day ≈ 12 requests/second
1 billion requests/day ≈ 12,000 requests/second
Quick math:
100M users × 10% daily active = 10M DAU
10M DAU × 5 requests/day = 50M requests/day
50M / 100K seconds ≈ 500 QPS
Peak = 2-3x average = 1000-1500 QPS
Estimation Template
1. Users: How many total? Daily active?
2. Read/Write ratio: Read-heavy (100:1)? Write-heavy?
3. QPS: DAU × actions per day / 86400
4. Peak QPS: 2-3x average
5. Storage: Records × size per record × retention period
6. Bandwidth: QPS × average response size
7. Memory (cache): If we cache 20% of hot data...
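The template above mechanizes nicely into a throwaway helper for sanity-checking mental math (the function and its parameters are my own convenience names, not a standard tool):

```python
def estimate(dau, actions_per_day, bytes_per_record, retention_days, peak_factor=3):
    """Back-of-envelope sketch following the template above: QPS from DAU and
    actions/day, peak as a multiple of average, storage from record volume."""
    qps = dau * actions_per_day / 86_400
    return {
        "avg_qps": round(qps),
        "peak_qps": round(qps * peak_factor),
        "storage_bytes": dau * actions_per_day * bytes_per_record * retention_days,
    }

# 10M DAU at 5 requests/day → ~579 avg QPS (the "≈500 QPS" above), ~1,736 at 3x peak
```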
Top 20 System Design Problems
Tier 1: Must Know (very commonly asked)
| # | Problem | Key Concepts |
|---|---|---|
| 1 | Design a URL Shortener | Hashing, base62 encoding, redirection, analytics |
| 2 | Design Twitter/News Feed | Fan-out, timeline generation, caching, pub/sub |
| 3 | Design a Chat System (WhatsApp/Slack) | WebSockets, message queues, presence, delivery guarantees |
| 4 | Design a Rate Limiter | Token bucket, sliding window, distributed rate limiting |
| 5 | Design a Notification System | Push/pull, prioritization, deduplication, multiple channels |
Tier 2: Frequently Asked
| # | Problem | Key Concepts |
|---|---|---|
| 6 | Design an API Rate Limiter | Token bucket, Redis, distributed counting |
| 7 | Design a Key-Value Store | Consistent hashing, replication, conflict resolution |
| 8 | Design a Web Crawler | BFS, politeness, deduplication, distributed crawling |
| 9 | Design YouTube/Netflix | Video encoding, CDN, adaptive bitrate, recommendations |
| 10 | Design Uber/Lyft | Geospatial indexing, matching, real-time tracking |
Tier 3: Good to Know
| # | Problem | Key Concepts |
|---|---|---|
| 11 | Design Google Search | Inverted index, ranking, crawling, caching |
| 12 | Design Dropbox/Google Drive | File chunking, sync, conflict resolution, dedup |
| 13 | Design Instagram | Image storage, news feed, CDN, caching |
| 14 | Design a Task Queue | Worker pools, retries, dead letter queue, priorities |
| 15 | Design an E-commerce System | Inventory management, cart, checkout, payments |
Tier 4: Specialized (great for your profile)
| # | Problem | Key Concepts |
|---|---|---|
| 16 | Design a Geospatial Service | PostGIS, spatial indexing, tile serving, caching |
| 17 | Design a Data Pipeline | ETL, batch vs stream, exactly-once, schema evolution |
| 18 | Design a Metrics/Monitoring System | Time-series DB, aggregation, alerting, dashboards |
| 19 | Design a Map Tile Server | Tile pyramids, caching layers, CDN, auth |
| 20 | Design a Distributed Task Scheduler | Queues, workers, retries, scheduling, backpressure |
Deep Dives: 5 Detailed Designs
Design 1: URL Shortener (TinyURL)
Requirements:
- Shorten long URLs → short code (e.g., tiny.url/abc123)
- Redirect short → long URL
- Analytics (click count, geo, timestamp)
- Custom short URLs (optional)
- Expiry (optional)
Estimation:
- 100M new URLs/month, 10:1 read/write
- Write: 100M / (30 * 86400) ≈ 40 URLs/sec
- Read: 400 redirects/sec, peak: 1200/sec
- Storage: 100M * 500 bytes * 12 months ≈ 600 GB/year
High-Level Design:
Client → API Gateway → URL Service → Database
→ Cache (Redis)
API Design:
POST /api/urls {long_url, custom_alias?, expiry?} → {short_url}
GET /{short_code} → 301/302 Redirect to long_url
Database Schema:
urls:
id (PK, bigint)
short_code (unique index, varchar(7))
long_url (text)
created_at (timestamp)
expires_at (timestamp, nullable)
user_id (FK, nullable)
Key Decisions:
1. Short code generation:
- Base62 encoding of auto-increment ID (predictable but simple)
- MD5/SHA256 hash → take first 7 chars (collision possible)
- Pre-generated IDs with a counter service (distributed-safe)
2. 301 vs 302 redirect:
- 301 (permanent): browser caches the redirect, less server load, but cached clicks never reach your analytics
- 302 (temporary): every request hits server, better for analytics
3. Read optimization:
- Cache hot URLs in Redis (LRU eviction)
- Cache hit ratio likely 80%+ (Zipf distribution)
4. Scaling:
- Read replicas for DB
- Shard by short_code hash
- Redis cluster for cache
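Key decision 1 (base62-encoding an auto-increment ID) is small enough to code live, and interviewers often ask for it:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Base62-encode an auto-increment ID into a short code.
    7 characters of base62 cover 62**7 ≈ 3.5 trillion URLs."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

def decode_base62(code: str) -> int:
    """Inverse mapping, useful for looking the row back up by primary key."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

The predictability drawback from the notes above is visible here: sequential IDs produce sequential codes, which is why production systems often add a counter offset or a pre-generated ID pool.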
Design 2: Chat System (like Slack/WhatsApp)
Requirements:
- 1:1 and group messaging
- Online/offline status
- Message history
- Read receipts
- Push notifications for offline users
Estimation:
- 50M DAU, 40 messages/day per user
- Messages: 2 billion/day ≈ 23K/sec, peak 70K/sec
- Storage: 2B * 200 bytes = 400 GB/day
High-Level Design:
Client ↔ WebSocket Gateway ↔ Chat Service → Message Queue
→ Message DB
→ Presence Service
→ Notification Service
Key Components:
1. WebSocket Gateway:
- Maintains persistent connections
- Routes messages to correct recipient connection
- Handles reconnection, heartbeats
2. Chat Service:
- Message validation, storage, fan-out
- Group message distribution
3. Presence Service:
- Track online/offline status
- Heartbeat-based detection
- Pub/sub for status updates
4. Message Storage:
- Recent messages: Redis (fast access, bounded size)
- Historical: Cassandra or PostgreSQL partitioned by conversation + time
- Append-only, sequential writes
5. Delivery:
- Online → push via WebSocket
- Offline → store + push notification (APNs/FCM)
- Retry with exponential backoff
Design 3: Task Queue / Background Job System
Directly relevant to your Intensel experience!
Requirements:
- Submit async jobs with parameters
- Priority-based execution
- Retries with backoff
- Job status tracking
- Scheduling (cron-like)
- Scalable workers
High-Level Design:
API → Job Service → Message Queue (RabbitMQ/SQS)
→ Job DB (status tracking)
→ Worker Pool → Result Store
Scheduler (cron) → Job Service
Key Decisions:
1. Queue choice:
- RabbitMQ: mature, routing, priority queues, ACK-based
- SQS: managed, at-least-once, no priority (use multiple queues)
- Kafka: if you need replay-ability and high throughput
- Redis: simple, fast, but less durable
2. Retry strategy:
- Exponential backoff: 1s, 2s, 4s, 8s, ...
- Max retries: 3-5 depending on job type
- Dead Letter Queue for permanently failed jobs
- Idempotent job handlers (retries should be safe)
3. Worker scaling:
- Horizontal: add more worker instances
- Auto-scale based on queue depth
- Heartbeat/health checks for stuck workers
- Graceful shutdown (finish current job)
4. Job status lifecycle:
PENDING → QUEUED → RUNNING → SUCCESS/FAILED/RETRY
5. Scheduling:
- Cron-like scheduler checks periodically
- Inserts due jobs into the queue
- Use DB for schedule persistence, not in-memory
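The retry strategy in decision 2 reduces to a small wrapper. A sketch: the sleep function is injectable purely so tests don't actually wait, and jitter is added because synchronized retries from many workers cause thundering-herd spikes:

```python
import random
import time

def retry_with_backoff(fn, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Exponential backoff sketch: delays of 1s, 2s, 4s, ... plus jitter.
    Assumes `fn` is idempotent, since a failed attempt may have partially run.
    In a real job system, the final raise would route the job to a DLQ."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise                                  # exhausted → dead-letter
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))  # jitter
```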
Design 4: Geospatial Service (Map Tile Server)
This is YOUR project — you’ve built this!
Requirements:
- Serve raster map tiles at various zoom levels
- Authenticated access per customer
- Handle 5.3 TB of geospatial data
- Low latency, high throughput
- Caching for repeated tile requests
High-Level Design:
Client (Mapbox GL) → CDN → API Gateway (auth) → Tile Service (FastAPI)
→ Tile Cache (Redis/disk)
→ MapServer → Data Store (SQLite/PostGIS)
Key Decisions:
1. Tile pyramid structure:
- Pre-render common zoom levels (0-14)
- On-demand render for deep zooms
- Z/X/Y addressing standard
2. Caching strategy:
- Multi-layer: CDN → Redis → Disk → Re-render
- Cache key: {layer}/{z}/{x}/{y}/{style_hash}
- TTL based on data update frequency
- Cache invalidation on data updates
3. Authentication:
- JWT tokens verified at API gateway
- Per-customer access control to specific layers
- Token refresh for long map sessions
4. Scaling:
- Stateless tile service → horizontal scaling
- CDN handles 90%+ of repeated tile requests
- Background pre-rendering for hot areas
- Auto-scale workers during peak hours
How to talk about this:
"I designed and implemented a tile delivery service using FastAPI and
MapServer, serving 5.3 TB of authenticated geospatial data. I added
multi-layer caching that reduced origin hits by 90%, bringing p99
latency under 100ms for cached tiles."
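The multi-layer cache in decision 2 boils down to a lookup chain with back-fill. A sketch with dict-like layers standing in for Redis and disk, where `render` is the expensive MapServer call (all names here are hypothetical stand-ins):

```python
def get_tile(z, x, y, layers, render):
    """Multi-layer cache lookup sketch: try each cache layer in order
    (fastest first, e.g., in-memory → Redis → disk). On a hit in a slower
    layer, promote the tile into the faster layers. On a total miss,
    render at the origin and back-fill every layer."""
    key = f"{z}/{x}/{y}"
    for i, layer in enumerate(layers):
        if key in layer:
            tile = layer[key]
            for faster in layers[:i]:   # promote into faster layers
                faster[key] = tile
            return tile
    tile = render(z, x, y)              # total miss: hit the origin
    for layer in layers:
        layer[key] = tile
    return tile
```

The interview point: each layer absorbs a fraction of traffic, so the origin only sees the product of the miss rates, which is how a 90%+ CDN hit rate keeps the renderer cheap.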
Design 5: Data Pipeline (Climate Risk Platform)
Another one of YOUR systems!
Requirements:
- Ingest data from multiple external sources
- Process multi-terabyte datasets
- Near real-time insights for customers
- Handle failures gracefully
High-Level Design:
External Sources → Ingestion Service → Queue → Processing Workers (Dask)
→ PostgreSQL/PostGIS
→ API → Customer Dashboard
Key Decisions:
1. Ingestion:
- API scrapers, file downloaders, webhook receivers
- Idempotent ingestion (same data processed twice = same result)
- Raw data stored in S3 (archival + reprocessing)
2. Processing:
- Dask for distributed computation
- Chunk large datasets into manageable pieces
- Retry failed chunks independently
3. Data quality:
- Validation at ingestion
- Schema checks before processing
- Monitoring for data freshness
4. Query optimization:
- Spatial indexing (GiST indexes in PostGIS)
- Partitioning by geography or time
- Materialized views for frequent queries
- Connection pooling (PgBouncer)
Your Experience as System Design Ammo
Map your resume to system design vocabulary:
| Your Experience | System Design Concept |
|---|---|
| “Reduced latency from minutes to milliseconds” | Query optimization, indexing, caching |
| “1.8 TB, 2.3 trillion records in PostGIS” | Database scaling, spatial indexing, sharding |
| “Async workflows with queues, workers, retries” | Message queues, distributed task processing |
| “Multi-terabyte datasets, hundreds of users” | Horizontal scaling, caching, CDN |
| “5.3 TB tile service with caching” | CDN, multi-layer caching, read optimization |
| “Caching, monitoring, alerting” | Observability, reliability engineering |
| “AWS infrastructure” | Cloud architecture, auto-scaling, managed services |
| “Mentored junior engineers” | Leadership, technical communication |
Common Trade-offs to Discuss
Always frame decisions as trade-offs in interviews:
| Decision | Trade-off |
|---|---|
| SQL vs NoSQL | Consistency + queries vs Scale + flexibility |
| Cache vs No cache | Latency vs Complexity + stale data |
| Sync vs Async | Simplicity vs Throughput + resilience |
| Monolith vs Microservices | Simplicity vs Independent scaling + deployment |
| Strong vs Eventual consistency | Correctness vs Availability + latency |
| Push vs Pull | Real-time vs Resource efficiency |
| Pre-compute vs On-demand | Latency vs Storage + freshness |
| Normalize vs Denormalize | Write efficiency vs Read efficiency |
| Vertical vs Horizontal scaling | Simplicity vs Unlimited scale |
| REST vs gRPC | Simplicity + tooling vs Performance + typing |
| Batch vs Stream processing | Throughput vs Latency |
Resources
Free
- System Design Primer: https://github.com/donnemartin/system-design-primer
- System Design Handbook: https://www.systemdesignhandbook.com/guides/system-design/
- ByteByteGo Newsletter: https://blog.bytebytego.com/ (free articles)
- High Scalability Blog: http://highscalability.com/
- Martin Kleppmann’s talks: YouTube (DDIA author)
Books
- Designing Data-Intensive Applications (DDIA) — you’re reading this!
- System Design Interview Vol 1 & 2 by Alex Xu
- Understanding Distributed Systems by Roberto Vitillo
Courses
- Grokking System Design (Educative): https://www.educative.io/courses/grokking-modern-system-design-interview-for-engineers-managers
- DesignGurus.io: https://www.designgurus.io/
Practice
- Practice with a friend — take turns being interviewer
- Record yourself explaining a design (watch for clarity)
- Time yourself — aim for about 45 minutes for the complete design, matching the framework above
My Notes
Systems I can explain deeply:
- Climate risk platform (data pipeline, processing, API)
- Global building footprints (1.8TB PostGIS, spatial indexing)
- Map tile service (5.3TB, caching, MapServer)
- Async job processing (RabbitMQ, Dask, retries)
Concepts I need to review:
-
Trade-offs I always forget:
-
Next: 03-python-deep-dive.md