05 — Backend & API Design Interview Guide

Priority: HIGH — This is your daily work. You need to articulate best practices fluently.


Table of Contents

  1. REST API Design
  2. API Authentication & Security
  3. API Performance & Reliability
  4. Backend Architecture Patterns
  5. Microservices vs Monolith
  6. Background Jobs & Queues
  7. Observability
  8. Common Interview Questions
  9. Resources

REST API Design

HTTP Methods & Semantics

GET    /users          → List users       (safe, idempotent, cacheable)
GET    /users/123      → Get user 123     (safe, idempotent, cacheable)
POST   /users          → Create user      (not idempotent)
PUT    /users/123      → Replace user 123 (idempotent — full replacement)
PATCH  /users/123      → Update user 123  (partial update)
DELETE /users/123      → Delete user 123  (idempotent)

Safe: doesn't modify state (GET, HEAD, OPTIONS)
Idempotent: same request N times = same result (GET, PUT, DELETE)
Non-idempotent: POST (each call may create a new resource)

URL Design Best Practices

✓ Use nouns, not verbs: /users (not /getUsers)
✓ Use plural: /users (not /user)
✓ Use kebab-case: /user-profiles (not /userProfiles)
✓ Nest for relationships: /users/123/orders
✓ Use query params for filtering: /users?role=admin&active=true
✓ Consistent naming: don't mix /get-users and /users/list

✗ Don't expose internal IDs if possible (use UUIDs or slugs)
✗ Don't nest too deep: /a/1/b/2/c/3/d/4 is too much (max 2-3 levels)
✗ Don't use query params for resource identification: /users?id=123

Status Codes

2xx Success:
  200 OK                — successful GET, PUT, PATCH, DELETE
  201 Created           — successful POST (return Location header)
  202 Accepted          — request accepted, processed async
  204 No Content        — successful DELETE (no response body)

3xx Redirection:
  301 Moved Permanently — SEO-friendly redirect (cached by browser)
  302 Found             — temporary redirect
  304 Not Modified      — conditional GET, use cached version

4xx Client Errors:
  400 Bad Request       — malformed request, validation failure
  401 Unauthorized      — not authenticated (missing/invalid token)
  403 Forbidden         — authenticated but not authorized
  404 Not Found         — resource doesn't exist
  405 Method Not Allowed — wrong HTTP method
  409 Conflict          — duplicate resource, version conflict
  422 Unprocessable Entity — valid syntax but semantically invalid (failed validation rules)
  429 Too Many Requests — rate limited

5xx Server Errors:
  500 Internal Server Error — unexpected server error
  502 Bad Gateway          — upstream service failed
  503 Service Unavailable  — temporarily overloaded
  504 Gateway Timeout      — upstream service timeout

Pagination

Option 1: Offset-based (simple, most common)
  GET /users?page=2&per_page=20
  Response: { data: [...], total: 500, page: 2, per_page: 20, total_pages: 25 }
  
  Pros: Simple, supports jumping to any page
  Cons: Slow for large offsets (OFFSET 10000 still scans), inconsistent if data changes

Option 2: Cursor-based (better for large datasets)
  GET /users?cursor=eyJpZCI6MTIzfQ&limit=20
  Response: { data: [...], next_cursor: "eyJpZCI6MTQzfQ", has_more: true }
  
  Pros: Consistent, fast (uses WHERE id > cursor)
  Cons: Can't jump to arbitrary page, harder to implement

Option 3: Keyset pagination (similar to cursor, explicit)
  GET /users?after_id=123&limit=20
  → SELECT * FROM users WHERE id > 123 ORDER BY id LIMIT 20;
  
  Pros: Simple, efficient, uses index
  Cons: Only works with sortable, unique keys

Your experience: "I implemented cursor-based pagination for our APIs
serving hundreds of concurrent users, because offset-based pagination
became slow with large datasets. The cursor encodes the last-seen
primary key, enabling efficient indexed lookups."
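
A minimal stdlib sketch of that cursor scheme (names are illustrative; the in-memory list stands in for an indexed, id-ordered table):

```python
import base64
import json

def encode_cursor(last_id):
    """Encode the last-seen primary key as an opaque, URL-safe cursor."""
    return base64.urlsafe_b64encode(json.dumps({"id": last_id}).encode()).decode()

def decode_cursor(cursor):
    """Decode the cursor back into the last-seen primary key."""
    return json.loads(base64.urlsafe_b64decode(cursor))["id"]

def paginate(rows, cursor=None, limit=20):
    """One page of results. In SQL the filter would be:
    WHERE id > :after_id ORDER BY id LIMIT :limit + 1 (fetch one extra to set has_more)."""
    after_id = decode_cursor(cursor) if cursor else 0
    matching = [r for r in rows if r["id"] > after_id]
    page, has_more = matching[:limit], len(matching) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor, "has_more": has_more}
```

Encoding the cursor keeps it opaque to clients, so the pagination key can change later without breaking the API contract.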

Versioning

Option 1: URL path versioning (most common)
  GET /api/v1/users
  GET /api/v2/users
  Pros: Clear, easy to understand, easy to route
  Cons: URLs change between versions; clients must migrate, and the version applies API-wide

Option 2: Header versioning
  GET /api/users  Header: API-Version: 2
  Pros: Clean URLs
  Cons: Harder to test (can't bookmark/share), hidden

Option 3: Query parameter
  GET /api/users?version=2
  Pros: Simple to add
  Cons: easy to overlook in caches and logs; mixes versioning with filter params

Recommendation: URL path versioning for most APIs (v1, v2).
Keep v1 working while building v2. Set deprecation timeline.

Error Response Format

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid input",
    "details": [
      {
        "field": "email",
        "message": "Must be a valid email address"
      },
      {
        "field": "age",
        "message": "Must be between 0 and 150"
      }
    ]
  }
}
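
A small helper (hypothetical name) that builds this envelope consistently, so every endpoint returns the same error shape:

```python
def error_response(code, message, details=None):
    """Build the consistent error envelope: machine-readable code,
    human-readable message, and optional per-field details."""
    error = {"code": code, "message": message}
    if details:
        error["details"] = [
            {"field": field, "message": msg} for field, msg in details
        ]
    return {"error": error}
```

In FastAPI this would typically live in a shared exception handler so handlers never build error bodies by hand.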

API Authentication & Security

Authentication Methods

1. API Keys:
   - Simple, good for server-to-server
   - Sent in header: X-API-Key: abc123
   - Not suitable for user auth (no expiry, no scope)

2. JWT (JSON Web Tokens):
   - Stateless: token contains claims, signed by server
   - Structure: header.payload.signature (each part base64url-encoded)
   - Access token: short-lived (15 min), carried in Authorization header
   - Refresh token: longer-lived (days), stored securely, used to get new access token
   - Verify: check signature + expiry + issuer
   - Revocation: tricky (token is stateless) — use short TTL + blacklist for critical cases

3. OAuth 2.0:
   - Delegated authorization framework
   - Flows: Authorization Code (web), Client Credentials (server-to-server),
     Authorization Code + PKCE (SPAs/mobile)
   - Used for: "Login with Google", third-party API access

4. Session-based:
   - Server stores session in DB/Redis
   - Client sends session cookie
   - Stateful: server must look up session on every request
   - Simple and easy to revoke, but needs a shared session store (e.g. Redis) to scale horizontally
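
To show the JWT mechanics (signing input, signature check, expiry check), here is a stdlib-only HS256-style sketch — in production use a vetted library such as PyJWT rather than hand-rolling this:

```python
import base64
import hashlib
import hmac
import json
import time

def _b64(data):
    """base64url without padding, as JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(payload, secret):
    """Produce header.payload.signature signed with HMAC-SHA256."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_token(token, secret):
    """Check signature, then expiry; return the claims on success."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = _b64(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid signature")
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if "exp" in payload and payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload
```

Note the order: verify the signature before trusting anything in the payload, including `exp`.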

Security Best Practices

1. Input validation:
   - Validate all input (type, length, format, range)
   - Use Pydantic models in FastAPI (automatic validation)
   - Escape/encode output to prevent XSS; SQL injection is stopped by parameterized queries (item 5)

2. Rate limiting:
   - Per-user/IP request limits (429 Too Many Requests)
   - Sliding window or token bucket algorithm
   - Implement at API gateway level

3. CORS:
   - Configure allowed origins, methods, headers
   - Don't use Access-Control-Allow-Origin: * on authenticated endpoints

4. HTTPS everywhere:
   - SSL/TLS termination at load balancer
   - HSTS header to enforce HTTPS
   - Redirect HTTP → HTTPS

5. SQL injection prevention:
   - Parameterized queries (always!)
   - ORMs handle this (SQLAlchemy, Django ORM)
   - Never concatenate user input into SQL strings

6. Secrets management:
   - Environment variables (not in code)
   - AWS Secrets Manager, HashiCorp Vault
   - .env files for local development (never commit to git)

7. Logging:
   - Never log passwords, tokens, PII
   - Log request IDs for tracing
   - Sanitize error messages in responses (don't expose stack traces)
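
The token bucket from item 2 can be sketched in a few lines (illustrative, single-process; a real deployment would keep the bucket state in Redis or at the gateway):

```python
import time

class TokenBucket:
    """Token-bucket limiter: refills `rate` tokens/sec, allows bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        """Consume one token if available; caller responds 429 when this is False."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The burst allowance (capacity) versus the steady rate is the key tuning knob: a sliding-window counter lacks that distinction.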

API Performance & Reliability

Caching Strategies

HTTP Caching:
  - Cache-Control: max-age=3600 (browser caches for 1 hour)
  - ETag: "abc123" → conditional GET with If-None-Match → 304 Not Modified
  - Vary: Accept-Encoding (cache separately per encoding)

Application Caching:
  - Redis for API response caching
  - Cache key: hash of request parameters
  - TTL based on data freshness requirements
  - Cache invalidation: TTL, event-driven, or hybrid

CDN Caching:
  - Static assets: aggressive caching with content hashing in URLs
  - API responses: short TTL for dynamic data
  - Edge caching for geographically distributed users
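
The ETag / If-None-Match handshake above can be sketched framework-independently (hypothetical helper names; a strong ETag here is just a hash of the body):

```python
import hashlib

def make_etag(body):
    """Strong ETag derived from the response body bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body, if_none_match=None):
    """Return (status, body): 304 with empty body when the client's
    If-None-Match matches the current ETag, else 200 with the full body."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b""   # client's cached copy is still fresh
    return 200, body      # send full body; response should carry ETag header
```

The win is bandwidth, not server work: the handler still runs, but a matching ETag turns the response into an empty 304.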

Resilience Patterns

1. Retry with Exponential Backoff:
   - 1st retry: 1s, 2nd: 2s, 3rd: 4s, 4th: 8s
   - Add jitter (random ±20%) to avoid thundering herd
   - Set max retries (3-5)
   - Only retry transient errors (5xx, timeouts, 429 after waiting), not other 4xx

2. Circuit Breaker:
   - States: Closed (normal) → Open (failing, reject calls) → Half-Open (test)
   - Tracks failure rate over a time window
   - Prevents cascading failures when downstream service is down
   - Tools: pybreaker (Python), resilience4j (Java), or a small custom implementation

3. Timeout:
   - Always set timeouts on external calls
   - Connection timeout (short: 1-5s)
   - Read timeout (varies: 5-30s)
   - Overall request timeout at API gateway

4. Bulkhead:
   - Isolate different services/endpoints
   - Failure in one doesn't exhaust resources for others
   - Thread pools, connection pools, queue limits

5. Graceful Degradation:
   - Return cached data when service is down
   - Reduce feature set under load
   - Return partial results instead of failing entirely

6. Health Checks:
   - /health: basic service health
   - /health/ready: dependencies healthy (DB, cache, etc.)
   - Used by load balancers and orchestrators
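
A minimal circuit breaker illustrating the closed → open → half-open cycle from item 2 (thresholds and names are illustrative, not a production implementation):

```python
import time

class CircuitBreaker:
    """Opens after `threshold` consecutive failures; after `reset_timeout`
    seconds it half-opens and lets one trial call through."""

    def __init__(self, threshold=5, reset_timeout=30.0):
        self.threshold = threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None => closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open — failing fast")
            # reset_timeout elapsed: half-open, allow this one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0        # success closes the circuit again
        self.opened_at = None
        return result
```

Failing fast while open is the whole point: callers stop piling timed-out requests onto a downstream service that is already struggling.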

Request ID / Distributed Tracing

Every request gets a unique ID (UUID):
  - Passed in header: X-Request-ID
  - Included in all log entries
  - Passed to downstream services
  - Enables tracing a request across services

Tools: OpenTelemetry, Jaeger, Zipkin, AWS X-Ray
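
A sketch of the request-ID plumbing (framework-agnostic; in FastAPI this would live in middleware). The context variable lets log formatters pick up the ID without threading it through every call:

```python
import contextvars
import uuid

# Readable from anywhere on the same request's async task / thread
request_id_var = contextvars.ContextVar("request_id", default=None)

def ensure_request_id(headers):
    """Reuse the caller's X-Request-ID if present, else mint a UUID.
    Stash it for loggers; return it to echo on the response and to
    forward in headers to downstream services."""
    rid = headers.get("X-Request-ID") or str(uuid.uuid4())
    request_id_var.set(rid)
    return rid
```

Honouring an incoming ID (rather than always minting one) is what lets the same ID tie log lines together across service hops.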

Backend Architecture Patterns

Layered Architecture

Controller / Route Layer
  → Handles HTTP request/response
  → Input validation (Pydantic models)
  → Authentication check
  
Service Layer
  → Business logic
  → Orchestrates repositories and external services
  → Transaction management
  
Repository / Data Access Layer
  → Database queries
  → Data mapping (ORM ↔ domain objects)
  → Caching logic
  
Infrastructure Layer
  → Database connections
  → External API clients
  → Message queue producers/consumers

Dependency Injection

# FastAPI's DI pattern (the User model and the get_db/get_redis dependency
# providers are assumed to be defined elsewhere in the app)
from fastapi import Depends, FastAPI
from redis.asyncio import Redis
from sqlalchemy.ext.asyncio import AsyncSession

app = FastAPI()

class UserService:
    def __init__(self, db: AsyncSession, cache: Redis):
        self.db = db
        self.cache = cache

    async def get_user(self, user_id: int):
        # Cache-aside: try Redis first, fall back to the DB, then populate
        cached = await self.cache.get(f"user:{user_id}")
        if cached:
            return User.parse_raw(cached)  # Pydantic v1; v2: User.model_validate_json
        user = await self.db.get(User, user_id)
        if user:
            # 5-minute TTL; v2: user.model_dump_json()
            await self.cache.set(f"user:{user_id}", user.json(), ex=300)
        return user

async def get_user_service(
    db: AsyncSession = Depends(get_db),
    cache: Redis = Depends(get_redis),
) -> UserService:
    return UserService(db, cache)

@app.get("/users/{user_id}")
async def get_user(user_id: int, service: UserService = Depends(get_user_service)):
    return await service.get_user(user_id)

# Benefits:
# - Testable (mock dependencies)
# - Loosely coupled
# - FastAPI manages dependency lifecycle (use `yield` in providers for cleanup)

Microservices vs Monolith

Monolith:
  Pros: Simple, easy to develop/test/deploy, no network calls
  Cons: Hard to scale independently, long deployment cycles, team coupling
  When: Early-stage startup, small team, unclear boundaries

Microservices:
  Pros: Independent scaling/deployment, team autonomy, tech diversity
  Cons: Network complexity, distributed transactions, operational overhead
  When: Clear service boundaries, multiple teams, specific scaling needs

Your answer: "At Intensel, we have a modular monolith with clear service
boundaries internally. For compute-intensive jobs, we separated the
processing workers (Dask) from the API service, connected via RabbitMQ.
This gave us independent scaling for the processing layer without the full
microservices overhead."

Monolith → Microservices path:
  1. Start with a monolith (startup speed)
  2. Identify natural boundaries as you grow
  3. Extract services one at a time (Strangler Fig pattern)
  4. Use API gateway to route between old and new

Communication Patterns

Synchronous:
  - REST: simple, well-understood, HTTP-based
  - gRPC: binary protocol, faster, strong typing (Protobuf)
  - When: request-response pattern, need immediate answer

Asynchronous:
  - Message queues (RabbitMQ, SQS): work items, task queues
  - Event streaming (Kafka): event log, replay-ability, high throughput
  - Pub/Sub (Redis, SNS): broadcast events to multiple consumers
  - When: decouple services, handle spikes, fire-and-forget

Background Jobs & Queues

You have deep experience here. Articulate it well.

Architecture

API Service → Job Submission → Queue (RabbitMQ/SQS) → Workers → Result Store

Job Lifecycle:
  SUBMITTED → QUEUED → PICKED_UP → RUNNING → SUCCESS / FAILED / RETRY

Key Design Decisions:
  1. Job serialization: JSON payload in queue message
  2. Idempotency: same job processed twice = same outcome
  3. Priority queues: critical jobs → high-priority queue
  4. Dead Letter Queue (DLQ): permanently failed jobs for investigation
  5. Backpressure: don't accept jobs faster than workers can process
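
Decisions 2 and 4 can be sketched together in a toy worker loop (stdlib queue stands in for RabbitMQ/SQS; names are illustrative):

```python
import queue

def run_worker(jobs, results, dead_letter, max_attempts=3):
    """Drain the queue once: skip job ids already in `results` (idempotency),
    requeue transient failures, and move jobs that exhaust retries to the DLQ."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        if job["id"] in results:                  # idempotency: already processed
            continue
        try:
            results[job["id"]] = job["work"]()    # the actual processing step
        except Exception:
            job["attempts"] = job.get("attempts", 0) + 1
            if job["attempts"] >= max_attempts:
                dead_letter.append(job)           # permanent failure → investigate
            else:
                jobs.put(job)                     # transient → retry (add backoff in real use)
```

Tracking attempts on the message itself is what real brokers do too (e.g. delivery-count headers), so the DLQ decision survives worker restarts.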

Retry Strategies

# Exponential backoff with jitter
import random

def get_retry_delay(attempt: int, base: float = 1.0, max_delay: float = 60.0) -> float:
    delay = min(base * (2 ** attempt), max_delay)
    jitter = delay * random.uniform(0.8, 1.2)
    return jitter

# attempt 0: ~1s
# attempt 1: ~2s
# attempt 2: ~4s
# attempt 3: ~8s
# attempt 4: ~16s  (the 60s cap only kicks in from attempt 6: 2**6 = 64 → 60)

Job Queue Pattern (Your Experience)

Your story for interviews:
"I designed the async processing system at Intensel using RabbitMQ for
message queuing and Dask for distributed computation. Key design choices:

1. Separation of concerns: API just enqueues jobs, workers process them
2. Priority queues: climate risk assessments (customer-facing) get priority
   over batch data ingestion
3. Retry with exponential backoff: transient failures (network, API limits)
   retry automatically, permanent failures go to DLQ
4. Scheduling: cron-like scheduler submits periodic data refresh jobs
5. Monitoring: queue depth alerts, worker health checks, job duration tracking
6. Idempotency: jobs can be safely retried — processing uses upserts and
   checkpoints

This let us process multi-terabyte datasets while keeping APIs responsive
for hundreds of concurrent users."

Observability

The Three Pillars

1. Logging:
   - Structured logs (JSON): {"level": "INFO", "request_id": "abc", "msg": "..."}
   - Log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
   - Include: request_id, user_id, action, duration, error details
   - Tools: ELK stack, CloudWatch Logs, Datadog

2. Metrics:
   - Counters: total requests, errors, jobs processed
   - Gauges: active connections, queue depth, memory usage
   - Histograms: request latency distribution, response sizes
   - Key metrics: p50, p95, p99 latency, error rate, throughput
   - Tools: Prometheus + Grafana, CloudWatch Metrics, Datadog

3. Tracing:
   - Follow a request across services
   - Spans: start time, duration, parent span
   - Trace ID propagated in headers
   - Tools: OpenTelemetry, Jaeger, AWS X-Ray
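
A sketch of structured logging from pillar 1, as a custom `logging.Formatter` (field names are illustrative; libraries like structlog or python-json-logger do this for you):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object so log aggregators can index fields."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Fields attached via logging's `extra=` kwarg end up as record attributes
        for key in ("request_id", "user_id", "duration_ms"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)
```

Usage: `logger.info("user fetched", extra={"request_id": rid, "duration_ms": 12})` — the formatter turns that into one searchable JSON line.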

Alerting:
  - Error rate > 1% for 5 minutes
  - p99 latency > 2s for 5 minutes
  - Queue depth growing (consumers falling behind)
  - Disk usage > 80%
  - Health check failures

Common Interview Questions

Q: How would you design an API for a new feature?
A: 1. Understand requirements (who calls it, how often, what data)
   2. Define resource and endpoints (REST conventions)
   3. Define request/response schemas (Pydantic models)
   4. Choose auth method (JWT for users, API keys for services)
   5. Add pagination, filtering, sorting
   6. Plan error handling (consistent error responses)
   7. Document with OpenAPI (auto-generated in FastAPI)
   8. Add rate limiting, caching where appropriate

Q: How do you handle breaking API changes?
A: Version the API (URL versioning), maintain both versions during
   migration period, document deprecation timeline, provide migration
   guide, set sunset header, monitor old version usage.

Q: How do you ensure API reliability?
A: Rate limiting, circuit breakers, retries with backoff, timeouts,
   health checks, graceful degradation, caching fallbacks, monitoring
   with alerting, load testing, chaos testing.

Q: REST vs GraphQL?
A: REST: simple, well-understood, cacheable, good for resource-oriented APIs.
   GraphQL: flexible queries (no over/under-fetching), good for complex UIs
   with varied data needs. REST for most backends; GraphQL if clients need
   very different data shapes from same backend.

Q: How do you handle file uploads?
A: Small files: multipart form data directly to API → S3.
   Large files: presigned S3 URL → client uploads directly to S3 →
   webhook/notification to backend to process.

Q: What is idempotency and why does it matter?
A: Making the same request N times produces the same result.
   Matters for: retries (network failures), webhook delivery,
   payment processing. Implement with: idempotency keys (client-sent
   unique ID), upserts, checking before creating.
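
   The idempotency-key pattern in that answer, sketched with a dict standing in for a durable store keyed by the client-supplied ID (names are illustrative):

```python
def handle_payment(request_body, idempotency_key, store):
    """Process a charge at most once per idempotency key: a replayed
    request (e.g. a client retry after a timeout) returns the original
    result instead of charging again."""
    if idempotency_key in store:
        return store[idempotency_key]      # replay: no second charge
    result = {"charge_id": f"ch_{idempotency_key}", "amount": request_body["amount"]}
    store[idempotency_key] = result        # record before acking the client
    return result
```

   In production the store must be durable and the check-and-insert atomic (e.g. a unique constraint on the key column), or two concurrent retries can still double-process.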

Resources


My Notes

APIs I've designed (for interview stories):
-

Architecture decisions I can explain:
-

Patterns I use daily:
-

Next: 06-distributed-systems.md