05 — Backend & API Design Interview Guide
Priority: HIGH — This is your daily work. You need to articulate best practices fluently.
Table of Contents
- REST API Design
- API Authentication & Security
- API Performance & Reliability
- Backend Architecture Patterns
- Microservices vs Monolith
- Background Jobs & Queues
- Observability
- Common Interview Questions
- Resources
REST API Design
HTTP Methods & Semantics
GET /users → List users (safe, idempotent, cacheable)
GET /users/123 → Get user 123 (safe, idempotent, cacheable)
POST /users → Create user (not idempotent)
PUT /users/123 → Replace user 123 (idempotent — full replacement)
PATCH /users/123 → Update user 123 (partial update)
DELETE /users/123 → Delete user 123 (idempotent)
Safe: doesn't modify state (GET, HEAD, OPTIONS)
Idempotent: repeating the same request N times has the same effect as once (GET, PUT, DELETE)
Non-idempotent: POST (each call may create a new resource)
URL Design Best Practices
✓ Use nouns, not verbs: /users (not /getUsers)
✓ Use plural: /users (not /user)
✓ Use kebab-case: /user-profiles (not /userProfiles)
✓ Nest for relationships: /users/123/orders
✓ Use query params for filtering: /users?role=admin&active=true
✓ Consistent naming: don't mix /get-users and /users/list
✗ Don't expose internal IDs if possible (use UUIDs or slugs)
✗ Don't nest too deep: /a/1/b/2/c/3/d/4 is too much (max 2-3 levels)
✗ Don't use query params for resource identification: /users?id=123
Status Codes
2xx Success:
200 OK — successful GET, PUT, PATCH, DELETE
201 Created — successful POST (return Location header)
202 Accepted — request accepted, processed async
204 No Content — successful DELETE (no response body)
3xx Redirection:
301 Moved Permanently — SEO-friendly redirect (cached by browser)
302 Found — temporary redirect
304 Not Modified — conditional GET, use cached version
4xx Client Errors:
400 Bad Request — malformed request, validation failure
401 Unauthorized — not authenticated (missing/invalid token)
403 Forbidden — authenticated but not authorized
404 Not Found — resource doesn't exist
405 Method Not Allowed — wrong HTTP method
409 Conflict — duplicate resource, version conflict
422 Unprocessable Entity — syntactically valid but semantically wrong (e.g., fails business rules)
429 Too Many Requests — rate limited
5xx Server Errors:
500 Internal Server Error — unexpected server error
502 Bad Gateway — upstream service failed
503 Service Unavailable — temporarily overloaded
504 Gateway Timeout — upstream service timeout
Pagination
Option 1: Offset-based (simple, most common)
GET /users?page=2&per_page=20
Response: { data: [...], total: 500, page: 2, per_page: 20, total_pages: 25 }
Pros: Simple, supports jumping to any page
Cons: Slow for large offsets (OFFSET 10000 still scans and discards the skipped rows), inconsistent if rows are inserted/deleted between pages
Option 2: Cursor-based (better for large datasets)
GET /users?cursor=eyJpZCI6MTIzfQ&limit=20
Response: { data: [...], next_cursor: "eyJpZCI6MTQzfQ", has_more: true }
Pros: Consistent, fast (uses WHERE id > cursor)
Cons: Can't jump to arbitrary page, harder to implement
Option 3: Keyset pagination (similar to cursor, explicit)
GET /users?after_id=123&limit=20
→ SELECT * FROM users WHERE id > 123 ORDER BY id LIMIT 20;
Pros: Simple, efficient, uses index
Cons: Only works with sortable, unique keys
Your experience: "I implemented cursor-based pagination for our APIs
serving hundreds of concurrent users, because offset-based pagination
became slow with large datasets. The cursor encodes the last-seen
primary key, enabling efficient indexed lookups."
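The cursor encoding described above can be sketched in a few lines (base64-encoded JSON wrapping the last-seen id; the field name and helper names are illustrative):

```python
import base64
import json

def encode_cursor(last_id: int) -> str:
    """Encode the last-seen primary key as an opaque, URL-safe cursor."""
    return base64.urlsafe_b64encode(json.dumps({"id": last_id}).encode()).decode()

def decode_cursor(cursor: str) -> int:
    """Decode the cursor back into the id for the WHERE id > :id query."""
    return json.loads(base64.urlsafe_b64decode(cursor))["id"]

# The handler then runs something like:
#   SELECT * FROM users WHERE id > :id ORDER BY id LIMIT :limit
```

Keeping the cursor opaque lets you change its contents later (e.g., add a sort key) without breaking clients.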
Versioning
Option 1: URL path versioning (most common)
GET /api/v1/users
GET /api/v2/users
Pros: Clear, easy to understand, easy to route
Cons: URL changes, hard to deprecate
Option 2: Header versioning
GET /api/users with header API-Version: 2
Pros: Clean URLs
Cons: Harder to test (can't bookmark/share), hidden
Option 3: Query parameter
GET /api/users?version=2
Pros: Simple to add
Cons: Not RESTful
Recommendation: URL path versioning for most APIs (v1, v2).
Keep v1 working while building v2. Set deprecation timeline.
Error Response Format
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid input",
    "details": [
      { "field": "email", "message": "Must be a valid email address" },
      { "field": "age", "message": "Must be between 0 and 150" }
    ]
  }
}
API Authentication & Security
Authentication Methods
1. API Keys:
- Simple, good for server-to-server
- Sent in header: X-API-Key: abc123
- Not suitable for user auth (no expiry, no scope)
2. JWT (JSON Web Tokens):
- Stateless: token contains claims, signed by server
- Structure: header.payload.signature (base64 encoded)
- Access token: short-lived (15 min), carried in Authorization header
- Refresh token: longer-lived (days), stored securely, used to get new access token
- Verify: check signature + expiry + issuer
- Revocation: tricky (token is stateless) — use short TTL + blacklist for critical cases
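To make the header.payload.signature structure concrete, here is a toy HS256 signer/verifier built only on the stdlib. This is educational, not production code — use a vetted library such as PyJWT in real systems:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims: dict, secret: bytes, ttl: int = 900) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "exp": int(time.time()) + ttl}  # short-lived access token
    signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(payload).encode())}"
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"

def verify_jwt(token: str, secret: bytes) -> dict:
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected), sig):   # check signature first
        raise ValueError("bad signature")
    payload_b64 = signing_input.split(".")[1]
    payload = json.loads(base64.urlsafe_b64decode(payload_b64 + "=" * (-len(payload_b64) % 4)))
    if payload["exp"] < time.time():                     # then check expiry
        raise ValueError("token expired")
    return payload
```

The verify order matters: signature before any claim checks, so an attacker cannot learn anything from a forged token.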
3. OAuth 2.0:
- Delegated authorization framework
- Flows: Authorization Code (web), Client Credentials (server-to-server),
PKCE (SPAs/mobile)
- Used for: "Login with Google", third-party API access
4. Session-based:
- Server stores session in DB/Redis
- Client sends session cookie
- Stateful: server must look up session on every request
- Simpler but doesn't scale horizontally as easily
Security Best Practices
1. Input validation:
- Validate all input (type, length, format, range)
- Use Pydantic models in FastAPI (automatic validation)
- Sanitize to prevent SQL injection, XSS
2. Rate limiting:
- Per-user/IP request limits (429 Too Many Requests)
- Sliding window or token bucket algorithm
- Implement at API gateway level
3. CORS:
- Configure allowed origins, methods, headers
- Don't use Access-Control-Allow-Origin: * in production
4. HTTPS everywhere:
- SSL/TLS termination at load balancer
- HSTS header to enforce HTTPS
- Redirect HTTP → HTTPS
5. SQL injection prevention:
- Parameterized queries (always!)
- ORMs handle this (SQLAlchemy, Django ORM)
- Never concatenate user input into SQL strings
6. Secrets management:
- Environment variables (not in code)
- AWS Secrets Manager, HashiCorp Vault
- .env files for local development (never commit to git)
7. Logging:
- Never log passwords, tokens, PII
- Log request IDs for tracing
- Sanitize error messages in responses (don't expose stack traces)
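The token-bucket limiter from point 2 above can be sketched as follows (in-memory and per-process; production versions usually keep the bucket state in Redis or at the gateway):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling at `rate` tokens/sec."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, never above capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller responds with 429 Too Many Requests
```

One bucket per user or per IP gives the per-client limits described above.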
API Performance & Reliability
Caching Strategies
HTTP Caching:
- Cache-Control: max-age=3600 (browser caches for 1 hour)
- ETag: "abc123" → conditional GET with If-None-Match → 304 Not Modified
- Vary: Accept-Encoding (cache a separate copy per encoding)
Application Caching:
- Redis for API response caching
- Cache key: hash of request parameters
- TTL based on data freshness requirements
- Cache invalidation: TTL, event-driven, or hybrid
CDN Caching:
- Static assets: aggressive caching with content hashing in URLs
- API responses: short TTL for dynamic data
- Edge caching for geographically distributed users
Resilience Patterns
1. Retry with Exponential Backoff:
- 1st retry: 1s, 2nd: 2s, 3rd: 4s, 4th: 8s
- Add jitter (random ±20%) to avoid thundering herd
- Set max retries (3-5)
- Only retry on transient errors (5xx, timeout), not 4xx
2. Circuit Breaker:
- States: Closed (normal) → Open (failing, reject calls) → Half-Open (test)
- Tracks failure rate over a time window
- Prevents cascading failures when downstream service is down
- Tools: pybreaker (Python), resilience4j (Java), or a custom implementation
3. Timeout:
- Always set timeouts on external calls
- Connection timeout (short: 1-5s)
- Read timeout (varies: 5-30s)
- Overall request timeout at API gateway
4. Bulkhead:
- Isolate different services/endpoints
- Failure in one doesn't exhaust resources for others
- Thread pools, connection pools, queue limits
5. Graceful Degradation:
- Return cached data when service is down
- Reduce feature set under load
- Return partial results instead of failing entirely
6. Health Checks:
- /health: basic service health
- /health/ready: dependencies healthy (DB, cache, etc.)
- Used by load balancers and orchestrators
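Pattern 2 (circuit breaker) in sketch form, assuming a simple consecutive-failure threshold rather than a failure rate over a window:

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures: int = 5, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: float | None = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            return "half-open"  # allow one trial call through
        return "open"

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            raise RuntimeError("circuit open — failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0       # any success resets the breaker
        self.opened_at = None
        return result
```

Failing fast while open is the point: callers get an immediate error instead of piling timed-out requests onto a struggling downstream service.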
Request ID / Distributed Tracing
Every request gets a unique ID (UUID):
- Passed in header: X-Request-ID
- Included in all log entries
- Passed to downstream services
- Enables tracing a request across services
Tools: OpenTelemetry, Jaeger, Zipkin, AWS X-Ray
Backend Architecture Patterns
Layered Architecture
Controller / Route Layer
→ Handles HTTP request/response
→ Input validation (Pydantic models)
→ Authentication check
Service Layer
→ Business logic
→ Orchestrates repositories and external services
→ Transaction management
Repository / Data Access Layer
→ Database queries
→ Data mapping (ORM ↔ domain objects)
→ Caching logic
Infrastructure Layer
→ Database connections
→ External API clients
→ Message queue producers/consumers
Dependency Injection
# FastAPI's DI pattern
from fastapi import Depends

class UserService:
    def __init__(self, db: AsyncSession, cache: Redis):
        self.db = db
        self.cache = cache

    async def get_user(self, user_id: int) -> User:
        cached = await self.cache.get(f"user:{user_id}")
        if cached:
            return User.parse_raw(cached)
        user = await self.db.get(User, user_id)
        if user:
            await self.cache.set(f"user:{user_id}", user.json(), ex=300)
        return user

async def get_user_service(
    db: AsyncSession = Depends(get_db),
    cache: Redis = Depends(get_redis),
) -> UserService:
    return UserService(db, cache)

@app.get("/users/{user_id}")
async def get_user(user_id: int, service: UserService = Depends(get_user_service)):
    return await service.get_user(user_id)

# Benefits:
# - Testable (mock dependencies)
# - Loosely coupled
# - FastAPI manages lifecycle (yield dependencies for cleanup)
Microservices vs Monolith
Monolith:
Pros: Simple, easy to develop/test/deploy, no network calls
Cons: Hard to scale independently, long deployment cycles, team coupling
When: Early-stage startup, small team, unclear boundaries
Microservices:
Pros: Independent scaling/deployment, team autonomy, tech diversity
Cons: Network complexity, distributed transactions, operational overhead
When: Clear service boundaries, multiple teams, specific scaling needs
Your answer: "At Intensel, we have a modular monolith with clear service
boundaries internally. For compute-intensive jobs, we separated the
processing workers (Dask) from the API service, connected via RabbitMQ.
This gave us independent scaling for the processing layer without the full
microservices overhead."
Monolith → Microservices path:
1. Start with a monolith (startup speed)
2. Identify natural boundaries as you grow
3. Extract services one at a time (Strangler Fig pattern)
4. Use API gateway to route between old and new
Communication Patterns
Synchronous:
- REST: simple, well-understood, HTTP-based
- gRPC: binary protocol, faster, strong typing (Protobuf)
- When: request-response pattern, need immediate answer
Asynchronous:
- Message queues (RabbitMQ, SQS): work items, task queues
- Event streaming (Kafka): event log, replay-ability, high throughput
- Pub/Sub (Redis, SNS): broadcast events to multiple consumers
- When: decouple services, handle spikes, fire-and-forget
Background Jobs & Queues
You have deep experience here. Articulate it well.
Architecture
API Service → Job Submission → Queue (RabbitMQ/SQS) → Workers → Result Store
Job Lifecycle:
SUBMITTED → QUEUED → PICKED_UP → RUNNING → SUCCESS / FAILED / RETRY
Key Design Decisions:
1. Job serialization: JSON payload in queue message
2. Idempotency: same job processed twice = same outcome
3. Priority queues: critical jobs → high-priority queue
4. Dead Letter Queue (DLQ): permanently failed jobs for investigation
5. Backpressure: don't accept jobs faster than workers can process
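The lifecycle, retry, and DLQ behavior above can be sketched with an in-memory queue standing in for RabbitMQ/SQS (constants and names are illustrative):

```python
import queue

MAX_ATTEMPTS = 3

def run_worker(jobs: "queue.Queue[dict]", handler, dead_letter: list) -> None:
    """Drain the queue; transient failures re-queue, exhausted jobs go to the DLQ."""
    while not jobs.empty():
        job = jobs.get()                       # QUEUED -> PICKED_UP
        try:
            handler(job)                       # RUNNING -> SUCCESS
        except Exception:
            job["attempts"] = job.get("attempts", 0) + 1
            if job["attempts"] >= MAX_ATTEMPTS:
                dead_letter.append(job)        # FAILED -> DLQ for investigation
            else:
                jobs.put(job)                  # RETRY (with backoff in a real system)
```

Because retries can replay a job, the handler must be idempotent, which is exactly why point 2 above matters.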
Retry Strategies
# Exponential backoff with jitter
import random

def get_retry_delay(attempt: int, base: float = 1.0, max_delay: float = 60.0) -> float:
    delay = min(base * (2 ** attempt), max_delay)
    jitter = delay * random.uniform(0.8, 1.2)  # ±20% to avoid thundering herd
    return jitter

# attempt 0: ~1s
# attempt 1: ~2s
# attempt 2: ~4s
# attempt 3: ~8s
# attempt 4: ~16s
# ...doubling each attempt until capped at max_delay (60s)
Job Queue Pattern (Your Experience)
Your story for interviews:
"I designed the async processing system at Intensel using RabbitMQ for
message queuing and Dask for distributed computation. Key design choices:
1. Separation of concerns: API just enqueues jobs, workers process them
2. Priority queues: climate risk assessments (customer-facing) get priority
over batch data ingestion
3. Retry with exponential backoff: transient failures (network, API limits)
retry automatically, permanent failures go to DLQ
4. Scheduling: cron-like scheduler submits periodic data refresh jobs
5. Monitoring: queue depth alerts, worker health checks, job duration tracking
6. Idempotency: jobs can be safely retried — processing uses upserts and
checkpoints
This let us process multi-terabyte datasets while keeping APIs responsive
for hundreds of concurrent users."
Observability
The Three Pillars
1. Logging:
- Structured logs (JSON): {"level": "INFO", "request_id": "abc", "msg": "..."}
- Log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
- Include: request_id, user_id, action, duration, error details
- Tools: ELK stack, CloudWatch Logs, Datadog
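A minimal stdlib formatter producing the structured shape above (a sketch; libraries like structlog or python-json-logger do this more thoroughly):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record; extra fields arrive via `extra=`."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "msg": record.getMessage(),
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(entry)

logger = logging.getLogger("api")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.warning("slow query", extra={"request_id": "abc"})
```

Structured output means the log aggregator can filter on fields (level, request_id) instead of grepping free text.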
2. Metrics:
- Counters: total requests, errors, jobs processed
- Gauges: active connections, queue depth, memory usage
- Histograms: request latency distribution, response sizes
- Key metrics: p50, p95, p99 latency, error rate, throughput
- Tools: Prometheus + Grafana, CloudWatch Metrics, Datadog
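The percentile math behind p50/p95/p99 can be done with the stdlib (statistics.quantiles returns the cut points; the helper name is mine — real metrics systems compute this over histogram buckets, not raw samples):

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from raw latency samples in milliseconds."""
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

p99 matters because averages hide tail latency: one slow dependency can leave the mean looking fine while 1% of users wait seconds.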
3. Tracing:
- Follow a request across services
- Spans: start time, duration, parent span
- Trace ID propagated in headers
- Tools: OpenTelemetry, Jaeger, AWS X-Ray
Alerting:
- Error rate > 1% for 5 minutes
- p99 latency > 2s for 5 minutes
- Queue depth growing (consumers falling behind)
- Disk usage > 80%
- Health check failures
Common Interview Questions
Q: How would you design an API for a new feature?
A: 1. Understand requirements (who calls it, how often, what data)
2. Define resource and endpoints (REST conventions)
3. Define request/response schemas (Pydantic models)
4. Choose auth method (JWT for users, API keys for services)
5. Add pagination, filtering, sorting
6. Plan error handling (consistent error responses)
7. Document with OpenAPI (auto-generated in FastAPI)
8. Add rate limiting, caching where appropriate
Q: How do you handle breaking API changes?
A: Version the API (URL versioning), maintain both versions during
migration period, document deprecation timeline, provide migration
guide, set sunset header, monitor old version usage.
Q: How do you ensure API reliability?
A: Rate limiting, circuit breakers, retries with backoff, timeouts,
health checks, graceful degradation, caching fallbacks, monitoring
with alerting, load testing, chaos testing.
Q: REST vs GraphQL?
A: REST: simple, well-understood, cacheable, good for resource-oriented APIs.
GraphQL: flexible queries (no over/under-fetching), good for complex UIs
with varied data needs. REST for most backends; GraphQL if clients need
very different data shapes from same backend.
Q: How do you handle file uploads?
A: Small files: multipart form data directly to API → S3.
Large files: presigned S3 URL → client uploads directly to S3 →
webhook/notification to backend to process.
Q: What is idempotency and why does it matter?
A: Making the same request N times produces the same result.
Matters for: retries (network failures), webhook delivery,
payment processing. Implement with: idempotency keys (client-sent
unique ID), upserts, checking before creating.
Resources
- RESTful API Design: https://restfulapi.net/
- HTTP Status Codes: https://httpstatuses.com/
- API Design Guide (Google): https://cloud.google.com/apis/design
- FastAPI Best Practices: https://github.com/zhanymkanov/fastapi-best-practices
- Microservices Patterns by Chris Richardson
- Building Microservices by Sam Newman
My Notes
APIs I've designed (for interview stories):
-
Architecture decisions I can explain:
-
Patterns I use daily:
-