A pattern is a system shape — read-heavy, write-heavy, fan-out, long-running, real-time. Interviewers recognise these on sight and expect you to pattern-match fast. Each write-up covers when you reach for it, the canonical skeleton, a scaling path, and the failure modes that kill it.
How traffic is distributed between reads and writes drives every other decision.
Reads dominate writes by 10:1 or more. Every layer exists to keep the primary out of the hot path.
You see it when: The interviewer states or implies a read:write ratio of 10:1, 100:1, or higher
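One way to picture "keep the primary out of the hot path" is a cache-aside read. The sketch below is illustrative only: `Primary` and `CacheAside` are hypothetical names, the cache is a plain dict standing in for something like Redis or Memcached, and eviction and TTLs are left out.

```python
class Primary:
    """Stand-in for the primary database; counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


class CacheAside:
    """Read path checks the cache first and only falls through on a miss."""

    def __init__(self, primary):
        self.primary = primary
        self.cache = {}  # assumption: unbounded, no TTL, single process

    def get(self, key):
        if key in self.cache:
            return self.cache[key]          # hit: primary is not touched
        value = self.primary.get(key)       # miss: one read against the primary
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.primary.rows[key] = value      # write goes to the primary
        self.cache.pop(key, None)           # invalidate so the next read refills


primary = Primary({"user:1": "Ada"})
store = CacheAside(primary)
for _ in range(100):
    store.get("user:1")
print(primary.reads)  # the 100 reads cost the primary a single lookup
```

Invalidating on write rather than updating the cache in place is a design choice made here for simplicity; it trades one extra miss for never serving a value the primary has already replaced.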
Writes arrive faster than any single node can persist them synchronously. The design is about absorbing, spreading, and deferring them.
You see it when: Write QPS is 10K+ per region and climbing
The boring default. Synchronous HTTP with a cache. Works for 70% of APIs and you should say so.
You see it when: Standard CRUD APIs — user profile, settings, admin consoles
Synchronous, asynchronous, real-time — how work flows through the system.
Accept work with 202 + job_id, process asynchronously, and let clients track progress via poll, push, or webhook.
You see it when: Processing takes 10 seconds to hours — far longer than HTTP timeout budgets
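The 202 + job_id handshake can be sketched in a few lines. This is a minimal in-memory model, not a real HTTP service: `JobService`, its method names, and the state strings are assumptions made for illustration, and a real system would persist jobs and run workers in separate processes.

```python
import uuid


class JobService:
    """In-memory model of accept-now, process-later."""

    def __init__(self):
        self.jobs = {}

    def submit(self, payload):
        # Accept the work immediately and hand back an id to track it.
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"state": "queued", "payload": payload, "result": None}
        return 202, {"job_id": job_id}

    def run_next(self, handler):
        # Stand-in for a background worker picking up one queued job.
        for job in self.jobs.values():
            if job["state"] == "queued":
                job["state"] = "running"
                job["result"] = handler(job["payload"])
                job["state"] = "done"
                return True
        return False

    def status(self, job_id):
        # What a polling client would call.
        job = self.jobs.get(job_id)
        if job is None:
            return 404, {}
        return 200, {"state": job["state"], "result": job["result"]}


svc = JobService()
code, body = svc.submit({"n": 21})
print(code, svc.status(body["job_id"])[1]["state"])
svc.run_next(lambda payload: payload["n"] * 2)
print(svc.status(body["job_id"])[1])
```

Polling is shown because it is the simplest of the three tracking options; push or webhook delivery would replace the `status` call with a callback fired when the state changes.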
Decouple rate of production from rate of consumption with a durable queue and autoscaled workers.
You see it when: Synchronous call does work the user does not need to wait on (email send, image resize, index update)
Coordinate multi-service workflows via compensating transactions instead of distributed locks — choreography for simple flows, orchestration for everything else.
You see it when: Multi-service workflow (order → payment → inventory → shipping)
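A rough sketch of the orchestration variant: each step pairs an action with a compensating action, and a failure unwinds the completed steps in reverse order. `run_saga` and the step names are hypothetical; the sketch assumes compensations themselves succeed, which a production orchestrator cannot assume.

```python
def run_saga(steps):
    """steps: list of (name, action, compensate) tuples.

    Runs actions in order. If one raises, runs the compensations of the
    already-completed steps in reverse order. Returns (ok, log).
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"fail:{name}")
            for done_name, done_compensate in reversed(completed):
                done_compensate()
                log.append(f"undo:{done_name}")
            return False, log
        log.append(f"do:{name}")
        completed.append((name, compensate))
    return True, log


def out_of_stock():
    raise RuntimeError("no inventory")


ok, log = run_saga([
    ("order", lambda: None, lambda: None),
    ("payment", lambda: None, lambda: None),
    ("inventory", out_of_stock, lambda: None),
    ("shipping", lambda: None, lambda: None),
])
print(ok, log)
```

The failed step is not compensated because it never completed; only the order and payment steps are undone, and shipping is never attempted.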
Poll, long-poll, SSE, or WebSocket — the choice is about update frequency, direction, and how many persistent connections you can afford.
You see it when: Server-initiated updates to the client
One-to-many delivery, geographic distribution, and the trade-offs that come with them.
Where does the work live — at write time (push to every follower's inbox) or at read time (gather from each followed user)? Both break at the extremes; hybrids win.
You see it when: Social graph with asymmetric follow counts (power-law follower distribution)
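The hybrid can be sketched as: push posts into follower inboxes for ordinary authors, and pull at read time for authors above a follower threshold. Everything here is an illustrative assumption: the `Feed` class, the threshold of 3, the logical clock, and the in-memory dicts. Backfilling when someone follows an ordinary author after the fact is deliberately ignored.

```python
from collections import defaultdict


class Feed:
    def __init__(self, celebrity_threshold=3):
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.inbox = defaultdict(list)     # user -> pushed (ts, author, text)
        self.posts = defaultdict(list)     # author -> own (ts, author, text)
        self.threshold = celebrity_threshold
        self.clock = 0                     # logical timestamp for ordering

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def post(self, author, text):
        self.clock += 1
        item = (self.clock, author, text)
        self.posts[author].append(item)
        if not self.is_celebrity(author):
            # Fan-out on write: cost is proportional to follower count.
            for follower in self.followers[author]:
                self.inbox[follower].append(item)

    def timeline(self, user, limit=10):
        # A set removes duplicates if an author crossed the threshold
        # after some of their posts had already been pushed.
        items = set(self.inbox[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fan-out on read: gather from the few high-follower authors.
                items.update(self.posts[author])
        return sorted(items, reverse=True)[:limit]


feed = Feed(celebrity_threshold=3)
for fan in ("a", "b", "c"):
    feed.follow(fan, "star")
feed.follow("a", "bob")
feed.post("bob", "hi")
feed.post("star", "hello")
print(feed.timeline("a"))
```

The write for "star" touches no inboxes regardless of follower count, while the read for "a" only has to merge the handful of high-follower authors that user follows.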
The cheapest request is the one that never hits your origin. Push static and near-static content to the edge and let the CDN absorb 80–99% of reads.
You see it when: Public, shareable content (product pages, articles, media)
Geographic distribution for latency, DR, and compliance. Active-passive is operationally sane; active-active is a conflict-resolution project.
You see it when: Global user base with regional latency SLOs (<100 ms)
Problem-specific patterns you'll recognise on sight once you've seen them.
Inverted index + ranking service. The hard part isn't indexing — it's relevance and keeping the index fresh.
You see it when: Full-text search over a corpus (millions+ of docs)
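For reference, the indexing half really is small. The sketch below builds a toy inverted index with whitespace tokenisation and AND semantics; `build_index` and `search` are hypothetical names, and there is no stemming, ranking, or incremental update, which is where the real effort goes.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of docs containing every query term (no ranking)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Rate limiting with a token bucket",
    2: "Token based authentication",
    3: "Leaky bucket versus token bucket",
}
index = build_index(docs)
print(sorted(search(index, "token bucket")))
```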
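Geohash / S2 / H3 for "nearby X" queries. A plain B-tree on lat/lng scales poorly: a range scan on one coordinate cannot prune on the other, so bounding-box queries read far more rows than they return.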
You see it when: "Find N nearest" queries
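To show why a geohash helps, here is a small encoder written from the standard algorithm (interleave longitude and latitude bits, emit base32 characters). It is a sketch for intuition rather than a replacement for a library; the function name and default precision are assumptions. Nearby points tend to share a prefix, so "nearby" becomes a prefix lookup on an ordinary index, with the usual caveat that neighbours across a cell boundary need the adjacent cells checked too.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash(lat, lng, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash(57.64911, 10.40744, precision=5))  # u4pru
```

Each extra character narrows the cell, so the precision used for the lookup prefix sets the search radius.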
Chunked + resumable uploads direct to blob storage. Signed URLs. App server never touches the bytes.
You see it when: User-uploaded media (video, images, audio)
Feature store + candidate generation + ranking. Offline training, online serving. Separate the cheap retrieval from the expensive scoring.
You see it when: "Recommended for you" feeds
Token bucket + distributed counter. The token bucket math is trivial; the distribution is the hard part.
You see it when: Public API with tiered plans (free, paid)
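The token bucket arithmetic fits in a dozen lines, sketched below for a single process. `TokenBucket` and its parameters are illustrative names, and the clock is passed in explicitly so the behaviour is easy to reason about. The hard part, sharing this state across many nodes, is not shown.

```python
class TokenBucket:
    """Single-node token bucket: refill continuously, spend per request."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full so bursts are allowed
        self.last = now

    def allow(self, now, cost=1):
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
print([bucket.allow(now=0.0) for _ in range(3)])  # [True, True, False]
print(bucket.allow(now=1.0))                      # True
```

Tiered plans would map to different `capacity` and `refill_per_sec` values per API key; in a distributed setup the `tokens` and `last` fields would live in a shared store and be updated atomically.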
The disciplines that turn "it works" into "it stays working".
Redundancy + graceful degradation + operational discipline. You don't buy 99.99% — you earn it.
You see it when: Availability target >= 99.95% (about 4.4 hours of downtime per year, or less)
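The downtime budget behind an availability target is plain arithmetic, which makes it worth having at your fingertips. The helper below is a sketch that assumes a 365-day year and counts all downtime equally; the function name is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_budget_hours(availability_pct):
    """Allowed downtime per year, in hours, for a given availability %."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


for target in (99.9, 99.95, 99.99):
    hours = downtime_budget_hours(target)
    print(f"{target}% -> {hours:.2f} h/year ({hours * 60:.0f} min)")
```

Under these assumptions 99.95% allows roughly 4.4 hours per year and 99.99% allows under an hour, which is why each extra nine changes the operational posture rather than just the hardware bill.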
Raft / Paxos via etcd / ZooKeeper. When exactly-one-of-N must do the thing, use a consensus service — don't roll your own.
You see it when: Distributed locks