Add the operations your service exposes. Method, path, and status codes make your API much easier to review.
Policies that aren't specific to a single endpoint — auth, rate limits, versioning, and other notes.
Diagram
Draw the components and how traffic flows between them. Notes below the canvas are optional — the Walkthrough panel is the primary place to narrate flows.
Request walkthrough
Trace each core requirement as an ordered sequence of hops through your diagram. Use component names from your canvas for the From / To columns.
Given (subject, route), return allow / deny + retry-after metadata.
Limits are configurable per subject (user / API key / IP) and per route.
Support multiple windows simultaneously (e.g. 10/sec AND 100/min).
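The three requirements above can be sketched together as a single check function. This is a minimal in-memory illustration, assuming fixed windows and a plain dict in place of a real counter store; the names Window, Decision, and MultiWindowLimiter are hypothetical and not part of the exercise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    limit: int    # max requests allowed inside one window
    seconds: int  # window length in seconds


@dataclass
class Decision:
    allowed: bool
    remaining: int
    reset_at: float
    retry_after: float


class MultiWindowLimiter:
    """Fixed-window limiter that enforces several windows at once (sketch)."""

    def __init__(self, rules):
        # rules: {(subject_type, route): [Window, ...]} (hypothetical layout)
        self.rules = rules
        # (subject, route, window_seconds, window_start) -> request count
        self.counters = {}

    def check(self, subject_type, subject, route, now):
        # Raises KeyError if no rule is configured for this pair.
        windows = self.rules[(subject_type, route)]
        entries = []
        denied = False
        retry_after = 0.0
        for w in windows:
            start = int(now // w.seconds) * w.seconds
            key = (subject, route, w.seconds, start)
            if self.counters.get(key, 0) >= w.limit:
                denied = True
                # Wait for the slowest exhausted window to roll over.
                retry_after = max(retry_after, start + w.seconds - now)
            entries.append((key, w, start))
        if denied:
            return Decision(False, 0, now + retry_after, retry_after)
        # Only count the request once every window has agreed to allow it.
        remaining, reset_at = None, now
        for key, w, start in entries:
            self.counters[key] = self.counters.get(key, 0) + 1
            left = w.limit - self.counters[key]
            if remaining is None or left < remaining:
                remaining, reset_at = left, float(start + w.seconds)
        return Decision(True, remaining, reset_at, 0.0)
```

A request is denied if any window is exhausted, and retry_after reports the longest wait among the exhausted windows. A production version would need atomic increments and expiry of old counters, which this sketch leaves out.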
Storage schema
For each entity, declare how it's stored. The sharding key is the most consequential choice: pick it to match the access pattern you most need to optimise.
The thing being rate-limited (user, API key, or IP).
Rule binding: (subject_type, route) → (limit, window, algorithm).
Per-subject per-window counter state, key = (subject, route, window_start).
Result of one check call: { allowed, remaining, reset_at, retry_after }.
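One way to make the entities above concrete is to write down the rule record, the counter key, and a shard function. The sketch below assumes sharding by subject so that all of a subject's counters land on one shard; the "rl:" key prefix and the names LimitRule, counter_key, and shard_for are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitRule:
    subject_type: str    # "user", "api_key", or "ip"
    route: str
    limit: int
    window_seconds: int
    algorithm: str       # e.g. "fixed_window" or "token_bucket"


def counter_key(subject, route, window_start):
    # Key = (subject, route, window_start), flattened into one string.
    return f"rl:{subject}:{route}:{window_start}"


def shard_for(subject, num_shards):
    # Stable hash of the subject only (assumed sharding key), so every
    # counter for one subject is reachable with a single-shard lookup.
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Sharding by subject keeps a check to a single shard but concentrates a very active subject on one node; sharding by the full counter key spreads load at the cost of more cross-shard reads.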
Component choices
Pick one per row and give a one-line reason. These are the concrete technology decisions your diagram implies.
Where the decision is made — at the edge or a shared service.
Where per-subject state lives (e.g. rate-limit counters).
The core decision algorithm for this problem.
How traffic is distributed to your app servers.
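For the algorithm row, token bucket is one common candidate, alongside fixed window, sliding window, and leaky bucket. The sketch below is a single-process illustration only, with the caller passing in the clock; the class name and parameters are assumptions, not a prescribed answer.

```python
class TokenBucket:
    """Token bucket: allows bursts up to capacity, refills at a steady rate."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.updated_at = now

    def allow(self, now, cost=1.0):
        # Refill lazily from the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Seconds until enough tokens accumulate for this request.
        return False, (cost - self.tokens) / self.refill_per_sec
```

The second return value maps directly onto the retry_after field of the check result.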
Iterate on your design — don't start over.
Each scenario below probes a specific weakness in a typical HLD. Reference components from your diagram by name, describe what breaks and at what load, then name the minimum change that fixes it. Strong answers identify the precise failure mode — not just "scale it up".
A celebrity tweets a link and 100K req/sec all hit the same rate-limit key for that tweet. Your shard for that key melts. What do you do?
Probes: sharding, partitioning
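One possible answer to the hot-key scenario is to split the single counter into several sub-keys, each holding a slice of the limit, so the load spreads across shards. The sketch below assumes random sub-key selection and uses a dict to stand in for counters on different shards; SplitCounter, sub_key, and the "#" suffix are hypothetical.

```python
import random


def sub_key(base_key, fanout, rng=random):
    # Pick one of `fanout` sub-keys at random; each can live on its own shard.
    return f"{base_key}#{rng.randrange(fanout)}"


class SplitCounter:
    """Spreads one hot limit over several sub-counters (sketch)."""

    def __init__(self, limit, fanout):
        self.fanout = fanout
        # Each slot gets an equal slice; any remainder is left unused.
        self.per_slot_limit = limit // fanout
        self.slots = {}  # stands in for counters spread across shards

    def try_acquire(self, base_key, rng=random):
        key = sub_key(base_key, self.fanout, rng)
        if self.slots.get(key, 0) >= self.per_slot_limit:
            return False
        self.slots[key] = self.slots.get(key, 0) + 1
        return True
```

The total admitted never exceeds the limit, but a request can be denied while other slots still have room, so the limit becomes approximate on the low side. Other options worth weighing include local pre-aggregation in each app server before touching the store.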
You deploy in 3 regions, each with its own Redis. The customer says "100 requests per minute global". What is your honest answer?
Probes: consistency tradeoffs
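For the multi-region scenario, one honest option is to give up an exact global count and hand each region a fixed share of the limit. The sketch below assumes positive integer traffic weights per region; split_quota and the region names are hypothetical.

```python
def split_quota(global_limit, weights):
    """Divide a global limit across regions in proportion to their weights."""
    total = sum(weights.values())
    shares = {region: (global_limit * w) // total
              for region, w in weights.items()}
    # Hand the units lost to rounding to the heaviest regions first.
    leftover = global_limit - sum(shares.values())
    for region in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[region] += 1
    return shares
```

With static shares the global limit is never exceeded, but a customer whose traffic lands mostly in one region is throttled well below 100 per minute. The alternatives, such as a single authoritative counter or asynchronous cross-region sync, trade that under-allowance for added latency or temporary overshoot.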