Design a Chat System (WhatsApp)

Endpoints

Add the operations your service exposes. Method, path, and status codes make your API much easier to review.

Applies to all endpoints

Policies that aren't specific to a single endpoint — auth, rate limits, versioning, and other notes.

Authentication

Rate limiting

Versioning

Notes (idempotency, redirect choice, error shape, anything else)

Diagram

Draw the components and how traffic flows between them, so that the components and connections tell the core story. Notes below the canvas are optional but recommended; the Walkthrough panel is the primary place to narrate flows.

Components

Click palette → add

Drag edge dot → connect

Double-click node/edge → rename

Shift+drag or box-select → multi

Click any component on the left to add it here.

Drag the edge of a node to connect it to another.

Double-click any arrow to label it.

Notes (optional): flow narration, trade-offs, chosen alternatives

Request walkthrough

Trace each core requirement as an ordered sequence of hops through your diagram. Use component names from your canvas for the From / To columns.

1

Users can send and receive text messages in 1:1 conversations.

From | To | Action / payload

1.

2.

2

Users can create and participate in group chats (up to 100 members).

From | To | Action / payload

1.

2.

3

Messages sent while a user is offline are delivered when they reconnect (up to 30 days).

From | To | Action / payload

1.

2.
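As a starting point for this walkthrough, here is a minimal in-memory sketch of the offline-delivery requirement: messages for a disconnected user are held and handed over on reconnect, and anything older than 30 days is dropped. The class name `OfflineInbox` and its methods are hypothetical, and a real design would back this with a durable store rather than a Python dict.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

# 30-day retention window taken from the requirement above.
RETENTION_SECONDS = 30 * 24 * 60 * 60


class OfflineInbox:
    """Hypothetical per-user buffer for messages sent while the user is offline."""

    def __init__(self) -> None:
        # user_id -> queue of (enqueue_time_seconds, payload), oldest first
        self._pending: Dict[str, Deque[Tuple[float, str]]] = defaultdict(deque)

    def enqueue(self, user_id: str, payload: str, now: float) -> None:
        """Store a message for a user who is currently disconnected."""
        self._pending[user_id].append((now, payload))

    def drain(self, user_id: str, now: float) -> List[str]:
        """On reconnect, return unexpired messages in send order and clear the queue."""
        queue = self._pending.pop(user_id, deque())
        return [payload for ts, payload in queue if now - ts <= RETENTION_SECONDS]
```

Time is passed in explicitly so the retention rule is easy to test; in the walkthrough, the hop that calls `drain` would be the one triggered when a client reconnects.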

4

Users can send and receive media (images, files) in messages.

FromToAction / payload

1.

2.

Storage schema

For each entity, declare how it's stored. The sharding key is the interesting choice: pick it to match the access pattern you most need to optimise.

User

A registered user with one or more connected devices.

In-memory / derived

Storage type

Primary key

Sharding / partition key

Critical fields

Notes (indexes, TTL, access pattern)

Chat

A conversation — either 1:1 or group — with participant list and metadata.

In-memory / derived

Storage type

Primary key

Sharding / partition key

Critical fields

Notes (indexes, TTL, access pattern)

Message

A text or media payload sent within a chat, with sender, timestamp, and delivery status.

In-memory / derived

Storage type

Primary key

Sharding / partition key

Critical fields

Notes (indexes, TTL, access pattern)

Client

A specific device/session for a user (phone, tablet, desktop). A user may have multiple active clients.

In-memory / derived

Storage type

Primary key

Sharding / partition key

Critical fields

Notes (indexes, TTL, access pattern)
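To make the four entities above concrete before filling in the fields, the following is one possible sketch of them as Python dataclasses. Every field name here is an assumption for illustration; only the entity descriptions and the 100-member group limit come from the requirements. Treat it as a prompt for the "Critical fields" rows, not as the expected answer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

MAX_GROUP_MEMBERS = 100  # group size limit from the requirements


class DeliveryStatus(Enum):
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"


@dataclass
class User:
    user_id: str        # hypothetical primary key
    display_name: str


@dataclass
class Client:
    client_id: str      # one per device/session
    user_id: str        # owning user; a user may have several clients
    device_type: str    # e.g. "phone", "tablet", "desktop"


@dataclass
class Chat:
    chat_id: str
    participant_ids: List[str]
    is_group: bool = False

    def __post_init__(self) -> None:
        # Enforce the participant limit stated in the requirements.
        if len(self.participant_ids) > MAX_GROUP_MEMBERS:
            raise ValueError("a chat may have at most 100 members")


@dataclass
class Message:
    message_id: str
    chat_id: str        # one candidate partition key: keeps a chat's history together
    sender_id: str
    sent_at_ms: int
    text: Optional[str] = None
    media_url: Optional[str] = None   # assumed pointer to media stored elsewhere
    status: DeliveryStatus = DeliveryStatus.SENT
```

Partitioning messages by `chat_id` is only one option; partitioning by recipient instead would favour the offline-delivery read path, and the choice is worth justifying in the Notes row.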

Component choices

Pick one per row and give a one-line reason. These are the concrete technology decisions your diagram implies.

Load Balancer

How traffic is distributed to your app servers.

Cache

Where hot reads are served from.

Queue / Stream

Async work buffer for writes/fan-out.

Database

Primary durable store for entities.

State Storage

Where per-subject state lives (e.g. rate-limit counters).

Worker / Dispatcher Pool

The async worker tier that drains the queue and does the slow work (delivery, fan-out, embedding).

Iterate on your design — don't start over.

Each scenario below probes a specific weakness in a typical HLD. Reference components from your diagram by name, describe what breaks and at what load, then name the minimum change that fixes it. Strong answers identify the precise failure mode — not just "scale it up".

1

A chat server holding 100K connections crashes. Walk me through what happens to those 100K users.

Probes: failure mode analysis

Your answer

2

If you deploy chat servers in US, EU, and Asia, how do you handle a message from a US user to an EU user?

Probes: consistency tradeoffs

Your answer
