Data-intensive systems
Storage choice, partitioning, replication, indexing. Everything it takes to make data-layer decisions that survive review.
For: Engineers prepping for data-heavy prompts (search, analytics, feeds, storage products)
After this path
Name the right store, the right partition key, the right indexes, and the right replication mode — and defend each.
- 1 Skill
Data model design
"We'll put it in Postgres" is not a data model. The data model is entities, keys, relationships, cardinalities, and the access patterns each one has to serve — and it locks in every trade-off you will chase for the rest of the design.
Why this, here: The entity model decides every downstream choice.
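One way to make "access patterns each one has to serve" concrete is to list the patterns next to the keys and check that every pattern has a key whose leading columns match its filters. The sketch below assumes a hypothetical events table; the entity, field, and key names are illustrative, and the "leading columns" rule is a simplification of how real query planners pick indexes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPattern:
    name: str
    filter_fields: tuple[str, ...]  # equality filters the query applies


@dataclass(frozen=True)
class Key:
    name: str
    fields: tuple[str, ...]  # ordered key columns


def is_served(pattern: AccessPattern, key: Key) -> bool:
    """True when the pattern's filters are exactly the key's leading columns."""
    n = len(pattern.filter_fields)
    return n <= len(key.fields) and set(key.fields[:n]) == set(pattern.filter_fields)


def unserved(patterns, keys):
    """Names of access patterns that no key can serve without a scan."""
    return [p.name for p in patterns if not any(is_served(p, k) for k in keys)]


# Hypothetical model: all names below are assumptions for illustration.
KEYS = [
    Key("events_pk", ("tenant_id", "event_id")),
    Key("events_by_user", ("tenant_id", "user_id", "created_at")),
]
PATTERNS = [
    AccessPattern("get event by id", ("tenant_id", "event_id")),
    AccessPattern("list a user's events", ("tenant_id", "user_id")),
    AccessPattern("find events by type", ("event_type",)),
]

GAPS = unserved(PATTERNS, KEYS)
print(GAPS)
```

Any pattern left in `GAPS` is a query the model cannot serve cheaply yet, which is the kind of gap worth finding before the store is chosen.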
- 2 Skill
Storage choice justification
Picking a database is a first-principles decision, not a default. "We use Postgres" is a cultural statement; "the access pattern is point-lookup at 100k QPS with eventual consistency, so we use DynamoDB" is a design.
Why this, here: SQL vs KV vs wide-column vs search — the framework for picking.
- 3 Skill
Sharding & partitioning
The partition key is the single most consequential decision in a distributed data design. Pick it wrong and no amount of hardware will save you: the hot shard stays hot, the rebalance never finishes, and the team spends a quarter migrating.
Why this, here: The single most consequential call in distributed data.
Checkpoint
Defend a partition key for a multi-tenant SaaS’s events table. Now describe the hot partition that eventually appears and how you’d reshard without downtime. If the reshard story is blank, the call wasn’t load-bearing yet.
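The hot partition in the checkpoint can be sketched in a few lines. The example below assumes hash partitioning on `tenant_id` across 8 partitions and one tenant that produces 90% of events; it then adds a salt derived from the event ID so that tenant's writes spread over several sub-keys. The partition count, tenant names, and salting scheme are illustrative assumptions, not a recommendation for any particular store.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # assumed cluster size for the sketch


def stable_hash(s: str) -> int:
    """Deterministic hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")


def partition_for(tenant_id: str, event_id: str, salt_buckets: int = 1) -> int:
    """Hash-partition on tenant_id plus a salt derived from the event ID.

    salt_buckets=1 means plain tenant_id partitioning (no salting).
    """
    salt = stable_hash(event_id) % salt_buckets
    return stable_hash(f"{tenant_id}#{salt}") % NUM_PARTITIONS


# Synthetic workload: one tenant produces 9,000 of 10,000 events.
events = [("big-tenant", f"e{i}") for i in range(9000)]
events += [(f"t{i}", f"x{i}") for i in range(1000)]


def max_share(salt_buckets: int) -> float:
    """Fraction of all events that land on the busiest partition."""
    counts = Counter(partition_for(t, e, salt_buckets) for t, e in events)
    return max(counts.values()) / len(events)


print("unsalted:", max_share(1))
print("salted x16:", max_share(16))
```

The trade-off is on the read side: a query for one tenant now has to fan out across every salt bucket, so salting fits write-heavy tenants better than read-heavy ones.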
- 4 Skill
Indexing strategies
The index you picked three months ago decides your query latency today — and the one you didn't create decides which queries you can't ship. Indexing is not "add indexes until it's fast"; it's a first-principles match between query shape and index structure.
Why this, here: B-tree vs LSM vs inverted. First-principles choice.
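A small sketch of what "match between query shape and index structure" means in practice: a hash index answers point lookups, while an ordered index (a sorted list here, standing in for a B-tree's ordering) also answers range queries with two binary searches. The rows and key values are made up for illustration.

```python
import bisect

# Toy table of (key, value) rows; contents are illustrative.
rows = [(17, "a"), (3, "b"), (42, "c"), (8, "d"), (23, "e")]

# Hash index: O(1) point lookups, but no useful ordering for ranges.
hash_index = {k: v for k, v in rows}

# Ordered index: keys kept sorted, as a B-tree would keep them.
sorted_keys = sorted(k for k, _ in rows)


def range_scan(lo: int, hi: int) -> list[int]:
    """Keys in the closed interval [lo, hi], found with two binary searches."""
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]


print(hash_index[42])     # point lookup
print(range_scan(5, 25))  # range query; the hash index would need a full scan
```

If the query shape is "key between X and Y", only the ordered structure helps; if it is "key equals X" at high volume, the hash index is the cheaper fit.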
- 5 Skill
Replication & durability
Replication is how you survive a node death; durability, backed by backups and point-in-time recovery, is how you survive a bad deploy, because replicas faithfully copy a bad write. Candidates confuse the two and end up with a design that's highly available but cheerfully corrupt.
Why this, here: Quorum, sync vs async, RPO / RTO.
Checkpoint
For a payments store: sync replication to N replicas or async to 1? State the RPO you accept and the failure that forces the trade-off. Staff-plus candidates name the number.
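"Naming the number" usually comes down to two pieces of arithmetic. With N replicas, W write acknowledgements and R read acknowledgements, a read is guaranteed to overlap the latest acknowledged write when R + W > N. With async replication, the worst-case loss on failover is roughly replication lag times write rate. The sketch below uses illustrative values (3 replicas, 0.5 s lag, 2,000 writes/s) that are assumptions, not figures from this path.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    return r + w > n


def tolerated_write_failures(n: int, w: int) -> int:
    """Replicas that can be down while writes still reach W acknowledgements."""
    return n - w


def worst_case_lost_writes(lag_seconds: float, writes_per_second: float) -> float:
    """Rough RPO in writes for async replication: whatever the replica had not received yet."""
    return lag_seconds * writes_per_second


print(quorums_overlap(3, 2, 2))           # majority reads and writes overlap
print(quorums_overlap(3, 1, 1))           # fast, but a read can miss the latest write
print(tolerated_write_failures(3, 2))     # one replica can be down
print(worst_case_lost_writes(0.5, 2000))  # writes at risk on failover
```

For a payments store, the last number is the one to defend: if losing that many acknowledged writes is unacceptable, the design needs synchronous acknowledgement from at least one other replica and has to pay the latency for it.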
- 6 Skill
CDC & eventing
Change-data-capture is how you keep read models fresh without dual writes. The DB's log is already your most reliable event stream — use it.
Why this, here: Keep derived stores in sync without dual writes.
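A minimal sketch of the consumer side, assuming the database's log has already been turned into an ordered list of changes, each tagged with a monotonically increasing sequence number. The tuple layout and the `upsert` / `delete` operation names are hypothetical. Tracking the last applied sequence number makes replay idempotent, which matters because log consumers are typically at-least-once.

```python
def apply_changes(read_model: dict, changes: list, last_applied: int) -> int:
    """Apply ordered (seq, op, key, value) changes to a read model.

    Changes at or below last_applied are skipped, so replaying the same
    batch after a crash leaves the read model unchanged.
    Returns the new last-applied sequence number.
    """
    for seq, op, key, value in changes:
        if seq <= last_applied:
            continue  # already applied on a previous run
        if op == "delete":
            read_model.pop(key, None)
        else:  # "upsert"
            read_model[key] = value
        last_applied = seq
    return last_applied


# Hypothetical change log, in commit order.
log = [
    (1, "upsert", "u1", {"name": "Ada"}),
    (2, "upsert", "u2", {"name": "Lin"}),
    (3, "delete", "u1", None),
]

model: dict = {}
position = apply_changes(model, log, last_applied=0)
replayed = apply_changes(model, log, last_applied=position)  # safe to replay
print(model, position, replayed)
```

The application writes only to the primary store; the read model is derived entirely from the log, which is what removes the dual write.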
- 7 Pattern
Search over content
Inverted index + ranking service. The hard part isn't indexing — it's relevance and keeping the index fresh.
Why this, here: The canonical derived-view pattern.
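The two halves of the pattern can be sketched together: an inverted index mapping each term to the documents that contain it, and a ranking function over the postings. The scoring below is a simple TF-IDF variant chosen for brevity, and the documents are made up; production relevance involves far more than this, as the description says.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; contents are illustrative.
DOCS = {
    1: "sharding keeps the hot partition small",
    2: "replication keeps data durable",
    3: "the partition key decides the hot shard",
}


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def build_index(docs: dict) -> dict:
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index: dict, n_docs: int, query: str) -> list[int]:
    """Doc IDs ranked by summed tf * idf over the query terms."""
    scores = Counter()
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))  # rarer terms weigh more
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]


INDEX = build_index(DOCS)
print(search(INDEX, len(DOCS), "hot shard"))
```

Document 3 ranks first for "hot shard" because it matches both terms and "shard" is the rarer one. Keeping `INDEX` fresh as `DOCS` changes is the part this sketch leaves out.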