Weak on data modelling
Targeted path for engineers whose designs are strong on compute and weak on storage. Covers how to pick the right store and how to partition it correctly.
For: Engineers whose feedback often cites "data model unclear" or "why that database?"
After this path
Pick the right store, the right partition key, and the right indexes for a given prompt — with defensible reasoning.
- Skill 1
Data model design
"We'll put it in Postgres" is not a data model. The data model is entities, keys, relationships, cardinalities, and the access patterns each one has to serve — and it locks in every trade-off you will chase for the rest of the design.
Why this, here: Nail the entity model before you touch storage. Junior candidates skip this; seniors don't.
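As a minimal sketch of what "entities, keys, relationships, and access patterns" can look like before any store is chosen, the Python below models a hypothetical messaging app. Every entity, field, and access pattern is an assumption made for illustration, not part of the path itself.

```python
from dataclasses import dataclass

# Hypothetical entities for a messaging app; every name here is illustrative.
@dataclass(frozen=True)
class User:
    user_id: str          # primary key

@dataclass(frozen=True)
class Conversation:
    conversation_id: str  # primary key

@dataclass(frozen=True)
class Message:
    conversation_id: str  # one conversation has many messages
    message_id: int       # orders messages inside a conversation
    sender_id: str        # one user sends many messages
    body: str

# Each access pattern names the entity it reads and the key it must be served by.
ACCESS_PATTERNS = [
    ("load latest messages in a conversation", "Message", ("conversation_id", "message_id")),
    ("look up a user profile", "User", ("user_id",)),
    ("list everything a user has sent", "Message", ("sender_id",)),
]

def patterns_not_served_by(primary_key):
    """Return Message access patterns whose leading key differs from the primary key's."""
    return [
        name
        for name, entity, key in ACCESS_PATTERNS
        if entity == "Message" and key[0] != primary_key[0]
    ]

unserved = patterns_not_served_by(("conversation_id", "message_id"))
print(unserved)
```

With the key `(conversation_id, message_id)`, the one pattern left over is "list everything a user has sent". That leftover is the trade-off to name out loud: serve it with a secondary index, a second table, or not at all.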
- Skill 2
Storage choice justification
Picking a database is a first-principles decision, not a matter of defaults. "We use Postgres" is a cultural statement; "the access pattern is point lookups at 100k QPS with eventual consistency, so we use DynamoDB" is a design.
Why this, here: Match access pattern to storage shape — not defaults, not "we use Postgres".
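One way to practise the habit is to write the access pattern down as data and derive the store family from it. The sketch below is a rule of thumb under stated assumptions: the `AccessPattern` fields, the four pattern shapes, and the family names are all hypothetical simplifications, and a real decision would also weigh item size, growth, cost, and operations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessPattern:
    # Illustrative fields only; not an exhaustive description of a workload.
    shape: str                      # "point_lookup", "range_scan", "join", or "text_search"
    peak_qps: int
    needs_strong_consistency: bool

# Assumed mapping from query shape to a family of stores (not to a product).
FAMILY_BY_SHAPE = {
    "point_lookup": "key-value store",
    "range_scan": "store with ordered keys",
    "join": "relational store",
    "text_search": "inverted-index search engine",
}

def storage_requirements(p):
    """Turn an access pattern into a one-line requirement statement."""
    family = FAMILY_BY_SHAPE[p.shape]
    consistency = (
        "strongly consistent reads"
        if p.needs_strong_consistency
        else "eventual consistency acceptable"
    )
    return f"{family}; {consistency}; sized for {p.peak_qps} QPS"

example = AccessPattern(shape="point_lookup", peak_qps=100_000, needs_strong_consistency=False)
print(storage_requirements(example))
```

For the point-lookup example this prints `key-value store; eventual consistency acceptable; sized for 100000 QPS`. A product name belongs only at the end of that sentence, after the requirements have been stated.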
- Skill 3
Sharding & partitioning
The partition key is the single most consequential decision in a distributed data design. Pick it wrong and no amount of horsepower recovers you — the hot shard stays hot, the rebalance never finishes, and the team spends a quarter migrating.
Why this, here: Once the store is chosen, the partition key decides whether load spreads evenly or piles onto one hot shard.
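The claim that extra capacity cannot rescue a bad key can be checked with a small simulation. This is a sketch under assumptions: the workload (one hypothetical tenant producing 90% of writes) and the hash-modulo placement are illustrative, and real systems use other placement schemes.

```python
import zlib
from collections import Counter

def hottest_share(keys, num_shards):
    """Fraction of all writes that land on the busiest shard."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    load = Counter(zlib.crc32(k.encode("utf-8")) % num_shards for k in keys)
    return max(load.values()) / len(keys)

# Hypothetical workload: one tenant produces 90% of the writes.
writes = ["tenant-big"] * 9000 + [f"tenant-{i}" for i in range(1000)]

for num_shards in (8, 64, 512):
    print(num_shards, round(hottest_share(writes, num_shards), 3))
```

Every printed share is at least 0.9, whatever the shard count, because all writes for one key value go to one shard. Adding shards only spreads the remaining 10%.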
Checkpoint
Stop and defend: pick a partition key for a messaging app’s messages table and say what breaks if you partition by sender_id versus conversation_id. If both sound fine, the choice isn’t load-bearing yet — re-read hot partitions.
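One way to test your answer to the checkpoint is to count how many shards a single "load this conversation" read has to touch under each key. The sketch assumes hash-modulo placement over 16 shards and a hypothetical group chat with 50 senders; all names are illustrative.

```python
import zlib

NUM_SHARDS = 16  # assumed shard count for the illustration

def shard_for(key):
    # crc32 gives a placement that is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS

# Hypothetical group chat: 50 users each post once in the same conversation.
messages = [
    {"conversation_id": "conv-1", "sender_id": f"user-{i}"}
    for i in range(50)
]

def shards_to_read_conversation(messages, conversation_id, partition_key):
    """Shards that must be queried to load every message in one conversation."""
    return {
        shard_for(m[partition_key])
        for m in messages
        if m["conversation_id"] == conversation_id
    }

by_conversation = shards_to_read_conversation(messages, "conv-1", "conversation_id")
by_sender = shards_to_read_conversation(messages, "conv-1", "sender_id")
print(len(by_conversation), len(by_sender))
```

Partitioning by `conversation_id` serves the read from exactly one shard, while partitioning by `sender_id` scatters it across several. The cost runs the other way for writes: under `conversation_id`, one very busy conversation concentrates all of its traffic on a single shard.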
- Skill 4
Indexing strategies
The index you picked three months ago decides your query latency today — and the one you didn't create decides which queries you can't ship. Indexing is not "add indexes until it's fast"; it's a first-principles match between query shape and index structure.
Why this, here: The indexes decide your read latency. B-tree vs LSM vs inverted is a first-principles choice.
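To make "match between query shape and index structure" concrete, the sketch below builds three toy indexes over the same rows. It is an in-memory illustration under assumptions: the rows are made up, a dict stands in for a hash index, a sorted list stands in for an ordered index such as a B-tree, and a term-to-ids map stands in for an inverted index.

```python
import bisect
from collections import defaultdict

# Hypothetical rows: (row_id, unix_timestamp, text)
rows = [
    (1, 1700000300, "deploy failed on shard 3"),
    (2, 1700000100, "deploy ok"),
    (3, 1700000200, "shard 3 rebalanced"),
]

# Hash-style index: answers exact-match lookups only.
by_id = {row_id: text for row_id, _, text in rows}

# Ordered index: sorted keys make range scans cheap.
by_ts = sorted((ts, row_id) for row_id, ts, _ in rows)

def ids_in_time_range(lo, hi):
    """Row ids with lo <= timestamp <= hi, in timestamp order."""
    start = bisect.bisect_left(by_ts, (lo,))
    end = bisect.bisect_right(by_ts, (hi, float("inf")))
    return [row_id for _, row_id in by_ts[start:end]]

# Inverted index: maps each term to the rows that contain it.
by_term = defaultdict(set)
for row_id, _, text in rows:
    for term in text.lower().split():
        by_term[term].add(row_id)

def ids_containing(term):
    return sorted(by_term.get(term.lower(), set()))

print(by_id[2])                                    # exact match
print(ids_in_time_range(1700000100, 1700000200))   # range scan
print(ids_containing("shard"))                     # term search
```

Each structure answers one query shape well and the others poorly or not at all: the dict cannot do the range scan, and neither the dict nor the sorted list can answer the term search without reading every row.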
- Skill 5
Consistency trade-offs
CAP is not a trivia question. It's the trade-off that every distributed system lives under, and getting it wrong is how you end up with "strong consistency" backed by a single node — or "eventual consistency" on data that absolutely cannot be eventually wrong.
Why this, here: Ties the storage choices back to the business-level consistency requirements.
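One common place the trade-off becomes a concrete number is quorum sizing in a replicated store. The sketch below assumes a simple model, which is an illustration rather than any particular database's behaviour: N replicas, each write acknowledged by W of them, each read consulting R of them. It brute-forces whether every read set is guaranteed to overlap every write set.

```python
from itertools import combinations

def quorums_always_overlap(n, r, w):
    """True if every possible read set of size r shares a replica with every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )

print(quorums_always_overlap(3, 2, 2))  # R + W > N
print(quorums_always_overlap(3, 1, 1))  # R + W <= N
print(quorums_always_overlap(1, 1, 1))  # a single node
```

In this model the overlap holds exactly when R + W > N. The single-node case passes trivially, which is the degenerate "strong consistency" the skill warns about: consistent, but with nothing left to fail over to.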