Good morning!
Here are the 6 major types of databases, how they work, and when you should actually reach for each one. If you’re more of an interactive learner, check out some quick simulations I built on my blog here.
1. Relational (Postgres, MySQL, SQLite)
These are basically spreadsheets on steroids. Rows, columns, tables, and relationships between them.
They're fast when you query by indexed columns, and slow when you don't. Under heavy load, expect to deal with connection limits and row locking on writes. But with indexing, read replicas, and caching, they scale further than people think.
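To make the indexed vs. non-indexed point concrete, here's a minimal sketch using Python's built-in sqlite3 module. The `users` table, the `idx_users_email` index, and the `plan` helper are made up for illustration. The exact plan wording varies between SQLite versions, so treat the printed text as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users (email, country) VALUES (?, ?)",
    [(f"user{i}@example.com", "US" if i % 2 else "DE") for i in range(1000)],
)

def plan(sql, params=()):
    """Return SQLite's query plan descriptions for a statement (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail text

query = "SELECT id FROM users WHERE email = ?"
args = ("user42@example.com",)

before = plan(query, args)  # no index on email yet, so SQLite scans the table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query, args)   # same query, now answered through the index

print(before)
print(after)
```

Same query, same data. The only thing that changed is whether the database has to read every row to answer it.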
Use when: You have structured data and want query flexibility + ACID. This should be your default in most cases.
2. Key-Value Stores (Redis, DynamoDB)
Basically a giant hash map with a database wrapper. Simple and ridiculously fast.
The biggest downside is you have almost zero query flexibility. If your access pattern isn't "give me the value for this key," you're going to have a bad time.
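Here's a toy sketch of that tradeoff. A plain Python dict stands in for the store, and the `session:` key format and the `put`/`get` helpers are assumptions for illustration, not any real client API.

```python
# A dict stands in for the store. Real systems wrap the same idea in
# networking, persistence, and replication.
store = {}

def put(key, value):
    store[key] = value

def get(key):
    return store.get(key)  # None if the key is missing

put("session:abc123", {"user_id": 42, "plan": "pro"})
put("session:def456", {"user_id": 7, "plan": "free"})

# The access pattern the store is built for: one key in, one value out.
session = get("session:abc123")

# Any other question means visiting every entry.
# Fine for two keys, painful for billions.
pro_users = [v["user_id"] for v in store.values() if v["plan"] == "pro"]
```

The second query is the "reinventing SQL" trap: it works in a demo, then turns into a full scan in production.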
Use when: You need low-latency lookups by a single key at massive scale. PLEASE don't try to reinvent SQL here.
3. Wide Column (Cassandra, Bigtable)
Underrated, in my opinion. Data is stored by row keys and columns, but unlike relational DBs, rows don't need to share the same columns. You can add columns whenever you want.
They're also incredible for write-heavy workloads, because writes are appended sequentially instead of updating data in place. But you need to design carefully around your row key, because that's basically the only efficient way to access data.
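A rough sketch of why the row key matters so much. This toy `WideColumnTable` class is hypothetical and keeps rows sorted by key, which is closer to Bigtable's model (Cassandra hashes a partition key and sorts within each partition). The `sensor-id#timestamp` key layout is an assumption for illustration.

```python
import bisect

class WideColumnTable:
    """Toy model: rows sorted by key, each row holding its own set of columns."""

    def __init__(self):
        self.keys = []  # row keys, kept in sorted order
        self.rows = {}  # row key -> {column name: value}

    def write(self, row_key, **columns):
        if row_key not in self.rows:
            bisect.insort(self.keys, row_key)
            self.rows[row_key] = {}
        self.rows[row_key].update(columns)  # rows can carry different columns

    def scan_prefix(self, prefix):
        """Yield (key, columns) for every row whose key starts with prefix."""
        start = bisect.bisect_left(self.keys, prefix)
        for key in self.keys[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so nothing later can match
            yield key, self.rows[key]

table = WideColumnTable()
table.write("sensor-1#2024-01-01T00:00", temp=21.5)
table.write("sensor-1#2024-01-01T00:05", temp=21.7, humidity=40)
table.write("sensor-2#2024-01-01T00:00", temp=19.0, battery=0.93)

# Cheap: everything for one sensor sits next to each other in key order.
sensor_1 = list(table.scan_prefix("sensor-1#"))
```

"All readings for sensor-1" is a quick range scan. "All readings with humidity above 35" has no key to lean on, so it would have to touch every row.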
Use when: You have massive write volume, predictable query patterns, and append-heavy data.
4. Object Storage (S3, GCS)
“But this isn’t a database.” Yeah, yeah. You still need to know about it and use it, so I’m including it. It's stupidly simple: you put files, you get files, you delete files. No schema, no querying.
Just don't try to scan a bucket with millions of objects looking for something. Get things by their exact key.
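One way to always have the exact key on hand is to build it from data your app already knows. A minimal sketch, with a dict standing in for the bucket. The `users/<id>/avatar.png` layout and all function names here are hypothetical, not any provider's SDK.

```python
bucket = {}  # stand-in for a bucket: object key -> bytes

def avatar_key(user_id):
    """Build the object key from an ID the app already has (hypothetical layout)."""
    return f"users/{user_id}/avatar.png"

def put_object(key, body):
    bucket[key] = body

def get_object(key):
    return bucket[key]

def delete_object(key):
    bucket.pop(key, None)

put_object(avatar_key(42), b"fake image bytes")
avatar = get_object(avatar_key(42))  # direct fetch, no listing or scanning
delete_object(avatar_key(42))
```

If the key can't be derived like this, a common approach is to save the key in a regular database row next to the rest of the record.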
Use when: Your data is a file. Images, videos, backups, logs, model weights, anything where "just store this blob" is the point.
5. Vector Databases (Pinecone, Weaviate, pgvector)
These store embeddings and let you find things that are similar to other things. Super popular right now because of RAG and AI apps.
For example: a user asks ChatGPT for "funny cat videos," that query gets embedded, and the vector DB returns the closest matches, which then get fed back to the LLM as context.
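Here's a minimal sketch of the "find the closest matches" step, using cosine similarity and a brute-force scan. The three-dimensional vectors are hand-written stand-ins for real embeddings, and the `search` helper is made up for illustration. Real vector databases typically use approximate nearest-neighbor indexes rather than comparing against every vector.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written 3-dimensional "embeddings", for illustration only.
documents = {
    "cat falls off couch": [0.9, 0.8, 0.1],
    "kitten chases laser": [0.8, 0.9, 0.2],
    "quarterly tax filing guide": [0.1, 0.0, 0.9],
}

def search(query_vector, k=2):
    """Return the k document titles most similar to the query vector."""
    ranked = sorted(
        documents,
        key=lambda title: cosine_similarity(query_vector, documents[title]),
        reverse=True,
    )
    return ranked[:k]

query = [0.85, 0.85, 0.1]  # pretend this is the embedding of "funny cat videos"
results = search(query)
print(results)
```

In a RAG setup, whatever comes back from `search` is what gets pasted into the prompt as context.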
Use when: Your access pattern is "find similar things." RAG, recommendations, semantic search.
6. Graph Databases (Neo4j, Neptune)
Nodes and edges. Perfect for "friends of friends" style queries (think LinkedIn's connection degrees).
Traversals get expensive fast if you go too deep, since every extra hop multiplies the number of nodes you touch. And most relational databases can handle simple graph-like queries just fine.
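A "friends of friends" query is basically a breadth-first search with a hop limit. Here's a small sketch over a made-up friendship graph. The names and the `within_hops` helper are hypothetical, and a real graph database would run this over its own storage and indexes.

```python
from collections import deque

# Made-up friendship graph: person -> list of direct friends.
friends = {
    "ana": ["ben", "cal"],
    "ben": ["ana", "dee"],
    "cal": ["ana", "dee", "eli"],
    "dee": ["ben", "cal", "fay"],
    "eli": ["cal"],
    "fay": ["dee"],
}

def within_hops(start, max_hops):
    """Return {person: hops from start} for everyone reachable within max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # hop limit reached, don't expand this node
        for friend in friends[person]:
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # the starting person is not their own connection
    return distance

# 2nd-degree connections: reachable in exactly two hops.
second_degree = [p for p, hops in within_hops("ana", 2).items() if hops == 2]
print(second_degree)
```

Each extra hop expands the frontier by roughly the average number of friends per person, which is why deep traversals blow up on real social graphs.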
Use when: Your data is highly connected and multi-hop queries are core to your product. Otherwise, skip it.
P.S.
The Daily Dev is now on the web! If you’re looking for a way to consistently practice system design, this is for you. If you’ve already subscribed on mobile, you can sign in on the web and vice versa.
