Today you are going to learn a little bit about scaling your applications.
Most people overcomplicate scaling way too early.
They start thinking about Kubernetes, microservices, autoscaling groups, queues, read replicas, and multi-region deployments before they even have consistent users.
But for most apps, the first two stages are much simpler.
Stage 1: one VM.
The simplest real production setup is usually one VM running your entire app.
You rent a machine, maybe something running Ubuntu with 4GB of RAM, and on that same box you run Nginx, your application, Postgres, and your cron jobs.

Traffic comes in from the internet, Nginx routes it to your app, your app talks to Postgres over localhost, and your background jobs run on a schedule from the same machine.
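That routing step is just a reverse proxy. A minimal sketch of what the Nginx config might look like, assuming the app listens on localhost port 8000 (the port and domain here are hypothetical, not from the setup above):

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical domain

    location / {
        # Forward incoming requests to the app process on the same box
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```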
This is not bad architecture. For an early app, it is often the right architecture: it is cheap, easy to reason about, easy to deploy, and there are very few moving pieces.
The downside is that everything now depends on that one machine.

If the VM goes down, your whole app goes down. If your app starts using too much CPU, it can hurt your database. If your database starts eating memory, it can hurt your app. If a cron job goes crazy, it can hurt everything.
So stage one is great when you are still validating the product, your traffic is low, and simplicity matters more than resilience.
At this point, your goal is not to build the perfect architecture. Your goal is to get something working in front of users without creating a system that is harder to operate than the product is worth.
Stage 2: separate the app and the database, add monitoring and analytics.
The next step is NOT rewriting everything as microservices.
The next step is usually much more boring: put your app server and your database on separate machines.

Now one VM runs your web server, app code, and cron jobs, while another machine runs Postgres.
That one change gives you a lot of breathing room.
Your app and database no longer fight for the same CPU and memory. You can scale the app server separately from the database. You can tune the database machine for database workloads. You can back up, monitor, and secure the database more seriously.
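The move itself can be a one-line config change if the app never hardcodes where the database lives. A minimal sketch, assuming a `DATABASE_URL` environment variable (a common convention, not a requirement) and a hypothetical `db_host` helper:

```python
import os

# Read the database location from the environment so that moving Postgres
# to its own machine is a config change, not a code change.
DATABASE_URL = os.environ.get(
    "DATABASE_URL",
    "postgresql://app:secret@localhost:5432/app",  # stage 1: same box
)

def db_host(url: str) -> str:
    """Extract the host portion of a postgres:// style connection URL."""
    return url.split("@", 1)[1].split(":", 1)[0]
```

In stage 2 you would set `DATABASE_URL=postgresql://app:secret@10.0.0.5:5432/app` (or whatever the database machine's address is) and deploy; nothing else in the code changes.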
This is also the stage where basic observability starts to matter a lot more.
You want structured JSON logs so that when something breaks, you can search what happened instead of guessing.
You want metrics for things like CPU, memory, p95 latency, and error rates, plus alerts that page you when important thresholds are crossed.
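p95 latency just means the value that 95% of requests come in at or under. A tiny sketch of computing it from a window of latency samples (nearest-rank method; the 500ms threshold is an arbitrary example, not a recommendation):

```python
import math

def p95(samples):
    """95th percentile via nearest-rank: the value at ceil(0.95 * n)."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-indexed rank
    return ordered[rank - 1]

# Alerting is then just a threshold check over a recent window:
latencies_ms = [120, 80, 950, 200, 110]
if p95(latencies_ms) > 500:  # example threshold
    print("page the on-call")
```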

You also want product analytics for events, funnels, and retention, because it is not enough to know that your app is technically running. You also need to know whether people are actually using the thing you built.
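To make "funnels" concrete: a funnel counts how many users reach each step of a flow, in order. A toy sketch, where the event shape and step names are made up for illustration:

```python
from collections import defaultdict

def funnel(events, steps):
    """Count distinct users reaching each step of `steps`, in order.

    `events` is an ordered list of (user_id, event_name) pairs.
    """
    reached = defaultdict(set)  # step name -> set of user ids
    progress = {}               # user id -> index of next expected step
    for user, name in events:
        i = progress.get(user, 0)
        if i < len(steps) and name == steps[i]:
            reached[name].add(user)
            progress[user] = i + 1
    return [len(reached[s]) for s in steps]
```

If two users sign up but only one creates a project, `funnel(events, ["signup", "create_project"])` tells you exactly where the other one dropped off.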
This is when your app starts feeling like a real product because you can actually see what is happening.
So when do you scale?
You scale when you have evidence that the current setup is becoming the bottleneck.
Maybe CPU is consistently high, memory is getting squeezed, the database is slowing down, p95 latency is getting worse, or error rates are increasing.
Maybe your users are growing, your cron jobs are interfering with normal traffic, or your database now needs better backups, isolation, and security.
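That "evidence" can be as simple as a threshold check over the metrics you already collect. A sketch; every number here is a placeholder to tune against your own baseline, not a recommendation:

```python
# Hypothetical thresholds, expressed against sustained (not momentary) values.
THRESHOLDS = {
    "cpu_pct": 80,
    "mem_pct": 85,
    "p95_ms": 500,
    "error_rate_pct": 1,
}

def scaling_signals(metrics):
    """Return the names of metrics whose current value exceeds its threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]
```

One breached threshold for five minutes is noise; several breached for days is the system telling you to move to the next stage.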
That is when you move from stage one to stage two. Start simple, add visibility, and scale when the system tells you to.
Best, Arjay
P.S.
If you’re looking for a way to practice system design daily, check out The Daily Dev on iOS and the Web.
This week is all about queues! One of my personal favorites for system design. You can work through all the queue questions during your free trial this week. Check it out here.
