Load Balancers, Explained


At some point, every growing app hits the same problem: one server isn’t enough. You add more, but then the question becomes - who decides where traffic goes? That’s where load balancers come in. They quietly sit in front of your app, deciding which server gets which request, and making sure users don’t notice when things go sideways.


The Basics

  • What it is - A load balancer distributes incoming traffic across multiple servers.
  • Why it matters - It keeps apps responsive, reliable, and scalable.
  • Where it runs - At the network edge, inside your VPC, or even as part of an app service.

Think of it like a traffic cop at a busy intersection - directing cars so no single lane gets clogged. Or a waiter at a restaurant - spreading orders across chefs so one doesn’t collapse under 50 burger tickets.
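The traffic-cop idea maps directly to the simplest balancing policy, round-robin. Here is a minimal Python sketch (the server names are invented for illustration):

```python
from itertools import cycle

# A hypothetical pool of backend servers.
servers = ["app-1", "app-2", "app-3"]

# Round-robin: hand each incoming request to the next server in the cycle.
pool = cycle(servers)

def route(request_id):
    """Return the server that should handle this request."""
    return next(pool)

# Six requests spread evenly across three servers.
assignments = [route(i) for i in range(6)]
print(assignments)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Real balancers layer health checks and weights on top of this, but the core rotation is exactly this loop.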


Why It Exists

Without load balancers, adding more servers doesn’t guarantee better performance. One server could be slammed while others sit idle. Worse, if a server fails, users would be stuck hitting the dead one. Load balancers solve both problems - spreading the load and rerouting traffic automatically when something breaks.
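That automatic rerouting can be sketched in a few lines: keep a health flag per server and skip the ones marked down. The names and the hard-coded health states below are illustrative, not a real health-check mechanism:

```python
servers = ["app-1", "app-2", "app-3"]
healthy = {"app-1": True, "app-2": False, "app-3": True}  # app-2 has failed

def pick_server(request_id):
    """Round-robin over healthy servers only; fail loudly if none remain."""
    candidates = [s for s in servers if healthy[s]]
    if not candidates:
        raise RuntimeError("no healthy backends")
    return candidates[request_id % len(candidates)]

# app-2 never receives traffic while it is marked unhealthy.
print([pick_server(i) for i in range(4)])  # ['app-1', 'app-3', 'app-1', 'app-3']
```

In a managed load balancer, the `healthy` map is maintained for you by periodic health-check probes.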


Common Pitfalls

A few mistakes I see often:

  • Treating it like magic - if your servers aren’t healthy, a load balancer won’t fix that.
  • Forgetting session state - users bounce around servers, so apps need sticky sessions or external state storage.
  • Ignoring costs - load balancers aren’t free, and high-throughput ones can add up.
  • Overcomplicating - not every app needs multiple layers of load balancing.
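The session-state pitfall deserves a concrete sketch. One common fix is hashing a client identifier so the same client always lands on the same server. This is a simplified illustration; real balancers typically key on the client IP or a cookie:

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]

def sticky_route(client_ip):
    """Map a client to a server deterministically by hashing its IP."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same client always reaches the same server, so in-memory
# session data survives across requests.
assert sticky_route("203.0.113.7") == sticky_route("203.0.113.7")
```

Note the trade-off: adding or removing a server reshuffles most of these mappings, which is why larger systems reach for consistent hashing or external session stores instead.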

Why It Matters

Load balancers show up in every cloud:

  • On AWS, Elastic Load Balancing (ELB) comes in flavors: Application, Network, and Gateway Load Balancers.
  • On Azure, you get Azure Load Balancer and Application Gateway.
  • On DigitalOcean, there’s a managed Load Balancer that auto-integrates with Droplets and Kubernetes clusters.
  • On Oracle, you’ve got Load Balancer as a Service with policies for round-robin, least connections, and IP hash.

They all do the same core thing - make sure traffic flows evenly, reliably, and securely.
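Those policies differ only in the selection rule. Least connections, for instance, just picks the backend with the fewest in-flight requests. A toy sketch, with invented connection counts:

```python
# Current in-flight request counts per backend (illustrative numbers).
connections = {"app-1": 12, "app-2": 4, "app-3": 9}

def least_connections():
    """Pick the backend with the fewest active connections."""
    return min(connections, key=connections.get)

target = least_connections()
connections[target] += 1  # the new request now counts against that backend
print(target)  # app-2
```

Round-robin ignores the counts entirely, and IP hash replaces the `min` with a hash of the client address; the surrounding machinery stays the same.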


The TAM Lens

Load balancing is one of those quiet topics that shows up in every architecture conversation. It’s not about the tool itself, it’s about the outcomes: resilience, performance, and user experience. A startup might not need one until traffic spikes. An enterprise app won’t launch without it. The right design is usually the simplest one that keeps users happy and outages invisible.


How to Stay Sane

  • Start with managed services - they handle the heavy lifting.
  • Keep health checks simple - complex rules create false alarms.
  • Think about state - use sticky sessions or shared stores.
  • Plan for costs - add them to your architecture budget early.
  • Don’t overengineer - sometimes one layer is all you need.


Final Thoughts

Load balancers aren’t flashy, but they’re the reason apps scale gracefully instead of crashing under pressure. They keep traffic flowing, hide failures, and make your app feel smoother than it really is behind the scenes.
