
βš–οΈ Guide: Load Balancing β€” A Systems Design Deep Dive

Load balancing is a critical pattern in distributed systems, ensuring high availability, fault tolerance, and scalable performance by distributing traffic across multiple backend components.

This guide explores:

- What load balancing is and the difference between L4 and L7 balancing
- Common algorithms and session persistence
- Health checks, failover, and multi-tier architectures
- Trade-offs, real-world tools, benchmarking, and Kubernetes

πŸ“¦ 1. What is Load Balancing?

Load balancing is the process of distributing incoming network traffic across multiple backend servers (or nodes) to ensure:

- High availability: no single server failure takes the service down
- Fault tolerance: unhealthy nodes are detected and routed around
- Scalability: capacity grows by adding servers behind the balancer
- Performance: no single node becomes a hotspot

Basic Architecture

Client 1 ──┐
Client 2 ──┼──▢ Load Balancer ──┬──▢ Server 1
Client 3 β”€β”€β”˜                    β”œβ”€β”€β–Ά Server 2
                                └──▢ Server 3

βš™οΈ 2. Types of Load Balancing

🧱 Layer 4: Transport-level (TCP/UDP)

Example tools: HAProxy (TCP mode), AWS NLB

frontend tcp-in
  bind *:443
  mode tcp
  default_backend web-servers

backend web-servers
  mode tcp
  balance roundrobin
  server s1 10.0.0.1:443 check
  server s2 10.0.0.2:443 check

🌐 Layer 7: Application-level (HTTP/HTTPS)

Example tools: NGINX, Envoy, AWS ALB, Traefik

location /api/v1/ {
  # route by URL path, which is only possible at L7
  proxy_pass http://backend-api-v1;
}
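For context, a hypothetical fuller NGINX config pairing that location block with its upstream pool might look like the sketch below; the upstream name comes from the snippet above, but the addresses and ports are made up:

```nginx
# Hypothetical backend pool; replace addresses with your own
upstream backend-api-v1 {
  server 10.0.0.1:8080;
  server 10.0.0.2:8080;
}

server {
  listen 80;

  location /api/v1/ {
    proxy_pass http://backend-api-v1;
    # Forward the original client IP to the backends
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}
```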

πŸ“Š 3. Load Balancing Algorithms

| Algorithm | Description | Use Case |
|---|---|---|
| Round Robin | Evenly distribute requests in order | Stateless services |
| Least Connections | Send to server with fewest active requests | Long-lived or variable load |
| IP Hash | Hash client IP to route consistently | Session affinity |
| Random | Select randomly from pool | Simple but not optimal |
| Weighted | Favor stronger machines | Mixed-capacity nodes |
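To make the first two algorithms concrete, here is a minimal Python sketch. It is illustrative only: real balancers track active connections in the proxy data path, not in application code.

```python
import itertools

class RoundRobin:
    """Cycle through servers in order, ignoring current load."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Pick the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1   # caller must release() when done
        return server

    def release(self, server):
        self.active[server] -= 1

rr = RoundRobin(["s1", "s2", "s3"])
print([rr.pick() for _ in range(4)])   # -> ['s1', 's2', 's3', 's1']

lc = LeastConnections(["s1", "s2"])
first = lc.pick()    # 's1' (tie broken by insertion order)
second = lc.pick()   # 's2' (s1 now has an active connection)
lc.release(first)
```

Note the key difference: round robin is stateless per request, while least-connections needs the balancer to observe connection lifetimes.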

❀️ 4. Session Persistence ("Sticky Sessions")

Some applications (for example, those holding login session state in server memory) need a given user to always reach the same backend server.

Options:

- Cookie-based affinity: the LB sets a cookie that pins the client to one backend
- IP hash: route on a hash of the client address
- Externalizing session state (e.g. to a shared store like Redis), which removes the need for stickiness entirely

# In NGINX with a sticky cookie (note: the "sticky" directive
# is a commercial NGINX Plus feature)
upstream app {
  sticky cookie srv_id expires=1h domain=.yourapp.com path=/;
  server 10.0.0.1;
  server 10.0.0.2;
}
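The IP-hash option can be sketched in a few lines of Python. This is roughly the idea behind NGINX's open-source ip_hash directive, though real implementations differ in hash function and tie-breaking:

```python
import hashlib

def pick_server(client_ip: str, servers: list[str]) -> str:
    """Deterministically map a client IP to one backend server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# The same client always lands on the same server...
assert pick_server("203.0.113.7", servers) == pick_server("203.0.113.7", servers)
# ...but adding or removing a server remaps most clients,
# which is why consistent hashing is often used instead.
```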

πŸ› οΈ 5. Health Checks & Failover

Liveness and readiness checks prevent LBs from routing traffic to unhealthy nodes.

# AWS-style HTTP health check: the LB periodically sends
GET /healthz HTTP/1.1
Host: your-service.com

# An HTTP 200 response marks the target healthy; repeated failures
# remove it from rotation.
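A toy health-check loop, sketched in Python. Real balancers also apply healthy/unhealthy thresholds before flipping a node's state; the check_health callable here is a stand-in for an HTTP probe with a timeout.

```python
def update_pool(servers, check_health):
    """Return the subset of servers that currently pass their probe.

    `servers` is a list of addresses; `check_health(addr)` returns
    True/False -- in practice an HTTP GET /healthz with a timeout.
    """
    healthy = []
    for addr in servers:
        try:
            if check_health(addr):
                healthy.append(addr)
        except Exception:
            # A probe error (timeout, refused connection) counts as unhealthy
            pass
    return healthy

# Example with a fake probe: only .1 and .3 respond
alive = {"10.0.0.1", "10.0.0.3"}
pool = update_pool(["10.0.0.1", "10.0.0.2", "10.0.0.3"], lambda a: a in alive)
print(pool)  # -> ['10.0.0.1', '10.0.0.3']
```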

πŸ” 6. Load Balancers in Multi-tier Systems

Client β†’ Global LB (GeoDNS / Anycast)
        β†’ Regional LB (CDN / Edge)
            β†’ App LB (L7 routing)
                β†’ Internal LB (DB / Cache sharding)

Each layer may use a different technique and tool, e.g.:

| Layer | Tool |
|---|---|
| Global | Cloudflare, Route53, GSLB |
| Regional | Fastly, Akamai, AWS CloudFront |
| L7 App Balancer | Envoy, NGINX, ALB |
| Internal L4 | HAProxy, NLB, gRPC round robin |

🧠 7. Trade-offs & Gotchas

| Concern | Watch Out For |
|---|---|
| Session Affinity | Can break scalability if not managed |
| Health Check Frequency | Too aggressive = false positives |
| DNS TTL | Short TTLs needed for dynamic IPs |
| Cache Invalidation | Some LBs cache incorrectly at L7 |
| Cold Starts | New containers may fail early health checks |
| Coordinated Omission | Don't ignore tail latencies in tests |

πŸ“š 8. Real-World Load Balancers

| Tool | Type | Notes |
|---|---|---|
| NGINX | L7 | Popular reverse proxy & HTTP LB |
| HAProxy | L4/L7 | Fast, widely used in infra |
| Envoy | L7 | Native HTTP/2 & gRPC, dynamic config |
| AWS ALB/NLB | L7/L4 | Fully managed, auto-scaling |
| Traefik | L7 | Modern, cloud-native, dynamic |
| Kubernetes Services | L4/L7 | Built-in kube-proxy & ingress |

πŸ§ͺ 9. Benchmarking Load Balancer Performance

To test throughput, latency, and failure recovery, wrk is a common choice:

# 10 threads, 100 connections, 30-second run; --latency prints
# percentile detail, which helps surface tail latencies
wrk -t10 -c100 -d30s --latency http://localhost:8080

🧰 10. Load Balancing in Kubernetes

In Kubernetes, load balancing happens at several layers:

- Service (ClusterIP): kube-proxy spreads connections across the Service's pod endpoints
- Service (type: LoadBalancer): asks the cloud provider to provision an external L4 balancer
- Ingress: an L7 controller (NGINX, Traefik, etc.) routes HTTP by host and path

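A minimal sketch of a Kubernetes Service that requests an external L4 load balancer from the cloud provider; the name web and label app: web are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer      # provisions a cloud L4 LB (e.g. AWS NLB)
  selector:
    app: web              # pods labeled app=web become backends
  ports:
    - port: 80            # LB-facing port
      targetPort: 8080    # container port
```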
βœ… Summary Table

| Feature | L4 Load Balancer | L7 Load Balancer |
|---|---|---|
| Protocol Awareness | TCP/UDP only | HTTP, HTTPS, gRPC aware |
| Routing Logic | IP/port based | URL, header, cookie |
| Performance | Faster | Slight overhead |
| Flexibility | Lower | Very flexible |
| Use Case | DBs, gRPC, raw APIs | Web apps, APIs, proxies |

πŸ“š Further Reading

