Load-balancing strategies

A load-balancing strategy is a way to spread the load (e.g. HTTP requests) over a fleet of instances providing some functionality. If you’re like me, you probably think of a load-balancer (e.g. Nginx, AWS ELB, …) sitting in front of several nodes (e.g. EC2 instances, Docker containers, …). This is indeed a very common pattern, but there are alternatives with different pros and cons.

Server-side load-balancing

Let’s start with the most common approach: server-side load-balancing (a.k.a. proxy load-balancing). In this configuration there is a load-balancer in front of a fleet of servers, and clients connect to the servers by going through the load-balancer.

The load-balancer acts as a proxy and isolates the clients from the backend nodes.

Pros

  • Simple client configuration: clients only need to know the load-balancer address
  • Clients can be untrusted: all traffic goes through the load-balancer, where it can be inspected
  • Clients are not aware of the backend servers

Cons

  • Single point of failure: if the load-balancer fails, clients can no longer reach the servers
  • The load-balancer might become a bottleneck as it concentrates all the traffic
  • Scaling is limited by the load-balancer’s capabilities
  • Increased latency because of the extra hop through the proxy

Use-cases

This is the typical architecture for web applications (Nginx, AWS load-balancers) as it isolates the clients from the backend infrastructure.

There are mainly two types of proxy load-balancers:

  • Network load-balancer
  • Application load-balancer

Network load-balancers terminate the connection from the client and open a new connection to the chosen backend server. TCP packets are then copied from the client connection over to the backend connection (and back).
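As a sketch, this byte-copying behaviour can be reduced to a few lines of Python (the backend addresses and the round-robin choice are assumptions for illustration; a real network load-balancer also handles health checks, timeouts, cleanup and far higher connection counts):

    import itertools
    import socket
    import threading

    # Hypothetical backend fleet; a real load-balancer would discover these.
    BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080)]
    backend_cycle = itertools.cycle(BACKENDS)

    def pipe(src, dst):
        # Copy bytes from one connection to the other until EOF.
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)

    def handle(client):
        # Pick the next backend (round-robin here) and proxy both directions.
        backend = socket.create_connection(next(backend_cycle))
        threading.Thread(target=pipe, args=(backend, client), daemon=True).start()
        pipe(client, backend)

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("0.0.0.0", 8000))
    listener.listen()
    while True:
        conn, _ = listener.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()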

Application load-balancers also terminate the connection from the client, but they can inspect the data inside the TCP packets to choose the appropriate backend server. This enables richer functionality such as sticky sessions (all connections from the same session are handled by the same backend server), but it also makes them slower than their network counterparts.
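For sticky sessions, a minimal sketch is to hash a session cookie so that the same session always lands on the same backend (the cookie handling and backend list are assumptions; a real application load-balancer parses the full HTTP request):

    import hashlib

    BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

    def choose_backend(headers):
        # Hashing the session cookie pins a session to one backend.
        # Requests without a cookie all hash to the same backend here;
        # a real load-balancer would fall back to round-robin instead.
        session = headers.get("Cookie", "")
        digest = hashlib.md5(session.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(BACKENDS)
        return BACKENDS[index]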

Client-side load-balancing

If there is server-side load-balancing, there should be client-side load-balancing as well. And indeed there is: in this configuration the load-balancer is simply gone, and the clients themselves choose which server to connect to.

Of course, this implies that the clients somehow need to know about the server instances and implement an algorithm to determine which server to pick from the list. It can be something as simple as a round-robin mechanism, a hashing mechanism, or even having the servers report their actual load to the clients.
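For instance, a minimal client-side round-robin picker could look like this (the server list is an assumption; it could come from static configuration or a deployment tool):

    import itertools

    class ClientSideBalancer:
        # Round-robin over a statically configured server list.
        def __init__(self, servers):
            self._cycle = itertools.cycle(servers)

        def pick(self):
            # Each request goes to the next server in the list.
            return next(self._cycle)

    balancer = ClientSideBalancer(["server-1:8080", "server-2:8080"])
    print(balancer.pick())  # server-1:8080
    print(balancer.pick())  # server-2:8080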

In this case there is a direct connection between clients and server instances, and therefore the clients need to be trusted (just forget about this approach if your clients are spread all over the internet).

Obviously the clients are more complex, as they need to implement the load-balancing strategy themselves. However, this can be facilitated by a framework or library (e.g. akka-cluster).

As the load-balancer is gone, you gain better latency and improved scalability (just add more servers), but you pay the price of additional client complexity.

Pros

  • Improved latency: direct connection from client to backend server (no proxy extra hops)
  • Improved scalability: scalability depends only on the number of servers (no bottleneck)

Cons

  • Complexity has to be handled by the clients
  • Complex maintenance (library updates, …) as the implementation is language-specific
  • Clients must be trusted

Use cases

The Cassandra client can be considered a form of client-side load-balancing. The client learns about the nodes in the cluster by contacting a seed node, and selects the most appropriate node to send a request to according to the partition key.

Here it’s the distribution of the partition keys that spreads the load fairly. However, if one key is used more than the others, the load won’t be shared equally (hence the importance of choosing the partition key wisely).
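A toy version of this key-based routing looks as follows (the node list is hypothetical; Cassandra’s real token ring and replication are more involved):

    import hashlib

    NODES = ["node-1", "node-2", "node-3"]

    def node_for(partition_key):
        # The hash of the partition key decides which node owns the data:
        # well-distributed keys spread the load evenly, while a hot key
        # concentrates it on a single node.
        token = int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")
        return NODES[token % len(NODES)]

    print(node_for("user-42"))  # always routes to the same node for this key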

Unlike with proxy load-balancing, A/B testing (redirecting only a portion of the traffic to specific instances) is more difficult to implement.

External load-balancing

This is somewhat of a hybrid approach. In this configuration, the cluster state (including server loads, …) is maintained by an external load-balancer (e.g. Zookeeper), and clients ask this load-balancer which servers they should connect to.

If the load-balancer returns a list of servers, the clients can implement a simple load-balancing strategy (round-robin, random, …). The results from the load-balancer might also be cached, allowing the clients to connect to the servers even when the load-balancer is not available.
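As a sketch using the kazoo Zookeeper client (the znode path /services/my-service and the host:port registration format are assumptions; error handling is kept minimal):

    import random
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="zk-1:2181,zk-2:2181")
    zk.start()

    cached_servers = []

    def pick_server():
        global cached_servers
        try:
            # Each child znode is assumed to be a "host:port" entry
            # registered by a server instance.
            cached_servers = zk.get_children("/services/my-service")
        except Exception:
            # Zookeeper unavailable: fall back to the cached list so the
            # clients can still connect to the servers directly.
            pass
        return random.choice(cached_servers)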

This approach moves some complexity from the clients to the external load-balancer while preserving direct connection to the servers.

Pros

  • Improved latency: direct connection from client to backend server (no proxy extra hops)
  • Improved scalability: scalability depends only on the number of servers (no bottleneck)
  • Reduced complexity on the clients

Cons

  • Some complexity still has to be handled by the clients
  • Complex maintenance (library updates, …) as the implementation is language-specific
  • Clients must be trusted

Use cases

This approach makes sense for micro-service communication where you control both the clients and the servers. It is also the recommended approach for gRPC.

This approach is also known as service discovery, and if you’re on AWS you can probably implement it using the ELB or ECS APIs.
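For example, a client could periodically ask AWS which targets of a target group are healthy and then balance over them itself (a hedged sketch using boto3; the target group ARN is a placeholder):

    import boto3

    elbv2 = boto3.client("elbv2")

    def healthy_targets(target_group_arn):
        # Ask AWS which targets are currently healthy; the client can then
        # load-balance over them directly (round-robin, random, ...).
        response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
        return [
            (d["Target"]["Id"], d["Target"]["Port"])
            for d in response["TargetHealthDescriptions"]
            if d["TargetHealth"]["State"] == "healthy"
        ]

    # The ARN below is a placeholder, not a real target group.
    print(healthy_targets("arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/my-tg/abc123"))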