SD/TL: Load Balancer/Team Lead
Both a load balancer and a team lead treat what they oversee as a system for producing efficient results; for the team lead, that system also requires consideration of every individual worker’s well-being and growth
This is a post in my “system design applied to tech leadership” series.
In this series I compare elements of tech leadership with elements of system design.
*
Tech Team Leader:
Whether you’re directly managing a team or leading one as a team lead, one of your responsibilities is capacity management at the team level. This means intelligently distributing work among the team members to:
minimize idle time
prevent burnout
satisfy current workloads
avoid single points of failure
avoid favoritism
develop the team to address future demand
Load Balancer:
In system design, a load balancer serves a similar function, distributing work across a variety of servers.
Some duties a load balancer shares with a tech team leader include:
minimize idle time
satisfy current workload
avoid single points of failure
avoid favoritism
in a way, this all results in preventing burnout
It can even develop the servers to address future demand, if you roll changes out to a subset of the servers before rolling them out to the whole fleet.
In the case of the load balancer, the most common example involves requests being distributed to web and application servers, but the concept holds across other elements of the system, as well.
In the case of distributed system load balancers, here is a summary of algorithms for distributing load:
Static Algorithms:
Round Robin:
Requests are distributed to each server in a circular fashion, starting at the first server and sequentially moving through servers until finally looping back to the start.
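Here is a minimal round-robin sketch in Python, assuming a hypothetical pool of three app servers:

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical server pool
next_server = cycle(servers)           # circular iterator over the pool

def route(request: str) -> str:
    """Assign each request to the next server in circular order."""
    return next(next_server)

# Six requests land on app-1, app-2, app-3, app-1, app-2, app-3.
for i in range(6):
    print(route(f"req-{i}"))
```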
Weighted Round Robin:
Similar to round robin, but servers are assigned weights. This means that load can be distributed unequally, based on individual server capacity.
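One way weighted round robin might look, again with made-up server names and weights; a server with weight 3 receives three picks per cycle:

```python
from itertools import cycle

# Hypothetical weights: app-1 has roughly 3x the capacity of app-2.
weights = {"app-1": 3, "app-2": 1}

def expand(weights: dict[str, int]) -> list[str]:
    """Repeat each server in proportion to its weight."""
    return [s for s, w in weights.items() for _ in range(w)]

next_server = cycle(expand(weights))
# One full cycle: app-1, app-1, app-1, app-2.
```

Production balancers such as NGINX use a “smooth” weighted round robin that interleaves picks rather than sending a run of consecutive requests to the heaviest server; this naive version just repeats each server weight times per cycle.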
Hash-based algorithms (e.g., Source IP Hash, URL Hash):
These algorithms use a hash function to map incoming requests to specific servers. As a result, requests from the same client are consistently routed to the same server, useful for session management and caching.
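A sketch of source IP hashing, using Python’s hashlib and an invented three-server pool; because the hash is stable, a given client keeps landing on the same server:

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical server pool

def route(client_ip: str) -> str:
    """Map a client IP to a server index via a stable hash."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

assert route("203.0.113.7") == route("203.0.113.7")  # sticky routing
```

One caveat with this naive modulo approach: changing the pool size remaps most clients, which is why production systems often reach for consistent hashing instead.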
Dynamic Algorithms:
Least Connections:
The load balancer directs traffic to the server with the fewest active connections, on the assumption that connection count is proportional to load.
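A least-connections sketch, with invented connection counts standing in for the balancer’s live bookkeeping:

```python
# Hypothetical live connection counts the balancer tracks per server.
active = {"app-1": 12, "app-2": 4, "app-3": 9}

def route() -> str:
    """Pick the server with the fewest active connections."""
    server = min(active, key=active.get)
    active[server] += 1  # the routed request opens a new connection
    return server

print(route())  # app-2, since 4 is the lowest count
```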
Weighted Least Connections:
Similar to least connections, but servers with higher weights get to handle more connections.
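The weighted variant simply divides each count by the server’s weight before comparing, as in this sketch (the weights and counts are made up):

```python
weights = {"app-1": 3, "app-2": 1}  # hypothetical capacity weights
active = {"app-1": 9, "app-2": 4}   # hypothetical connection counts

def route() -> str:
    """Pick the server with the lowest connections-to-weight ratio."""
    server = min(active, key=lambda s: active[s] / weights[s])
    active[server] += 1
    return server

print(route())  # app-1: 9/3 = 3.0 beats app-2's 4/1 = 4.0
```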
Response Time:
The load balancer sends traffic to the server with the lowest response time [latency], making this a highly adaptive and reactive method.
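One plausible sketch: track a moving average of latency per server and route to the lowest. The smoothing factor and latency numbers here are invented:

```python
# Hypothetical moving-average latency per server, in milliseconds.
latency_ms = {"app-1": 38.0, "app-2": 21.5, "app-3": 64.2}

def route() -> str:
    """Send the request to the server with the lowest observed latency."""
    return min(latency_ms, key=latency_ms.get)

def record(server: str, observed_ms: float, alpha: float = 0.2) -> None:
    """Fold a new measurement into the exponential moving average."""
    latency_ms[server] = (1 - alpha) * latency_ms[server] + alpha * observed_ms
```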
Resource-based algorithms:
These algorithms monitor server resources like CPU and memory usage to make intelligent routing decisions.
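A resource-based sketch, assuming a health-check agent reports CPU and memory utilization per server (the metrics here are invented):

```python
# Hypothetical utilization metrics reported by each server's agent.
metrics = {
    "app-1": {"cpu": 0.82, "mem": 0.55},
    "app-2": {"cpu": 0.35, "mem": 0.60},
}

def route() -> str:
    """Pick the server whose busiest resource has the most headroom."""
    return min(metrics, key=lambda s: max(metrics[s].values()))

print(route())  # app-2: its worst resource (mem, 0.60) beats app-1's cpu at 0.82
```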
Compared to these common load balancer algorithms, an experienced team lead will distribute work in a much more nuanced and complicated manner:
Develop resources. Work is assigned to various folks on the team to develop their strengths in a particular way. This helps eliminate single points of failure, and can keep people inspired by growing their skillsets.
High priority = speed. Some high-priority work takes a speed route and is assigned to the person most likely to complete it fastest. This can be accompanied by redistributed work. It is also a collaboration smell [like a code smell] and points to a potential team balancing issue to address in future sprints.
Round robin. When possible, work is evenly distributed among team members.
Redistributed work. Work is reassigned from person to person based on changing priorities. This is typically a last resort.
Pairing. This assigns a piece of work to one person, with another person or people assigned to help.
Weighted round robin. Work is distributed as evenly as possible among team members, acknowledging some folks have more capacity during the given sprint.
Deliberate idle. Some folks on the team might downshift to less demanding tasks, reducing their capacity, after a period of strenuous effort. [Imagine vertically scaling a server down along with its load after its fans are overworked and it is running hot.]
Blue/green. Some folks test out a new workflow or process for delivering value, scaling it up across the team only if efficiencies are proven.
Chunked cross-pollination. Certain very large tasks that would otherwise go to an expert are deliberately broken into pieces and sent to less experienced folks to complete, so they gain confidence in an area that presents as a single point of failure. This frees the expert up to learn new skills, survey new landscapes, and get relief from burnout.
Just as servers can be scaled up [or down] behind a load balancer, team members can be scaled up [or down] on a particular team or effort by reallocating people to or from other teams.
If a team scales large enough, it can have sub-team leads just like a load balancer can balance load among other load balancers in a hierarchical balancing system.

