Load balancing is the process of distributing network traffic among multiple server resources. The objective of load balancing is to optimize key network operations.
By spreading a workload evenly among the available computing resources, this “balanced load” improves application responsiveness and accommodates unexpected traffic spikes, all without compromising application performance.
Let’s take a deeper look at this important networking function.
A load balancer is a device that automatically distributes network traffic across a cluster of servers. A load balancer could be either an actual hardware device or a software application that runs on other networking hardware.
Load balancing is an important component of fault-tolerant systems. In terms of network operations, load balancing helps to optimize response time, throughput, and resource utilization.
Generally, this cluster of servers is likely housed in the same data center. However, with more companies moving workloads to the cloud, load balancers can also balance traffic across multiple data centers.
(Related reading: distributed systems & distributed tracing.)
Load balancers can operate at different layers of the OSI model, depending on which traffic attributes you want to inspect when making routing decisions.
For example, a load balancer could operate at Layer 4 (the transport layer), routing traffic based on IP addresses and TCP/UDP ports, or at Layer 7 (the application layer), routing traffic based on HTTP paths, headers, or cookies.
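To make the distinction concrete, here is a minimal Python sketch (the pool names and request shapes are hypothetical) contrasting a transport-layer decision, which sees only addresses and ports, with an application-layer decision, which can inspect HTTP content:

```python
# Hypothetical pool names and request shapes, for illustration only.

def l4_route(packet: dict) -> str:
    """Layer 4: the balancer sees only transport-level fields (IPs, ports)."""
    if packet["dst_port"] == 443:
        return "tls-pool"
    return "default-pool"

def l7_route(request: dict) -> str:
    """Layer 7: the balancer can inspect application-level content."""
    if request["path"].startswith("/api/"):
        return "api-pool"
    if request["headers"].get("Accept", "").startswith("image/"):
        return "static-pool"
    return "web-pool"

print(l4_route({"dst_port": 443}))                      # tls-pool
print(l7_route({"path": "/api/users", "headers": {}}))  # api-pool
```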
With the goal of balancing workloads, let’s look at a couple examples:
Distributing workloads. Let’s say you have five servers. Each server hosts an instance of your application.
One scenario is that a single server (instance) handles all the incoming requests while the other four servers remain idle. Here, it’s easy for that one server to become overwhelmed with traffic, so the app takes much longer to respond, often longer than the end user (a person or an API, for example) is willing to wait.
A better scenario, we can quickly see, is to use all five available servers. It’s the load balancing function that distributes the work across all five servers. Now you avoid overburdening any single server.
Ensuring uptime for apps & services. The distribution function of load balancers supports normal operations, and it also helps during events or incidents that require a response.
For instance, if one of your five servers goes down, the load balancer can redistribute its workload across the remaining four servers that are still up.
(Uptime is critical to regular operations; unplanned downtime can cost you significantly.)
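As a rough sketch of how that redistribution works, a balancer can periodically health-check its pool and route only to servers that respond. The addresses below are hypothetical, and a plain TCP connect stands in for the richer checks (HTTP status codes, latency thresholds) a production balancer would use:

```python
import socket

# Hypothetical server pool; addresses are illustrative only.
pool = {
    "app-1": ("10.0.0.1", 8080),
    "app-2": ("10.0.0.2", 8080),
    "app-3": ("10.0.0.3", 8080),
    "app-4": ("10.0.0.4", 8080),
    "app-5": ("10.0.0.5", 8080),
}

def healthy_servers(pool: dict, timeout: float = 1.0) -> dict:
    """Probe each server with a TCP connect and keep only those that answer."""
    alive = {}
    for name, (host, port) in pool.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                alive[name] = (host, port)
        except OSError:
            pass  # server is down; its traffic goes to the remaining servers
    return alive

# New requests are then distributed only across healthy_servers(pool).
```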
With the basic concepts clear, let’s look at some algorithms and strategies you can set for load balancers, determining which data should be sent where.
A variety of algorithms used for load balancing generally fall into two categories:
Static load balancing algorithms and strategies are based on fixed rules and prior knowledge of the system rather than its real-time state. These algorithms are simpler but sub-optimal, and they suit small data centers with predictable network traffic workloads.
These algorithms may not converge to the best routing path for every single traffic request, but they do converge efficiently to an acceptable route and end-point server. (The optimal or best path is one that adheres to all constraints that the load balancing algorithm accounts for.)
In practice, these constraints may have conflicting objectives. Some constraints may be relaxed, while others may be rigid or weighted based on external parameters. For a static load balancing algorithm, these parameters and system states are well defined — prior knowledge of the system is therefore required.
Static load balancing algorithms operate in a fixed range of server attributes and cannot adapt to network traffic and workload changes in real-time.
Common examples of Static Load Balancing algorithms include Round Robin and IP Hash.
Round-robin load balancing is a simple scheme to distribute traffic requests among multiple servers. The load balancer assigns requests to each server in turn, wrapping back to the first server once it reaches the end of the list.
There’s also a Weighted Round Robin scheme, where computing resources are ranked based on the available capacity. Instead of distributing user requests evenly, servers with higher (weighted) capacity are assigned more requests.
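Here is a minimal Python sketch of both variants (server names and weights are hypothetical); weighted round robin is modeled here by simply repeating a server in the rotation in proportion to its weight:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands each request to the next server in the list, wrapping around."""
    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)

class WeightedRoundRobinBalancer:
    """A server with weight N appears N times per rotation, so it receives
    proportionally more requests."""
    def __init__(self, weighted_servers):
        expanded = [name for name, weight in weighted_servers
                    for _ in range(weight)]
        self._rotation = cycle(expanded)

    def pick(self):
        return next(self._rotation)

rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([rr.pick() for _ in range(5)])   # app-1, app-2, app-3, app-1, app-2

wrr = WeightedRoundRobinBalancer([("big-1", 3), ("small-1", 1)])
print([wrr.pick() for _ in range(8)])  # big-1 handles 3 of every 4 requests
```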
The IP Hash scheme produces a fixed-length hash value by converting the source and destination IP addresses. The hash value determines a routing path; multiple address combinations can map to the same output.
This information serves as a compressed mapping between the user request and the destination servers. Here, algorithms such as Weighted Round-Robin can be used to select a route to forward packets.
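A minimal Python sketch of the idea (server names and addresses are hypothetical): the source/destination pair is hashed to a fixed-length value, which is then mapped onto an index into the server pool:

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical pool

def pick_server(src_ip: str, dst_ip: str, pool: list) -> str:
    """Hash the source/destination address pair, then map the fixed-length
    digest onto an index into the server pool."""
    key = f"{src_ip}->{dst_ip}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]

# The same client/destination pair always hashes to the same server,
# which is why IP hashing also provides session persistence.
print(pick_server("203.0.113.7", "198.51.100.10", servers))
print(pick_server("203.0.113.7", "198.51.100.10", servers))  # same server
```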
Now let’s turn to load balancing that is dynamic.
Dynamic load balancing algorithms and strategies account for the capabilities of network nodes and for traffic workloads in real time. The algorithm acquires, in real time, knowledge of factors such as server health, active connection counts, response times, and available compute capacity.
These algorithms are typically complex and resource-intensive, but the tradeoff is that they produce optimal load balancing routes. Large private cloud data centers and public cloud networks typically rely on dynamic load balancing to optimize, distribute and scale resources for highly unpredictable network traffic workloads.
Least Connections and Least Response Time are two common algorithms for dynamic load balancing.
The Least Connections scheme actively monitors the number of connections established between each server and end users. The algorithm effectively has two steps: first, identify the servers with the fewest active connections; second, route the new request to one of those servers.
Another variant of this algorithm ranks servers with a weighted scheme based on server capacity. The number of requests routed to each server then depends on both its weight and its connection count, instead of the number of established connections alone. Both variants are sketched below.
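A minimal Python sketch of both variants, assuming the balancer tracks active connection counts in memory (the counts, names, and weights below are hypothetical):

```python
# Hypothetical live connection counts; a real balancer updates these as
# connections open and close.
active = {"app-1": 12, "app-2": 4, "app-3": 9}

def least_connections(counts: dict) -> str:
    # Step 1: find the lowest active-connection count.
    # Step 2: route the new request to a server with that count.
    return min(counts, key=counts.get)

def weighted_least_connections(counts: dict, weights: dict) -> str:
    # Rank by connections per unit of capacity, so a higher-weighted
    # server is allowed to hold more concurrent connections.
    return min(counts, key=lambda s: counts[s] / weights[s])

target = least_connections(active)
active[target] += 1  # record the newly routed connection
print(target)  # app-2

weights = {"app-1": 4, "app-2": 1, "app-3": 2}
print(weighted_least_connections(active, weights))  # app-1: 12/4 = 3, lowest
```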
The Least Response Time algorithm is an efficient alternative that monitors the real-time response time of all active servers. The response time includes the duration of opening a new connection and receiving a response.
Regardless of the number of connections established or server capacity, some servers may respond faster than others due to factors such as hardware differences, lighter concurrent workloads, or lower network latency.
This algorithm may be combined with resource-based methods, which primarily account for the compute capacity available on each server in real time.
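A minimal Python sketch of the Least Response Time idea, assuming the balancer records a sliding window of observed response times per server (all names and timings are hypothetical):

```python
import statistics

class LeastResponseTimeBalancer:
    """Keeps a sliding window of observed response times per server
    (connection setup plus time to first response) and routes each new
    request to the currently fastest server."""
    def __init__(self, servers, window: int = 20):
        self._samples = {s: [] for s in servers}
        self._window = window

    def record(self, server: str, seconds: float) -> None:
        timings = self._samples[server]
        timings.append(seconds)
        if len(timings) > self._window:
            timings.pop(0)  # keep only the most recent observations

    def pick(self) -> str:
        # Servers without samples score 0.0, so every server gets probed
        # at least once before real averages take over.
        def score(server):
            timings = self._samples[server]
            return statistics.mean(timings) if timings else 0.0
        return min(self._samples, key=score)

lb = LeastResponseTimeBalancer(["app-1", "app-2"])
lb.record("app-1", 0.120)
lb.record("app-2", 0.045)
print(lb.pick())  # app-2, currently the fastest
```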
So, how to choose a load balancing scheme for your data center? Let’s review some of the important considerations for choosing load balancing schemes and algorithms:
A load balancing algorithm must account for a variety of network environment factors when ranking and selecting the node sequence for a route path. Nodes located within the same data center are subject to similar run-time attributes. These parameters can vary unpredictably, however, and cause unexpected latency between nodes in distant, geographically disparate locations.
(Related reading: availability zones for global data centers.)
Network nodes and features (including all decision criteria, such as server metrics, network environment parameters, and user and request profiles) can grow exponentially in a highly scalable network. The compute and storage requirements can also grow exponentially as the network scales.
This is where a simple static load balancing algorithm can outperform a complex dynamic load balancing algorithm. The complexity of a load balancing algorithm is also measured in terms of its implementation and operation.
Cloud networks process millions of concurrent user requests. Therefore, they require a load balancing algorithm that can route optimally (and efficiently) in real-time.
Simple load balancing algorithms with minimal storage requirements and no single point of failure are therefore suitable for complex networks, in both static and dynamic load balancing use cases.
Load balancing is a data-driven problem that is increasingly solved by autonomous, agent-based dynamic load balancing schemes that rely on advanced AI models.
The next wave of load balancing paradigms aims for application awareness within multi-cloud environments.
One key limitation on the effectiveness of advanced AI-based algorithms is the lack of visibility into, and control over, third-party cloud environments. Limited real-time visibility means that users have limited feature-rich information with which to train their AI models. An AI model that is not trained on sufficient high-quality data can be outperformed by a simple alternative, including static load balancing algorithms that rely on fixed metric thresholds.
The tradeoff with these simple alternatives is their lack of flexibility, an essential quality for highly scalable data center environments.