Somewhere in the IT multiverse, a perfect balance has been achieved between demand for IT services and installed system capacity.
Unfortunately, that isn’t our world.
IT systems swing between periods of idle capacity and overload, as the ebb and flow of demand is shaped by various internal and external factors. For example, peak periods such as Black Friday and Cyber Monday can put significant strain on the computing resources required to support global e-commerce shoppers looking for the best deals.
Cloudflare statistics for 2023 showed a 27% increase in traffic through its network on these days compared with the previous year. As enterprises implement their digital transformation strategies and develop new products, the growth in transactions and data requires IT resources that can handle the increase without impacting performance or user experience.
(Figure: Daily HTTP requests for Cloudflare, 2023)
Gartner defines scalability as:
“The measure of a system’s ability to increase or decrease in performance and cost in response to changes in application and system processing demands.”
In the technology space, scalability is one of the main selling points of migrating to the cloud versus maintaining on-premises data centers. An organization that acquires cloud services is given a promise of accessible resources that:
Can be ordered and provisioned over a short time period to address growing information processing needs.
Can also be released when the organization does not require them.
This flexibility means that enterprises do not have to worry about tying up hard-earned capital in IT infrastructure and systems that may not match fluctuating demand.
The terms scalability and elasticity are often used interchangeably. But are they really the same thing?
One of the five essential characteristics of the cloud computing model defined by NIST is rapid elasticity: capabilities can be elastically provisioned and released to scale rapidly outward and inward, commensurate with demand. The general agreement is this:
Scalability is viewed from a load handling perspective.
Elasticity describes the speed of the system's response to changes in demand.
Indeed, the AWS glossary defines scaling as the outward or inward change in the size, configuration, or makeup of a logical group of compute instances.
There are two main approaches to scaling in a cloud computing environment: vertical scaling and horizontal scaling.
Vertical scaling (scaling up) involves upgrading the resources of existing virtual machines to cater for increased demand. Components that can be upgraded include:
CPU
Memory
Storage
Network throughput
For example, virtual machines and other compute resources can be resized to accommodate performance requirements, as the sketch below illustrates.
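Here is a minimal Python sketch of a vertical-scaling decision. The machine-type catalog, the 80% threshold, and the function names are illustrative assumptions rather than any provider's actual API; the point is simply that scaling up means moving the same workload onto a larger instance size.

```python
# Minimal sketch of a vertical-scaling decision: suggest the next-larger
# machine type when sustained CPU utilization crosses a threshold.
# The catalog, threshold, and names below are hypothetical, not a real
# cloud provider's API.
from typing import Optional

MACHINE_TYPES = [
    {"name": "small",  "vcpus": 2,  "memory_gb": 8},
    {"name": "medium", "vcpus": 4,  "memory_gb": 16},
    {"name": "large",  "vcpus": 8,  "memory_gb": 32},
    {"name": "xlarge", "vcpus": 16, "memory_gb": 64},
]

def next_machine_type(current_name: str) -> Optional[dict]:
    """Return the next-larger machine type, or None if already at the top."""
    names = [m["name"] for m in MACHINE_TYPES]
    idx = names.index(current_name)
    return MACHINE_TYPES[idx + 1] if idx + 1 < len(MACHINE_TYPES) else None

def decide_vertical_scale(current_name: str, avg_cpu: float) -> Optional[dict]:
    """Suggest an upgrade when average CPU utilization exceeds 80%."""
    return next_machine_type(current_name) if avg_cpu > 0.80 else None

target = decide_vertical_scale("medium", avg_cpu=0.92)
if target:
    # In practice, resizing usually means stopping the VM, changing its
    # machine type, and restarting it, which is why vertical scaling can
    # involve brief downtime.
    print(f"Resize to {target['name']} "
          f"({target['vcpus']} vCPUs, {target['memory_gb']} GB RAM)")
```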
Horizontal scaling (scaling out) involves increasing the number of computing instances in a logical pool (i.e., replication) in response to increased demand. Examples of horizontal scaling include the following (a toy sketch of the idea follows the list):
Load balancers, which distribute traffic across multiple instances.
Kubernetes, which orchestrates containers.
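The toy Python sketch below illustrates the same idea in miniature: a pool of identical replicas behind a simple round-robin distributor, where scaling out just means adding another replica. The InstancePool class and instance names are hypothetical; in practice this distribution is handled by a load balancer or an orchestrator such as Kubernetes.

```python
# Toy model of horizontal scaling: identical replicas behind a round-robin
# distributor. Scaling out adds a replica; nothing existing goes offline.
from itertools import cycle

class InstancePool:
    def __init__(self, instances):
        self.instances = list(instances)
        self._rr = cycle(self.instances)

    def scale_out(self, new_instance: str) -> None:
        """Add a replica and rebuild the round-robin rotation."""
        self.instances.append(new_instance)
        self._rr = cycle(self.instances)

    def route(self) -> str:
        """Send the next request to the next instance in the rotation."""
        return next(self._rr)

pool = InstancePool(["web-1", "web-2"])
for i in range(4):
    print(f"request {i} -> {pool.route()}")

pool.scale_out("web-3")   # demand grows: replicate rather than resize
for i in range(4, 10):
    print(f"request {i} -> {pool.route()}")
```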
The decision on which approach to take is driven mainly by the application architecture: applications that can be easily distributed across multiple servers (such as stateless microservices) are better suited to horizontal scaling. Other parameters include:
Traffic demand
Cost considerations
Resource efficiency
Performance requirements
From an uptime perspective, we can say this:
Horizontal scaling is more suitable as it does not require taking an existing server offline for upgrades.
In contrast, where resource intensity is key, vertical scaling becomes the preferable approach.
(Learn all about load balancing for microservices.)
Combining the two approaches results in a third, hybrid model: diagonal scaling. This starts as vertical scaling; once the machine's resources are capped, horizontal scaling kicks in.
This approach is deemed good for organizations that face unpredictable demand and therefore need to respond in an agile, flexible way without restriction. However, it is costlier and operationally more complex than the previously mentioned approaches.
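As a rough illustration, the Python sketch below applies the diagonal rule: keep scaling a node up until it reaches the largest available size, then start adding nodes. The sizes and the 80% utilization threshold are assumptions made for the example, not a prescribed policy.

```python
# Minimal sketch of diagonal scaling: vertical first, horizontal once capped.
SIZES = ["small", "medium", "large"]   # "large" is the vertical cap

def diagonal_scale(nodes: list, avg_utilization: float) -> list:
    """Return the new node layout after one scaling decision."""
    if avg_utilization <= 0.80:                 # comfortable: do nothing
        return nodes
    current = nodes[0]
    if current != SIZES[-1]:                    # headroom left: scale up
        bigger = SIZES[SIZES.index(current) + 1]
        return [bigger] * len(nodes)
    return nodes + [SIZES[-1]]                  # capped: scale out

layout = ["small"]
for load in [0.9, 0.9, 0.9, 0.9]:               # sustained high demand
    layout = diagonal_scale(layout, load)
    print(layout)
# ['medium'] -> ['large'] -> ['large', 'large'] -> ['large', 'large', 'large']
```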
Automated scaling is usually the preferred approach for horizontal scaling, because adding instances does not disrupt services running on existing ones.
Autoscaling adds virtual machines to a group of instances, and removes them, based on traffic and other configured parameters. For example, on Google Cloud, the autoscaling parameters that come into play include the following (a simplified decision rule is sketched after the list):
CPU utilization: The percentage load that the CPU is handling over a time period.
Throughput: The limit of requests per second that can be handled effectively.
Latency: How long a request waits in a queue before being processed.
Instance count: The minimum and maximum number of instances in a logical group.
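A common way to reason about these parameters is a proportional rule: scale the instance count by the ratio of observed to target utilization, then clamp the result to the configured minimum and maximum. The Python sketch below shows such a rule; it is a simplified approximation for illustration, not the exact algorithm used by Google Cloud or any other provider.

```python
# Simplified, illustrative autoscaling rule: desired instances grow in
# proportion to how far observed CPU utilization sits above the target,
# clamped to the configured instance-count bounds.
import math

def desired_instance_count(current: int,
                           observed_cpu: float,   # e.g. 0.85 = 85%
                           target_cpu: float,     # e.g. 0.60 = 60%
                           min_instances: int,
                           max_instances: int) -> int:
    desired = math.ceil(current * (observed_cpu / target_cpu))
    return max(min_instances, min(desired, max_instances))

# 4 instances running at 85% CPU against a 60% target -> scale out to 6,
# subject to the 2..10 instance bounds.
print(desired_instance_count(4, 0.85, 0.60, min_instances=2, max_instances=10))
```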
For databases, there are two main approaches to scaling:
Replication involves creating copies (replicas) of the original (primary) database, with data synchronized from the primary to all replicas.
Partitioning/sharding involves two parts: dividing the database into multiple parts, and distributing data across them based on an agreed strategy. This approach introduces more complexity and overhead in managing data that is spread across a cluster.
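For the sharding half, the minimal Python sketch below routes each record to a shard by hashing its key. The shard names and the simple modulo routing are illustrative assumptions; production systems often use consistent hashing or range-based partitioning so that adding a shard does not force most keys to move.

```python
# Minimal sketch of hash-based sharding: a stable hash of each key picks
# one of N shards. Shard names and routing rule are illustrative only.
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(key: str) -> str:
    """Route a key to a shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

for customer_id in ["cust-1001", "cust-1002", "cust-1003"]:
    print(customer_id, "->", shard_for(customer_id))
```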
(Related reading: distributed systems and distributed tracing.)
The main benefit of scalability is assurance: you can assure your business of the reliability of the IT services you deliver, to internal stakeholders as well as end users, customers, and prospects.
By planning the right capacity for demand and performance requirements, and responding smoothly to changes in traffic, the quality of IT services offered to the organization remains in line with expectations, leading to improved customer satisfaction.
Whenever incidents occur, scalability supports high availability as instances are spun up quickly with similar configurations to handle the service requirements. This is a form of self-healing: new instances are created that are not affected by any disruption affecting existing instances.
Other benefits include:
Cost effectiveness. The organization does not need to tie up capital in infrastructure that sits unused. Scaling ensures that demand is met with just the right amount of capacity, which can be quickly reduced when demand dissipates.
Disaster recovery. Where horizontal scaling is spread across geographically distributed zones, the probability of downtime totally crippling an IT service is reduced.
Even where scaling is automated, do not assume that configuring scaling is a one-time, set-and-forget activity. IT and system administrators must constantly monitor and analyze traffic trends and end-to-end application performance metrics in order to select the optimal scaling metrics for their systems.
The right metrics depend on the situation, which is precisely why you need to continually review and optimize your scaling configurations.
Investing in observability tools is a wise option. By aggregating metrics and logs, alongside additional data, these tools can predict potential bottlenecks or failures that can impact application performance and therefore require optimization of scaling parameters.
Some organizations have chosen to outsource the scaling headache to cloud service providers by adopting serverless computing. Applications built on serverless infrastructure have the benefit of automatic scaling, since the backend is fully managed to handle whatever traffic is generated from user transactions.
But beware: serverless alone is not a magic bullet for addressing scaling challenges, as the wrong application design could lead to certain functionality not scaling in tandem, causing bottlenecks. So admins must regularly monitor application performance against the set limits and initiate optimization when required.