Scalability is a system’s ability to add resources to keep up with growing demand. When more more users begin using an application, infrastructure with great scalability will handle it without interrupting services. An infrastructure with poor scalability will likely cause slowdowns or disruptions.
Vertical & Horizontal Scaling
In practice, scalability can be achieved either through vertical scaling or horizontal scaling. Vertical scaling means adding computing resources, such as increasing network speeds, storage, or RAM. Horizontal scaling means adding more servers (or “nodes”) that each run the application. A tool called a “load balancer” then distributes the work across the many servers.
Vertical scaling is relatively simple and affordable, as it only involves upgrading a machine. However, there is some downtime required to perform the upgrade.
Horizontal scaling has the benefit of not requiring any downtime for existing servers. This benefit is the main reason why this is the scaling option chosen by most DevOps teams. That being said, it is slightly more complex to manage the many servers, and it is more expensive than upgrading existing machines.
The Price of Scaling
If scalability is so important, why not just run an application on a million powerful servers? Surely this would be enough to keep up with plenty of growth? Probably, but companies still need to consider the cost of their infrastructure. Scaling is about finding that sweet spot — enough to perform well but not so much that money is wasted.
This leads to another important goal when it comes to infrastructure — elasticity. Whereas scalability only deals with increases in resources, elasticity is the ability to automatically add or subtract resources to accommodate fluctuating demand. Elasticity is especially important when using pay-per-use infrastructure services since resources can be returned, and money can be saved, when demand shrinks.
Here we have the same bouncing ball application from the previous exercise. Each time a user is added to the application (and a new ball appears), an additional server is used. Play with the slider to simulate adding more users.
How does the experience differ from the last version of our app? Does it offer better performance? And which type of scaling does this application use?
You may notice that the performance of the bouncing ball application scales smoothly with the number of “users” active in the application. Since an additional server is added to handle each new user, the application is able to maintain high performance. This is an example of horizontal scaling.
DevOps uses several techniques to achieve both elasticity and scalability. Automation, cloud-based infrastructure, and microservices are some of these practices. In the next exercise, we’ll explore an important technology that leads to more scalable infrastructure.