Modern applications are constantly changing: they evolve with new requirements and run in environments with varying demands on resources. Scaling an application matches its size to resource demand, keeping customers happy and reducing infrastructure costs. If you don’t know how to scale efficiently, you are not just doing a disservice to your application, you are putting unnecessary stress on your operations team. Manually determining when to scale up or out is extremely difficult. If you buy more infrastructure to accommodate your peak traffic, you could be overspending when load is not at peak. If you target your average load, spikes in traffic will impact your application performance and, when traffic drops, these resources will go unused.

What is Scale up vs Scale out?

Scaling out, or horizontal scaling, contrasts with scaling up, or vertical scaling. The idea of scaling cloud resources may be intuitive: as your cloud workload changes, it may be necessary to increase infrastructure to support increasing load, or it may make sense to decrease infrastructure when demand is low. The “up or out” part is perhaps less intuitive. Scaling out is adding more equivalently functional components in parallel to spread out a load, for example going from two load-balanced web server instances to three. Scaling up, in contrast, is making a component larger or faster to handle a greater load, for example moving your application from a virtual machine (VM) with 2 CPUs to one with 3 CPUs. For completeness, scaling down refers to decreasing your system resources, regardless of whether you were using the up or out approach.

Scale up 

Resources such as CPU, network, and storage are common targets for scaling up. The goal is to increase the resources supporting your application to reach or maintain adequate performance. In a hardware-centric world, this might mean adding a larger hard drive to a computer for increased storage capacity, or replacing the entire computer with a machine that has more CPU and a more performant network interface. If you are managing a non-cloud system, this scaling-up process can take anywhere from weeks to months as you request, purchase, install, and finally deploy the new resources.

In a cloud system, the process should take seconds or minutes. A cloud system might still target hardware, which puts scaling at the tens-of-minutes end of the range. But virtualized systems dominate cloud computing, and some scaling actions, like increasing storage volume capacity or spinning up a new container to scale a microservice, can take seconds to deploy. What is being scaled will not be that different: one may still shift applications to a larger VM, or it may be as simple as allocating more capacity on an attached storage volume.

Regardless of whether you are dealing with virtual or hardware resources, the take-home point is that you are moving from one smaller resource and scaling up to one larger, more performant resource.
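In a containerized environment, scaling up is often just a change to a container's resource requests and limits. A minimal sketch of this idea as a Kubernetes Deployment fragment; the name `web`, the image, and the CPU/memory values are all illustrative, not taken from any particular system:

```yaml
# Fragment of a Kubernetes Deployment spec. Scaling up means editing
# only the resources block: each replacement pod is larger, but the
# number of pods stays the same. Names and values are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  template:
    spec:
      containers:
        - name: web
          image: example/web:1.0
          resources:
            requests:
              cpu: "2"      # was "1" before scaling up
              memory: 4Gi   # was 2Gi
            limits:
              cpu: "2"
              memory: 4Gi
```

Applying the updated manifest (e.g. with `kubectl apply`) rolls out new, larger pods in place of the old ones.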

Scale out

Scaling up makes sense when you have an application that needs to sit on a single machine. If you have an application that has a loosely coupled architecture, it becomes possible to easily scale out by replicating resources. 

Scaling out a microservices application can be as simple as spinning up a new container running a web server app and adding it to the load balancer pool. The idea behind scaling out is that identical services can be added to a system to increase performance. Systems that support this model also tolerate the removal of resources when load decreases, allowing greater fluidity in sizing resources in response to changing conditions.
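Continuing the container example: where scaling up changes the size of each instance, scaling out changes only the instance count. A hedged sketch, again assuming a hypothetical Kubernetes Deployment named `web` behind a load-balancing Service:

```yaml
# Fragment of a Kubernetes Deployment spec. Scaling out means editing
# only the replica count; each new pod is identical and the Service's
# load balancing spreads traffic across all of them. Values are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3   # was 2; adds one more identical instance to the pool
```

The same change can be made imperatively (e.g. `kubectl scale deployment web --replicas=3`), and dropping the count back down scales the service in when load subsides.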

The incremental nature of the scale-out model is a great benefit for cost management. Because components are identical, cost increments should be relatively predictable. Scaling out also provides greater responsiveness to changes in demand: services can typically be added or removed rapidly to best meet resource needs. This flexibility and speed reduce spending because you only use (and pay for) the resources needed at the time.

Opsani can help you.

When it comes to how to scale, there is a continuum of effort:

  • Reactive/Manual. You use metrics to evaluate resource use and manually calculate costs, using a tool like Kubecost. When you see load grow, or receive a notification from your metrics service that a load threshold has been crossed, you reactively make an adjustment. Once you decide the system must be scaled, you implement the change by hand.
  • Manual/Semi-automated. Your metrics system and infrastructure management system are linked. You continue to periodically evaluate costs based on feedback from your metrics system, check on current cloud provider costs, and then update your orchestration system (e.g. Kubernetes) to autoscale when specific load limits are reached.
  • Fully Automated. You evaluate cost and performance goals proactively and automatically tune the system to maintain desired limits using AI-driven software. This can be done using Opsani.
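For the semi-automated rung of this ladder, the document's Kubernetes example would typically be expressed as a HorizontalPodAutoscaler, which adds or removes identical replicas when a load limit is crossed. A minimal sketch; the target Deployment name `web` and the thresholds are illustrative assumptions:

```yaml
# Kubernetes HorizontalPodAutoscaler: scales a Deployment out (or back in)
# to hold average CPU utilization near a target. Names and numbers are
# illustrative, not a recommendation for any specific workload.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% average CPU
```

Note that the load limits still have to be chosen and revisited by a human; fully automated approaches like the one described next aim to remove that step.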

Scaling allows you to meet your customers’ demands for quality service while minimizing the cost of providing that service. Opsani works to ensure your application runs efficiently: you get the performance you need while spending the least amount of money to achieve it. Opsani lets you focus on delivering the core values of your business by automating away the toil – the repetitive and manual tasks – associated with optimizing systems that are constantly changing. Opsani leverages artificial intelligence and machine learning, particularly deep reinforcement learning, to predict traffic spikes and resource requirements, accurately identifying the best moment to scale up or down, and it seamlessly integrates with AWS tooling to automate the scaling process. No toil necessary. To find out how Opsani can reduce your AWS spend, check out AWS does not Equal Cloud Optimization.