5 Common Misconceptions about AWS Auto Scaling

You can’t talk about cloud optimization without mentioning Amazon Web Services’ (AWS) Auto Scaling. It is one of AWS’s most powerful features, and making the most of AWS auto scaling groups is essential for businesses that run apps and infrastructure on AWS. But when it comes to AWS Auto Scaling, many people labor under misconceptions. Here are five big ones:

1. Auto Scaling is Simple

With IaaS (Infrastructure-as-a-Service) platforms, auto scaling appears straightforward compared with scaling up physical infrastructure in a data center. However, spin up an instance in AWS and you’ll immediately find that AWS Auto Scaling is not a default feature of the public cloud.

Building a resilient IT architecture demands a substantial time investment upfront, particularly if that architecture is to be automated, self-healing, and capable of replacing failed instances and scaling out with little to no human participation.

Setting up a load-balanced group across multiple Availability Zones (AZs) is a direct and uncomplicated process. However, you will need custom templates and scripts to automatically spin up instances in the right configurations with minimal stand-up times. Developers need weeks or even months to get those scripts and templates right. If you are just getting started with AWS, you also have to factor in the time engineers need to learn and use AWS tools effectively.

Once your system is up and running, ongoing maintenance of the templates and scripts is a time-intensive exercise. Even an experienced systems engineer can take a full month to become fluent with the JSON templates used by AWS CloudFormation, and many small engineering teams can’t afford to spend a whole month on this.
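To give a sense of the template work involved, here is a minimal sketch of a CloudFormation-style auto scaling group definition, built as a Python dict so its structure can be inspected without an AWS account. The resource names, AMI ID, and Availability Zones are placeholders, not values from any real deployment:

```python
import json

# A minimal sketch of the CloudFormation JSON an engineer has to learn:
# an auto scaling group spanning two AZs, tied to a launch configuration.
# All identifiers below are placeholders.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppLaunchConfig": {
            "Type": "AWS::AutoScaling::LaunchConfiguration",
            "Properties": {
                "ImageId": "ami-PLACEHOLDER",   # hypothetical AMI ID
                "InstanceType": "t3.micro",
            },
        },
        "AppGroup": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {
                # Ref links the group to the launch configuration above
                "LaunchConfigurationName": {"Ref": "AppLaunchConfig"},
                "MinSize": "2",
                "MaxSize": "6",
                "AvailabilityZones": ["us-east-1a", "us-east-1b"],
            },
        },
    },
}

# Serialize to the JSON form CloudFormation actually consumes
print(json.dumps(template, indent=2))
```

Real templates quickly grow to hundreds of lines once security groups, IAM roles, user data, and scaling policies are added, which is where the month of learning goes.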

This lack of time is why many teams fail to achieve true AWS auto scaling. Instead, they settle for a combination of manual configuration and elastic load balancing. Engineers still allocate resources, internal or external, to creating template-driven environments, which cuts buildout time considerably. But by not taking a fully programmatic approach, they carry the continued burden of managing the manual aspects of keeping the system running smoothly.

2. Elastic scaling is more widely used than fixed-size auto scaling

Using AWS Auto Scaling does not automatically imply load-based scaling. In fact, one can argue that auto scaling’s most frequent uses relate to high availability and redundancy rather than elastic scaling techniques.

Auto scaling groups are often created purely for resiliency: instances are placed into a fixed-size auto scaling group, and when an instance fails, it is replaced automatically. The simplest use case is an auto scaling group with a min size of 1 and a max size of 1.
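The fixed-size resiliency setup can be sketched as the parameters you would pass when creating the group. The group and template names here are illustrative, and the actual boto3 call is left commented out so the sketch runs without AWS credentials:

```python
# Sketch of the simplest resiliency use case: a fixed-size auto scaling
# group with min = max = desired = 1, so a failed instance is replaced
# automatically. Names below are hypothetical.
asg_params = {
    "AutoScalingGroupName": "single-instance-ha",              # illustrative name
    "LaunchTemplate": {"LaunchTemplateName": "app-template"},  # assumed to exist
    "MinSize": 1,
    "MaxSize": 1,
    "DesiredCapacity": 1,
    "AvailabilityZones": ["us-east-1a"],
}

# With credentials configured, the group would be created like this:
# import boto3
# boto3.client("autoscaling").create_auto_scaling_group(**asg_params)

# The invariant that makes this "resiliency, not elasticity":
assert asg_params["MinSize"] == asg_params["MaxSize"] == 1
```

Because min and max are equal, the group never scales with load; its only job is to keep exactly one healthy instance alive.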

In addition, users are not limited to a CPU load threshold as the trigger for scaling a cluster. Adding capacity based on the depth of a work queue is crucial in data analytics projects. A group of worker servers in an auto scaling group can watch a queue, process its items, and launch a spot instance once the queue size reaches a threshold figure. As with any spot request, this happens only if the spot instance price is lower than a set maximum. This setup ensures that capacity is added only when it’s “nice to have”.
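The queue-driven, price-capped decision described above can be sketched as a small pure function. The thresholds and prices are illustrative numbers, not values from any real workload:

```python
import math

def desired_workers(queue_depth, msgs_per_worker, max_workers,
                    spot_price, max_spot_price):
    """Sketch of "nice to have" capacity: add spot workers only when
    there is a backlog AND the current spot price is below the maximum
    we are willing to pay. All thresholds here are illustrative."""
    if spot_price > max_spot_price:
        return 0  # capacity is optional; skip it when spot is expensive
    # One worker per `msgs_per_worker` queued messages, capped at max_workers
    return min(max_workers, math.ceil(queue_depth / msgs_per_worker))

# 1,000 queued messages at 250 per worker -> 4 workers while spot is cheap
print(desired_workers(1000, 250, 10, spot_price=0.03, max_spot_price=0.05))  # 4
# Same backlog, but spot price above our cap -> add nothing
print(desired_workers(1000, 250, 10, spot_price=0.08, max_spot_price=0.05))  # 0
```

In practice the queue depth would come from a metric such as SQS’s approximate number of visible messages, but the decision logic is the same.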

3. Capacity and Demand Should Always Match

Many IT and data professionals believe that load-driven auto scaling is ideal in all environments, and always produces effective cloud optimization. But that is simply another misconception about AWS Auto Scaling. 

The truth is that many cloud deployments exhibit a substantial degree of resilience with limited auto scaling or none at all. This is the case for many startups with fewer than 50 instances, where matching capacity to demand as closely as possible can produce unanticipated consequences.

Ideally, any increase in demand is slow and predictable. In practice, huge spikes in usage can transpire over minutes, not days, and auto scaling can’t always keep up with the pace. If this is a regular occurrence, enterprises should reevaluate whether scaling down to two instances during periods of low demand is prudent. Matching demand and capacity by scaling down may save money in the short term, but your infrastructure runs the risk of downtime and unhappy customers if another spike happens.
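One way to act on this advice is to set the group’s minimum size from the spike you expect rather than from average load. The numbers below (requests per second, per-instance throughput, headroom fraction) are purely illustrative:

```python
import math

def min_capacity_floor(expected_spike_rps, rps_per_instance, headroom=0.2):
    """Sketch of choosing an auto scaling minimum that can absorb a
    sudden spike, instead of scaling down to a cost-optimal low and
    hoping scale-out keeps up. All inputs are illustrative assumptions."""
    return math.ceil(expected_spike_rps * (1 + headroom) / rps_per_instance)

# A spike of 900 req/s, instances handling ~200 req/s each, 20% headroom:
print(min_capacity_floor(900, 200))  # 6 instances as the floor, not 2
```

The floor costs more during quiet periods; the trade-off the section describes is exactly this extra spend versus the downtime risk of scaling from two instances.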

4. No Lengthy Configurations for Perfect Base Images

Oftentimes, it is challenging to strike a balance between what is baked into your custom “Golden Master” AMI (Amazon Machine Image) and what a configuration management tool performs at launch on top of a “Vanilla” AMI.

In reality, how an instance should be configured hinges mainly on how quickly it must be spun up, the frequency of auto scaling events, the average lifetime of an instance, and more.

Building from Vanilla AMIs with a configuration management tool offers a distinct advantage: if you have over 100 machines running, you can update packages from a single centralized location and document all configuration modifications.

But during an AWS Auto Scaling event, having a configuration management tool like Puppet (or any other script) download and install 500 MB of packages can be time-consuming. Furthermore, the more tasks the default installation process has to accomplish, the more likely something goes wrong.

As an example, suppose your Puppet script updates OpenSSL to the most recent version on every run. While it is rare, temporary outages can and do happen while connecting to the package repository due to random network issues.

If the initialization process does not fail gracefully, it can set you back thousands of dollars, because the failed instances are terminated and recreated over and over. Within an hour, you could be looking at 30 instances; if these are large production instances, you’ll rack up a significant bill.
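One common way to make a bootstrap step fail gracefully is to retry the flaky operation with backoff before declaring the instance unhealthy. This is a generic sketch, not Puppet-specific: `install_fn` stands in for whatever actually runs the package manager, and the failure simulation is invented for illustration:

```python
import time

def install_with_retry(install_fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Sketch of a bootstrap step that tolerates a transient repository
    outage: retry with exponential backoff instead of letting the
    instance die and be recreated in a loop."""
    for attempt in range(attempts):
        try:
            return install_fn()
        except OSError:  # e.g. network error reaching the package repository
            if attempt == attempts - 1:
                raise  # out of retries; now it is a genuine hard failure
            sleep(base_delay * 2 ** attempt)

# Simulate a repository that fails twice, then succeeds.
calls = {"n": 0}
def flaky_install():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("repository unreachable")
    return "openssl updated"

# sleep is stubbed out so the example runs instantly
print(install_with_retry(flaky_install, sleep=lambda s: None))  # openssl updated
```

With three attempts and backoff, a momentary network blip no longer triggers the terminate-and-recreate loop described above.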

Striking a balance between these two approaches is possible, but it takes time and a lot of testing. The ideal scenario is to create a custom Golden Image by running Puppet on an instance and capturing the result. Whether the deploy process is successful then hinges on whether an instance created from that Golden Image functions the same as one built from the ground up from a Vanilla AMI configured by Puppet.
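That equivalence check can be sketched as a comparison of what each instance ends up with. Here a “manifest” is simplified to a package-to-version mapping (invented for illustration); a real validation would also compare services, config files, and kernel versions:

```python
def images_equivalent(golden_manifest, vanilla_manifest):
    """Sketch of validating a baked Golden Image against a Vanilla AMI
    configured by Puppet: both should yield the same packages at the
    same versions. Manifests are simplified package -> version dicts."""
    missing = set(vanilla_manifest) - set(golden_manifest)
    drifted = {p for p in vanilla_manifest
               if p in golden_manifest and golden_manifest[p] != vanilla_manifest[p]}
    return (not missing and not drifted), {"missing": missing, "drifted": drifted}

# Example: Puppet pulled a newer OpenSSL than the one baked into the image,
# so the two build paths have drifted apart. Versions are made up.
golden = {"openssl": "3.0.13", "nginx": "1.24.0"}
vanilla = {"openssl": "3.0.14", "nginx": "1.24.0"}

ok, diff = images_equivalent(golden, vanilla)
print(ok, diff["drifted"])  # False {'openssl'}
```

Running a check like this on every image build is what turns “a lot of testing” into an automated gate rather than a manual chore.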

5. Auto Scaling is Ideal for Cost Optimization

One reason why enterprises turn to AWS Auto Scaling is to optimize their AWS resources and cut down their cloud costs. The reality is that AWS auto scaling does not equate to cloud cost optimization. That’s because workloads within the AWS cloud infrastructure are routinely oversized: containers and VMs are typically bigger than they need to be. Another problem is VM sprawl, the unfortunately common occurrence of unused instances that are left running but aren’t doing anything useful, like a light bulb left on in an unoccupied room.

Artificial Intelligence (AI) and Machine Learning (ML) to the Rescue

As you can see, there are multiple places where implementing auto scaling does not provide the expected value or cost savings. The complexity results in toil: manual work repeatedly performed by engineers to optimize a system that keeps changing. Even with the dedicated efforts of human engineers, that complexity and continuing system variation mean you will likely only achieve better results for a time, rather than continually and consistently approaching optimal performance and cost savings.

Cost optimization and performance optimization are best achieved by leveraging artificial intelligence and machine learning, particularly deep reinforcement learning, to predict traffic spikes and resource requirements. This is the approach taken by Opsani’s Predictive Auto Scaler.

This tool combines AI and ML with deep reinforcement learning to analyze how an application performs based on the volume of requests it receives. The Predictive Auto Scaler then examines the hidden relationship between the pod and its traffic, with the AI algorithm working quickly to predict the best moment to scale up.

Contact Opsani to learn more about their cloud cost optimization technology and products. You can also sign up for a free trial to experience the Opsani advantage.