Five Ways to Run Kubernetes on AWS

[ARTICLE]

Five Ways to Run Kubernetes on AWS

If you have decided that Amazon Web Services (AWS) is the place you want to host your Kubernetes deployments, you have two primary options – push the easy button and let AWS create and manage your clusters with Elastic Kubernetes Service (EKS), or roll up your sleeves and sweat the details with self-hosted Kubernetes on EC2. In between these two levels of complexity are a number of install tools that abstract away some of the work of getting a Kubernetes cluster running on AWS. In this article we will look at the most popular AWS-compatible tools: Kubernetes Operations (kOps), kubeadm, and Kubespray.

Below, we'll cover each option for running Kubernetes on AWS in greater detail, offer some insight into prerequisites, and point to resources to help you get up and running:

  • [easy] Creating a Kubernetes cluster with Elastic Kubernetes Service (EKS)
  • [less easy] Creating a Kubernetes cluster on AWS with kOps
  • [less easy, more control] Creating a Kubernetes cluster on AWS with kubeadm
  • [less easy, more control, Ansible-centric] Creating a Kubernetes cluster on AWS with Kubespray
  • [hard, all the control] Manually creating a Kubernetes cluster on AWS with EC2 instances

Creating a Kubernetes Cluster on AWS with Elastic Kubernetes Service (EKS)

This is really the easy button when it comes to the options for running Kubernetes on AWS.  With this option, AWS simplifies cluster setup, creation, patches and upgrades. With EKS you get an HA system with three master nodes for each cluster across three AWS availability zones.

Although EKS is the simplest way to get a Kubernetes cluster up and running on AWS, there are still some prerequisites:

  • An AWS account
  • An IAM role with appropriate permissions to allow Kubernetes to create new AWS resources
  • A VPC and security group for your cluster (one for each cluster is recommended)
  • kubectl installed (you may want the Amazon EKS-vended version)
  • AWS CLI installed

If you have your prerequisites in place, the following resources will guide you through getting your first EKS cluster up and running:
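
For orientation, the core of the process with the AWS CLI is a single create-cluster call. The sketch below uses placeholder values for the IAM role ARN, subnet IDs, and security group from the prerequisites above, and it creates only the control plane – worker nodes are added separately, for example with aws eks create-nodegroup or a managed node group:

# Create the EKS control plane (role, subnets, and security group are placeholders)
aws eks create-cluster \
  --name demo-cluster \
  --role-arn arn:aws:iam::<account-id>:role/<eks-cluster-role> \
  --resources-vpc-config subnetIds=subnet-aaaa,subnet-bbbb,securityGroupIds=sg-cccc

# Once the cluster reports ACTIVE, point kubectl at it and verify connectivity
aws eks update-kubeconfig --name demo-cluster
kubectl get svc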

Creating a Kubernetes Cluster on AWS with kOps

Using Kubernetes Operations (kOps) abstracts away some of the complexity of managing Kubernetes clusters on AWS. It was specifically designed to work with AWS, and integrations with other public cloud providers are also available. In addition to fully automating the installation of your k8s cluster, kOps runs everything in Auto Scaling groups and can support HA deployments. It also has the capability to generate a Terraform manifest, which can be kept in version control or used to have Terraform actually create the cluster.

If you wish to use kOps, there are a number of prerequisites before creating and managing your first cluster:

  • Have kubectl installed
  • Install kOps on a 64-bit (AMD64 and Intel 64) device architecture
  • Set up your AWS prerequisites
  • Set up DNS for the cluster, e.g., on Route 53 (or, for a quick trial, create a simpler gossip-based cluster)

Once you’ve checked off the prerequisites above, you are ready to follow the instructions in one of the resources below:
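
As a rough sketch of what the kOps flow looks like, the commands below stand up a small gossip-based cluster (the state-store bucket, cluster name, zone, and instance size are placeholders; for a production cluster you would use a Route 53-backed name instead of the .k8s.local suffix):

# kOps keeps cluster state in an S3 bucket you have already created
export KOPS_STATE_STORE=s3://<your-kops-state-bucket>

# Generate the cluster configuration (add --target=terraform --out=. here if you
# would rather have kOps emit a Terraform manifest instead of creating resources itself)
kops create cluster --name=demo.k8s.local --zones=us-east-1a --node-count=2 --node-size=t3.medium

# Actually create the AWS resources, then wait for the cluster to become healthy
kops update cluster --name=demo.k8s.local --yes
kops validate cluster --wait 10m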

Creating a Kubernetes Cluster on AWS with kubeadm

Kubeadm is a tool that is part of the official Kubernetes project.  While kubeadm is powerful enough to use with a production system, it is also an easy way to try getting a K8s cluster up and running. It is specifically designed to install Kubernetes on existing machines. Even though it will get your cluster up and running, you will likely still want to integrate provisioning tools like Terraform or Ansible to finish building out your infrastructure.

Prerequisites

  • kubeadm installed
  • one or more EC2 machines running a deb/rpm-compatible Linux OS (e.g., Ubuntu or CentOS), with 2 GB+ of RAM per machine and at least 2 CPUs on the control-plane (master) node
  • full network connectivity (public or private) among all machines in the cluster. 

The following resources will help you get started with building a K8s cluster with kubeadm:
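
To give a flavor of the process first, a minimal sketch looks like the following. It runs on the EC2 instance you have chosen as the control-plane node, assumes a container runtime plus the kubeadm, kubelet, and kubectl packages are already installed, and the pod CIDR shown is only an example that must match the CNI add-on you pick:

# Initialize the control plane
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Make kubectl work for your (non-root) user
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install a CNI network add-on, then join each worker using the command that
# kubeadm init printed (the token and hash below are placeholders)
sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>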

Creating a Kubernetes Cluster on AWS with Kubespray

Kubespray is another installer tool, one that leverages Ansible playbooks to configure and manage the Kubernetes environment.  One benefit of Kubespray is its ability to support multi-cloud deployments, so if you are looking to run your cluster across multiple providers or on bare metal, this may be of interest.  Kubespray actually builds on some kubeadm functionality and may be worth adding to your toolkit if you are already using kubeadm.

Prerequisites:

  • uncomment the cloud_provider option in group_vars/all.yml and set it to ‘aws’
  • IAM roles and policies for both “kubernetes-master” and “kubernetes-node”
  • tag the resources in your VPC appropriately for the aws provider
  • VPC has both DNS Hostnames support and Private DNS enabled
  • hostnames in your inventory file must be identical to internal hostnames in AWS.

The following resources will help you get your Kubernetes cluster up and running on AWS with Kubespray:
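
At its core, a Kubespray run is an Ansible playbook invocation against your inventory. A minimal sketch, assuming you have cloned the kubernetes-sigs/kubespray repository, installed its Python requirements, and applied the AWS-specific settings above:

# Start from the sample inventory and customize it for your EC2 hosts
cp -rfp inventory/sample inventory/mycluster
# (edit inventory/mycluster/hosts.yaml and group_vars, including cloud_provider: aws)

# Run the cluster playbook against the inventory
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml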

Manually Creating a Kubernetes Cluster on EC2 (aka, Kubernetes the Hard Way)

If EKS is the “easy button,” installing on EC2 instances is the opposite. If you need full flexibility and control over your Kubernetes deployment, this may be for you. If you’ve spent any time with Kubernetes, you’ve almost certainly heard of “Kubernetes the Hard Way.” While KTHW originally targeted Google Cloud Platform, AWS instructions are included in the AWS and Kubernetes section.  Running through the instructions provides a detailed, step-by-step process for manually setting up a cluster on EC2 servers that you have provisioned. The title, by the way, is not a misnomer, and if you do run through this manual process, you will reap the reward of a deep understanding of how Kubernetes internals work.

If you are actually planning to use your Kubernetes on EC2 system in production, you will likely still want some level of automation, and a functional approach is to use Terraform with Ansible. While Terraform is much more than a K8s install tool, it allows you to manage your infrastructure as code by scripting tasks and managing them in version control.  There is a Kubernetes-specific Terraform module that helps facilitate this.  Ansible complements Terraform's infrastructure management prowess with software management functionality for scripting Kubernetes resource management tasks via the Kubernetes API server.
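
At a high level, that workflow is just a Terraform run followed by an Ansible run. The sketch below only shows the shape of it; the plan file, inventory, and playbook names are placeholders rather than a specific published module:

# Provision the EC2 instances, VPC, and related infrastructure as code
terraform init
terraform plan -out=k8s.tfplan
terraform apply k8s.tfplan

# Then configure Kubernetes on the newly provisioned instances
ansible-playbook -i inventory/aws_ec2.yml site.yml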

The following resources will help you get started with creating a self-managed Kubernetes cluster on EC2 instances:

Conclusion

In this article, we considered five common ways to get a Kubernetes cluster running on Amazon Web Services.  Which one you choose will really depend on how much control you need over the infrastructure you are running the cluster on and what your use case is.  If you are just trying out Kubernetes or standing up a dev environment to experiment in, a quick and repeatable solution is likely preferable. In a production system, you'll want tools that simplify administrative tasks like rolling upgrades without needing to tear down the entire system.

The tools we covered are the most popular solutions for deploying on AWS. You may have noticed that there is a degree of overlap and integration among several of the approaches, so using kOps with Terraform to then install on self-hosted EC2 instances is a possibility. Kubernetes is known for being a challenge to manage manually, and the tools we covered are under active development to simplify that process. More tools are constantly being created to address specific use cases. For example, Kubicorn is an unofficial, golang-centric K8s infrastructure management solution.  While not all of the tools listed are AWS specific, you can explore the CNCF installer list from the CNCF K8s Conformance Working Group to get a better sense of the diversity of options available.


Instrumenting Kubernetes with Envoy for Application Performance Metrics

[ARTICLE]

Instrumenting Kubernetes with Envoy for Application Performance Metrics

Opsani COaaS (Continuous Optimization as a Service) optimizes runtime settings such as CPU, memory, and autoscaling, as well as in-application settings such as Java garbage collection time and database commit times.  Opsani performs this optimization by learning from application performance metrics (APM).

Envoy (https://www.envoyproxy.io/) is a self-contained layer 7 proxy process that is designed to run alongside an application server.  One of its proxy functions is to provide performance metrics.  In Kubernetes, Envoy allows you to instrument applications to obtain performance metrics without changing application code or disrupting your application in production.

While there are a variety of methods and tools for gathering application performance metrics, in this step-by-step guide, we'll walk through instrumenting your Kubernetes application for performance metrics with Envoy.  For this exercise, we'll assume you have access to a fresh Kubernetes cluster (e.g., AWS EKS), and for simplicity we'll be working in the default Kubernetes namespace.  Note: Opsani has packaged Envoy with configurations to support Opsani application optimization.  The source code is publicly available and documented on GitHub at https://github.com/opsani/envoy-proxy.

Deploy an Application to Monitor and Optimize

While Opsani can optimize applications across a variety of operating systems, clouds, programming languages, and continuous deployment platforms, we'll use a very simple Kubernetes example.  Any server application will generally suffice, but for learning about Opsani application optimization, it's helpful to be able to control the resources that matter, including CPU, memory, and response times.

fiber-http (https://github.com/opsani/fiber-http) is an open source Opsani tool that lets you do just that.  fiber-http is a webserver with endpoints to control CPU and memory consumption as well as server response times.  With these controls we can simulate a loaded server in a simple, controlled manner.

For this exercise, since fiber-http is already in DockerHub, let’s create a minimal Kubernetes yaml manifest file to stand up a Kubernetes Deployment of a fiber-http container and a load balancer service for ingress traffic.  Note: If you are not 100% comfortable with editing yaml files, we suggest using an editor that will help with lining up columns of text.  VSCode is a good editor for that (https://code.visualstudio.com/).

fiber-http-deployment.yaml:


apiVersion: apps/v1
kind: Deployment
metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
spec:
 replicas: 1
 selector:
   matchLabels:
     app.kubernetes.io/name: fiber-http
 template:
   metadata:
     labels:
       app.kubernetes.io/name: fiber-http
   spec:
     containers:
     - name: fiber-http
       image: opsani/fiber-http:latest
       env:
       - name: HTTP_PORT
         value: "0.0.0.0:8480"
       ports:
       - containerPort: 8480
       resources:
         limits:
           cpu: "1"
           memory: "1Gi"
         requests:
           cpu: "1"
           memory: "1Gi"
---

apiVersion: v1
kind: Service

metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
 #annotations:
 #  service.beta.kubernetes.io/aws-load-balancer-internal: "true"

spec:
 type: LoadBalancer
 #externalTrafficPolicy: Cluster
 #sessionAffinity: None
 selector:
   app.kubernetes.io/name: fiber-http
 ports:
 - name: http
   protocol: TCP
   port: 80
   targetPort: 8480

Run this manifest in Kubernetes via:

% kubectl apply -f fiber-http-deployment.yaml

This results in a pod with a single fiber-http container, with inbound traffic brought in by the LoadBalancer Service.

fiber-http (without Envoy) inbound traffic flow:

You can start HTTP communications through the service to the pod via a web browser, but for testing and automation purposes, let’s use the curl command line tool.

First, obtain the address of the service.

% kubectl get service
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
fiber-http   LoadBalancer   10.100.125.89   a0564378f112548d5b11cbc806d5f34e-1268639300.us-west-2.elb.amazonaws.com   80:31961/TCP   25h
kubernetes   ClusterIP      10.100.0.1      <none>                                                                    443/TCP        6d

Use curl to start an HTTP connection to the application.

% curl a0564378f112548d5b11cbc806d5f34e-1268639300.us-west-2.elb.amazonaws.com

move along, nothing to see here% 

Refer to the fiber-http GitHub repository for instructions on how to communicate with fiber-http to control CPU load, memory consumption, and HTTP response times.

Instrumenting a Kubernetes Deployment with Envoy

Now it’s time to re-deploy the application with metrics instrumentation, towards the goal of autonomous optimization! 

We’ll insert Envoy as a proxy in front of the fiber-http application pod.

We’ll need to insert Envoy between the Service and the fiber-http application container.  Let’s copy the original yaml into a new file so that it’s easy to compare “before” and “after” configurations.

fiber-http-envoy-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
spec:
 replicas: 1
 selector:
   matchLabels:
     app.kubernetes.io/name: fiber-http
 template:
   metadata:
     labels:
       app.kubernetes.io/name: fiber-http
       # *** ADD FOR OPSANI ***
       # Attach a label for identifying Pods that have been augmented with
       # an Opsani Envoy sidecar.
       sidecar.opsani.com/type: "envoy"
     annotations:
       # *** ADD FOR OPSANI ***
       # These annotations are scraped by the Prometheus sidecar
       # running alongside the servo Pod. The port must match the
       # `METRICS_PORT` defined in the Envoy container definition
       # below. The metrics are provided by the Envoy administration
       # module. It should not be necessary to change the path or port
       # unless the proxied service happens to have a namespace collision.
       # Any divergence from the defaults will require corresponding
       # changes to the container ports, service definition, and/or the
       # Envoy proxy configuration file.
       prometheus.opsani.com/scrape: "true"
       prometheus.opsani.com/scheme: http
       prometheus.opsani.com/path: /stats/prometheus
       prometheus.opsani.com/port: "9901"
   spec:
     containers:
     - name: fiber-http
       image: opsani/fiber-http:latest
       env:
       - name: HTTP_PORT
         value: "0.0.0.0:8480"
       ports:
       - containerPort: 8480
       resources:
         limits:
           cpu: "1"
           memory: "1Gi"
         requests:
           cpu: "1"
           memory: "1Gi"
     # *** ADD FOR OPSANI ***
     # Opsani Envoy Sidecar
     # Provides metrics for consumption by the Opsani Servo
     - name: envoy
       image: opsani/envoy-proxy:latest
       resources:
           requests:
             cpu: 125m
             memory: 128Mi
           limits:
             cpu: 250m
             memory: 256Mi
       env:
       # The container port of Pods in the target Deployment responsible for
       # handling requests. This port is equal to the original port value of
       # the Kubernetes Service prior to injection of the Envoy sidecar. This
       # port is the destination for inbound traffic that Envoy will proxy from
       # the `OPSANI_ENVOY_PROXY_SERVICE_PORT` value configured above.
       - name: OPSANI_ENVOY_PROXIED_CONTAINER_PORT
         value: "8480"


       # Uncomment if the upstream is serving TLS traffic
       # - name: OPSANI_ENVOY_PROXIED_CONTAINER_TLS_ENABLED
       #   value: "true"


       # The ingress port accepting traffic from the Kubernetes Service destined
       # for Pods that are part of the target Deployment (Default: 9980).
       # The Envoy proxy listens on this port and reverse proxies traffic back
       # to `OPSANI_ENVOY_PROXIED_CONTAINER_PORT` for handling. This port must
       # be equal to the newly assigned port in the updated Kubernetes Service
       # and must be configured in the `ports` section below.
       - name: OPSANI_ENVOY_PROXY_SERVICE_PORT
         value: "9980"


       # The port that exposes the metrics produced by Envoy while it proxies
       # traffic (Default: 9901). The corresponding entry in the `ports` stanza
       # below must match the value configured here.
       - name: OPSANI_ENVOY_PROXY_METRICS_PORT
         value: "9901"

       ports:
       # Traffic ingress from the Service endpoint. Must match the
       # `OPSANI_ENVOY_PROXY_SERVICE_PORT` env above and the `targetPort` of
       # the Service routing traffic into the Pod.
       - containerPort: 9980
         name: service


       # Metrics port exposed by the Envoy proxy that will be scraped by the
       # Prometheus sidecar running alongside the Servo. Must match the
       # `OPSANI_ENVOY_PROXY_METRICS_PORT` env and `prometheus.opsani.com/port`
       # annotation entries above.
       - containerPort: 9901
         name: metrics
---

apiVersion: v1
kind: Service

metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
 #annotations:
 #  service.beta.kubernetes.io/aws-load-balancer-internal: "true"

spec:
 type: LoadBalancer
 #externalTrafficPolicy: Cluster
 #sessionAffinity: None
 selector:
   app.kubernetes.io/name: fiber-http

 ports:
 # Send ingress traffic from the service to Envoy listening on port 9980.
 # Envoy will reverse proxy back to localhost:8480 for the real service
 # to handle the request. Must match `OPSANI_ENVOY_PROXY_SERVICE_PORT` above
 # and be exposed as a `containerPort`.
 - name: http
   protocol: TCP
   port: 80
   targetPort: 9980

You can use kubectl to apply these changes – even to a live running application.

Example:

% kubectl apply -f fiber-http-envoy-deployment.yaml
deployment.apps/fiber-http configured
service/fiber-http configured

We’ve “shimmed in” the Envoy proxy just in front of our application.  

fiber-http (WITH Envoy) inbound traffic flow:


Verifying Envoy is gathering metrics from your container

You can scrape Envoy via curl by instantiating and accessing a shell in a Linux/busybox pod in the same namespace and running an HTTP client command to pull metrics from Envoy.  But there's a better way (see below).

Example:

(from your workstation, outside the Linux/busybox container):

% kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
fiber-http-6ccc567bf8-4psqz   2/2     Running   0          26h
% kubectl describe pod fiber-http-6ccc567bf8-4psqz

(obtain the IP address of the pod via “kubectl describe pod <fiber-http pod name>”, then shell into a Linux/busybox container in the same k8s namespace):

% kubectl run -i --tty busybox --image=busybox --restart=Never -- sh
If you don't see a command prompt, try pressing enter.
/ # wget -qO- http://192.168.15.160:9901/stats
cluster.opsani_proxied_container.Enable upstream TLS with SNI validation.total_match_count: 0
cluster.opsani_proxied_container.Enable upstream TLS with validation.total_match_count: 0
cluster.opsani_proxied_container.Enable upstream TLS.total_match_count: 0
...
[more metrics follow] 

Kubernetes port-forward is a powerful test tool for communications debugging

Instead of creating a Linux container to access Envoy, a less system-intrusive method is to use the Kubernetes port-forward functionality.  Let's port-forward TCP port 9901 on the machine running kubectl to port 9901 in the pod, which is the listening port for the Envoy administration interface.

Syntax:

kubectl port-forward pod/{pod-name-of-an-injected-pod} local-port:destination-port

Example:

% kubectl port-forward pod/fiber-http-6ccc567bf8-4psqz 9901:9901
Forwarding from 127.0.0.1:9901 -> 9901
Forwarding from [::1]:9901 -> 9901

(This will continue to run until you exit via Ctrl-C.)

Now, instead of running a local container to access Envoy, we can access it from the machine running kubectl.

% curl http://localhost:9901/stats/prometheus
# TYPE envoy_listener_manager_listener_modified counter
envoy_listener_manager_listener_modified{} 0
# TYPE envoy_listener_manager_listener_removed counter
envoy_listener_manager_listener_removed{} 0
# TYPE envoy_listener_manager_listener_stopped counter
...

Look for Metrics that Matter to your Application Performance

Envoy gathers many metrics about web server and application performance.  You can use either of the above methods to dump metrics while running load against the test application (fiber-http in this tutorial).  Here are some notable sample metrics from Envoy for application performance:

  • http.ingress_http.downstream_cx_total: 722 – The total number of client-to-server (downstream) connections Envoy has observed.
  • http.ingress_http.downstream_cx_length_ms: P0(1.0,1.0) P25(1.025,1.02601) P50(1.05,1.05203) P75(1.075,1.07804) P90(1.09,1.09365) P95(1.095,1.09886) P99(1.099,3.09743) P99.5(1.0995,5.041) P99.9(1.0999,432.82) P100(1.1,440.0)
    • Each Pxx quantile entry shows an (interval, cumulative) pair for connection length in ms – the first value is computed over the most recent stats flush interval, the second over the lifetime of the process.
    • This sample output was obtained with a simple shell while loop to fiber-http with no parameters
      • while true; do curl <k8s service>; sleep 1; done
    • fiber-http can simulate CPU load, memory, and response times by specifying URL parameters
      • while true; do curl <k8s service>/time?duration=800ms; sleep 1; done
      • Sample output with 800ms duration: http.ingress_http.downstream_cx_length_ms: P0(800.0,1.0) P25(802.5,1.02646) P50(805.0,1.05292) P75(807.5,1.07938) P90(809.0,1.09525) P95(809.5,2.02464) P99(809.9,805.493) P99.5(809.95,807.817) P99.9(809.99,809.676) P100(810.0,1100.0)
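
If you just want to spot-check the connection metrics above while your load loop runs, a quick filter over the port-forwarded admin endpoint is enough (this assumes the kubectl port-forward from the previous section is still active):

% curl -s http://localhost:9901/stats | grep http.ingress_http.downstream_cx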

Visit https://www.envoyproxy.io/ for a detailed description of the various Envoy metrics and processes. 

Congratulations!  You’ve instrumented a simple Kubernetes application for Envoy metrics.  With these metrics, we can understand how our application is performing under load.  In our next exercise, you’ll utilize Envoy metrics with Opsani to optimize CPU and memory limits for the best application performance, at the lowest cost.

Once you’ve become familiar with Envoy, it’s time to start considering using another tool, Prometheus, to help manage and aggregate Envoy data across multiple services and multiple instances of your services.  For an introduction to Prometheus, check out our post on What is Prometheus and Why Should You Use it?


Cloud Elasticity vs. Cloud Scalability: A Simple Explanation

[ARTICLE]

Cloud Elasticity vs. Cloud Scalability: A Simple Explanation

Cloud elasticity and cloud scalability seem like terms that should be possible to use interchangeably. Indeed, ten years after the US NIST provided a clear and concise definition of the term cloud computing, it is still common to hear cloud elasticity and cloud scalability treated as equivalent. While both are important and fundamental aspects of cloud computing systems, their actual functionality is related but not the same. In the 2011 NIST cloud computing definition, cloud elasticity is listed as a fundamental characteristic of cloud computing, while scalability is not.  Yet, elasticity is not possible without scalability.  The following quote is the NIST definition's clarification of the essential characteristic of rapid elasticity:

“Rapid elasticity. Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.”

This means that the ability to scale a system, which is the ability to increase or decrease resources, is required before a system can be elastic.  Elasticity is the system's ability to use that scaling capability appropriately and rapidly in response to demand. So, a system can be scalable without being elastic.  However, if you are running a system that is scalable but not elastic, then you are, by definition, not running a cloud. Note that the system need not actually make use of this capability; it just needs to have it.

Scaling up and Scaling out

In the figure above, we can see the difference between scaling up and scaling out to increase a system's resources, in this case, CPU capacity. The converse would be scaling down or scaling in when shrinking resources. The scaling up/down terminology (aka vertical scaling) refers to scaling a single resource by increasing or decreasing its capacity to perform.  As in our example, this could be the number of CPU cores – real or virtual – available in a single server. The scaling out/in concept (aka horizontal scaling), again illustrated here with CPUs, is a matter of adding or removing replicas of a resource to address demand. This could just as easily be envisioned as spinning up additional containers or VMs on a single server as adding the CPUs in our example.

Again, from a cloud definition standpoint, what matters is not which resource is being scaled but how resource capacity is being increased or decreased. Confusion has crept in through the insistence of some that cloud scaling and cloud elasticity each refer to a characteristic specific to either infrastructure or applications: elasticity is often artificially tied to infrastructure and scalability to applications.  If we return to the NIST definition of elasticity, it does not explicitly call out infrastructure or applications and instead refers to capabilities. What those capabilities are is less important than the overall system's ability to adjust rapidly to changing needs.

In truth, what is important to the end-user is not the means but the end. Depending on how much change in demand a system experiences, it is quite possible that adding or deleting application instances can provide the rapid elasticity needed.   The explosion in popularity of Linux containers such as Docker and Serverless/Function-as-a-Service (FaaS) solutions means that applications can be incredibly and quickly elastic without an absolute need to provision additional hardware, real or virtual.  Continued improvement and automation of how hardware is provisioned and de-provisioned – even physical hardware – make integrating the hardware and software to provide even better elasticity increasingly functional and common. 

Moving from “Cloud Scaling” to “Cloud Elasticity”

Scaling to either extreme – over- or under-provisioning – runs up against the realities of production load. As an example, let's assume we've joined a company that just moved a significant legacy application to the cloud.  While the engineering team has done some work to make the app cloud-friendly, such as breaking the app into containerized microservices, we've been tasked with optimizing its performance. We've received some performance data, but not much, and based on the limited data, let's assume we've estimated that our necessary capacity is two servers, each costing $0.05/hour, or $1.20/day ($438/year). We've also implemented a more robust monitoring system to provide feedback on parameters such as application performance and server utilization. Unfortunately, we find that our initial static capacity estimate results in one server sitting idle roughly half of each day, wasting about $0.60 per day, or around $219 per year, in excess resource costs. Furthermore, we did not estimate our daily load well enough, and we consistently see outages twice a day.  This could cost us even more in lost revenue than the net cost of infrastructure through failed transactions and lost customers.  What has happened here is a case of underprovisioning resources compared to our actual demand.

In a traditional IT infrastructure, the logical step would be to increase capacity.  And as our CEO and head of engineering initially see performance as more critical than cost, we look at scaling the system. Whether we choose to scale up or scale out, the result is that we increase our capacity to three servers available to our system at all times. Now that we have scaled our system, we've eliminated our daily outages and, unfortunately, increased our overall system cost and substantially increased our wasted spend on idle servers – now roughly a server and a half's worth of idle capacity on a typical day, or about $1.80/day ($657/year).

It just happens that our company hired a CFO who is really into FinOps, and she realizes that we are treating our infrastructure like a traditional IT resource, not a cloud. So we give scaling in response to changing load a try.  Knowing that most of our system's load was covered by two servers, we scale back down (or in) to that level and set an alarm to page an engineer to scale our infrastructure up when demand requires it.  Unfortunately, demand spikes and drops rapidly. By the time our very competent engineer has the additional servers online, there have been outages, and it also takes a while to scale back down.

The CFO is pleased we've cut our idle infrastructure cost in half, but she still sees cost savings that should be attainable. On top of that, our head of engineering and CEO are not pleased that we are again in a state where we are having outages, and manually scaling up and down in response to system changes is tedious work.  We have achieved cloud scaling, but we are not yet at a point of true cloud elasticity.

Cloud Elasticity to the Rescue

Finally, our team points out that our cloud provider has several automation tools that could tie into our monitoring system and automate the necessary rapid scaling responses, letting us achieve true cloud elasticity rather than merely cloud scaling.  The outcome makes the CEO, CFO, and head of engineering happy with the entire team, and it eliminates the toil of manually responding to load changes.

Because the process is automated, the response to changing loads is appropriate and rapid, eliminating both outages and idle servers. Now that things look automated and stable, the CFO points out that there are times when server capacity is not optimal and it might be time to look at that, but that will need to wait for another post.

Is Cloud Elasticity Required?

Early in this article, I noted that not just elasticity but “rapid elasticity” is required, by definition, for a cloud actually to be a cloud.  Does this mean that your system MUST be elastic? In truth, no; it just needs to have the ability to be elastic to be a cloud system.

If you are running a service tied to retail sales, and seasonal events such as Valentine's Day, Christmas, or Black Friday/Cyber Monday spike the demands on your systems, that alone might warrant making sure your system has its cloud elasticity functionality ready to go.   If, on the other hand, you are serving business software to small companies that have predictable growth and use rates throughout the year, elasticity may be less of a concern.  Indeed, the question might further be: do you need to run your system on a cloud at all?

Still, the point of cloud computing can be distilled down to another one of the NIST “essential characteristics” of cloud computing – self-service, on-demand access to resources.   The uncertainty of the on-demand requirement makes cloud elasticity – and rapid elasticity at that – necessary. If your service has an outage because of insufficient resources, you’ve failed your end-users, and having elasticity working on your system is the prudent choice.

Conclusions: Cloud Scalability AND Cloud Elasticity

Hopefully, you are now clear on how your system’s ability to scale is fundamental but different from the ability to quickly respond – be elastic – to the demand on resources.  Being able to scale has no implications about how fast your system responds to changing demands. Being elastic, especially in the context of cloud computing, requires that the scaling occur rapidly in response to changing demands.  A system that exhibits true cloud elasticity will need to have scalability and will likely be automated to avoid the toil of manual action and to take advantage of the responsiveness provided by computer-aided processes.


Amazon WorkSpaces, What's in it for you?

[ARTICLE]

Amazon WorkSpaces, What’s in it for you?

Amazon WorkSpaces is an Amazon Web Services (AWS) Desktop-as-a-Service (DaaS) tool that allows a business to provide users with a remote, virtual desktop. The user experience is equivalent to having logged into their own (work) computer to access applications, services, and documents. With Amazon WorkSpaces, a user's work environment is isolated within the virtual environment but can still be accessed through a BYOD model. Using Amazon WorkSpaces removes the complexity, high cost, and security vulnerabilities of managing an on-premises system, such as a virtual desktop infrastructure (VDI) solution, as AWS manages the DaaS infrastructure and service.

How do you use an Amazon WorkSpace?

Amazon WorkSpaces is managed by your company IT team via the AWS Management Console. A WorkSpaces bundle is assigned to each user by an IT administrator.  The WorkSpaces bundle defines the resources – application, compute, and storage – that are available to an end-user. Once the user's WorkSpaces bundle is defined and assigned, the user connects with the client application from any supported device, which could be a laptop, desktop, or tablet.

What are the Major Benefits of Amazon WorkSpaces?

1) Simplify Desktop Delivery

I have personally, and repeatedly, experienced the 1-month wait when “working” with a new company while a dedicated laptop was delivered and then configured to work with a company's system.  While a company could still require a dedicated piece of hardware for work activities, Amazon WorkSpaces simplifies and accelerates provisioning, deploying, maintaining, and recycling desktops, and that translates to less work for IT and faster onboarding. The DaaS model removes the need to manage a complex virtual desktop infrastructure (VDI) deployment. As Amazon WorkSpaces supports the BYOD model of deployment, it can also reduce hardware management concerns.

2) Reduce Hardware Costs

As long as your company is willing to employ a BYOD model to run Amazon WorkSpaces, you can remove the need to invest in desktops and laptops for employees and contractors. Because the Amazon WorkSpace solution is cloud-based, desktops can provide a customizable set of compute, memory, and storage resources to specifically meet your users’ performance needs.  Changing the performance of allocated resources is no longer a matter of upgrading hardware, but rather simply a matter of updating the WorkSpaces bundle configuration.

3) Keep Data Secure

We've all heard stories of hardware with proprietary data being confiscated at border control or simply stolen. Because no user data persists on the user's device, the risk surface area is greatly reduced. Amazon also applies its standard and robust security practices to Amazon WorkSpaces.  A WorkSpace is deployed within an Amazon Virtual Private Cloud (VPC) to start.  User data is stored on persistent, encrypted AWS Elastic Block Store (EBS) volumes. The service also integrates with AWS Key Management Service (KMS) to allow admins to manage encryption keys. The service further supports Active Directory integration and the use of AWS Identity and Access Management and multi-factor authentication.

4) Consolidate your worldwide desktop management

If your company has a global presence, the ability to access Amazon WorkSpaces in any AWS Region may provide great value. WorkSpaces supports the management of thousands of high-performance cloud desktops on a global scale. The benefits here include not just eliminating the toil of setting up and deleting worker desktops in an international setting, but also avoiding the very real issues of hardware management.  The speed at which Amazon WorkSpaces desktops can be provisioned, reconfigured, and deprovisioned adds agility and speed to a company's ability to respond to changing business needs.

What are the major downsides of Amazon WorkSpaces?

As with all tools, it is important that the functionality of Amazon WorkSpaces maps appropriately to your use case.

5) Internet connectivity required

As a cloud-based (DaaS) service, connecting to an Amazon WorkSpaces desktop requires a reliable internet connection.  Many of the benefits of being able to connect to performant remote resources are negated if the end-user starts downloading data locally to manage poor network performance.  Storing and using proprietary data on a physical device also undercuts the security benefit of keeping all business information on the remote WorkSpace.

6) What about VDI?

The Virtual Desktop Infrastructure model has long been the traditional way to run virtual desktops.  Because these systems are built on a central server and managed in-house, there is, potentially, an added level of security and control. However, an internal system lays the responsibility for hardware and software issues on the IT team running the system – and requires an internal IT team to run the system! Although Gartner predicted a substantial shift from VDI to DaaS back in 2016, this has not happened as predicted, the reason likely being the typical monthly per-user cost of DaaS solutions.  Deciding on the financial viability of DaaS vs. VDI will depend on your specific use case.

Conclusion

Even with the slower-than-expected shift from VDI to DaaS solutions like Amazon WorkSpaces, the changes wrought on businesses and employees during the COVID-19 pandemic have radically changed many business landscapes.  The accelerating, and likely increasingly permanent, trend toward distributed workforces gives a DaaS solution an edge over an in-house VDI solution.  The ability to rapidly onboard and adjust access to resources that Amazon WorkSpaces provides increasingly maps to the use cases that companies currently find themselves facing. The combination of BYOD convenience, rapid onboarding, resource allocation controls, and high security all point to Amazon WorkSpaces having a bright and increasingly important future in the modern business environment.


How to Use COaaS & OLAS for Optimal Application Performance

[VIDEO]

How to Use COaaS & OLAS for Optimal Application Performance

Learn how Continuous Optimization as a Service (COaaS) combined with OLAS allows users to tune their applications for optimal performance at scale. Continuous Optimization as a Service (COaaS) rightsizes your resources, continuously and autonomously tuning workloads with sophisticated AI algorithms. In addition, OLAS (Opsani Learning AutoScaler) extends COaaS with autonomous, intelligent autoscaling that automatically learns traffic periodicity, trends, and burst behavior.


How to Take Advantage of AWS Fargate Pricing

[ARTICLE]

Take Advantage of AWS Fargate Pricing

AWS Fargate is a serverless compute engine for containers that works with Amazon Elastic Container Service (ECS). It is also available in a subset of regions for use with Amazon Elastic Kubernetes Service (EKS). Fargate removes the need to provision and manage servers (that is the serverless part; AWS deals with that for you), and the Fargate pricing model lets you specify and pay for resources per application.

Fargate launches and scales the compute resources to closely match your resource specifications for each container. Because Fargate manages the compute aspects for you on a per-container basis, there is no longer a need to pick the right instance size and manage cluster scaling. It also eliminates over-provisioning, whether due to choosing oversized instances or keeping idle instances on hand.

AWS Fargate Pricing

AWS Fargate pricing is calculated based on the vCPU and memory resources you actually use, and for how long you use them.  Costs accrue from when you start downloading a container image until the Amazon ECS Task or Amazon EKS Pod terminates, rounded up to the nearest second.

Fargate Pricing Options

If you have worked with EC2 instance pricing before, how Fargate pricing works will feel familiar. There are no upfront payments for Fargate, and you are only charged for the resources you actually use. There is the default, on-demand pricing, and there are Spot and Compute Savings Plan pricing options. If your application is tolerant of being interrupted and restarted, the Fargate Spot option can give you up to a 70% discount compared to on-demand prices. If you can commit to spending a certain amount on Fargate over a certain time period to run persistent workloads, the Compute Savings Plan can give you up to a 50% discount compared to on-demand prices. Note that this commitment is to a specific amount of compute spend (a dollar amount per hour) over a one- or three-year term.

Tuning Fargate Pricing 

Even though you no longer need to worry about specifying the number of instances you are running, it is possible to optimize cost by looking closely at the vCPU and memory resource requests for the Task or Pod. These parameters are pulled from container resource requirements, so if you are overly generous with your resource allocation there, you may still be paying more than you need to.  The two dimensions are independently configurable, as you can see in the supported configuration table below.  Monitoring your application in production can provide the insight to adjust resource requirements appropriately, and you can then consider Spot or Savings Plan options to save even more on your Fargate pricing. There is a simple tool consisting of a CloudWatch Dashboard template (here on GitHub) that can help you with right-sizing when used with AWS CloudWatch Container Insights. Do read the section on CloudWatch “gotchas” below first, though.
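
To make the cost levers concrete, the vCPU and memory you are billed for are the task-level values you register. The sketch below uses the AWS CLI with placeholder names, image, and execution role, and picks one of the supported combinations (256 CPU units with 512 MiB):

# Register a Fargate task definition sized at 0.25 vCPU / 512 MiB (values in <> are placeholders)
aws ecs register-task-definition \
  --family my-app \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 256 \
  --memory 512 \
  --execution-role-arn arn:aws:iam::<account-id>:role/<ecs-task-execution-role> \
  --container-definitions '[{"name":"app","image":"<your-image>","essential":true}]'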

Supported Fargate Configuration Options

AWS Fargate Pricing Gotchas

As often happens with public cloud services, sometimes you end up paying for extras that you may not have factored into your overall cost of running your apps.  Data transfer can be one gotcha: AWS charges minimally for most storage services when the data is at rest, but moving data can be pricey, and if your Fargate Task or Pod moves a lot of data, this could add substantially to your overall operations cost.  You can check out AWS data transfer rates to get a sense of your potential costs and options.

Also, if your app uses other AWS Services (e.g., CloudWatch for general logging or Container Insights), you will get a bill for your CloudWatch service use as well.  One way to get an estimate for your overall Fargate costs is the AWS Pricing Calculator.  

Conclusion

Fargate has the potential to simplify your operations by eliminating instance and cluster management and can realize substantial savings, especially if you take advantage of Fargate pricing discounts like Spot pricing and Compute Savings Plan options.  Rightsizing your container resource requests and limits can further reduce costs.  Do pay attention to interactions with other AWS services and consider data transfer rates, as these can contribute to the overall cost associated with running AWS Fargate.


Spinnaker vs. Jenkins for Continuous Deployment

[ARTICLE]

Spinnaker vs. Jenkins for Continuous Deployment

Continuous Deployment (CD) is a process that automates the deployment of code (sometimes referred to as an “artifact”) into service.  While this could be deployment into a dev environment, it most commonly means deploying code into production, so as you may imagine, having a reliable process is vital.  Jenkins is a venerable, open-source CI/CD solution, while Spinnaker, a purely CD solution, is a relative newcomer custom-built by Netflix and open-sourced.

Continuous Integration and Continuous Deployment (CI/CD) is a vital part of DevOps processes, as it enables automation of application development (CI) and application deployment (CD).  We've previously covered CI/CD in greater detail; for this article, CI is the application development process that produces a deployable code artifact, and CD is the process that takes that artifact and deploys it into a server environment.

While it has not been uncommon to cobble together CD processes with tools like Chef, Puppet, Ansible, or Salt, this approach often includes manual methods or the need to maintain scripts. Jenkins' original functionality was as a build server designed to provide CI; a set of Pipeline plugins has extended Jenkins' abilities to also function as a CD tool. Spinnaker is not a build tool but was specifically developed to support the CD process and is focused on working in cloud environments and at scale.  In this article, we will specifically consider the CD functionality that both tools provide.

Jenkins

There is little doubt that Jenkins is the most widely deployed automation server. Jenkins can run standalone on a server with a Java Runtime Environment (JRE) installed or run as a Docker container.  A host of Jenkins plugins extend its functionality beyond CI, and the Pipeline plugins, in particular, enable CD functionality. Some of Jenkins' key CD features include:

  • CI and CD – Many users consider Jenkins as “just” a CI tool; however, it is an “extensible automation server,” so it readily provides CD functionality with Pipeline plugins. Releases since v2.0 have explicitly included CD targeted functionality.
  • Plugins – A significant way that Jenkins extends its functionality and integrates with other tools is through over 1500 plugins. Key integration categories are build and source code management, UX modifications, and administrative controls.
  • Distributed deployments – Speeds up CI and CD processes by distributing jobs across multiple servers.
  • Multiple use-cases – Plugins allow Jenkins to support a wide range of platform-specific use cases such as Java, Ruby, Python, Android, C/C++, PHP.

Since we are focusing on Jenkins' CD capabilities, the popular Blue Ocean UX plugin for Jenkins Pipelines is worth highlighting. Pipelines themselves are a suite of plugins that enable CD with Jenkins.  Pipelines are defined with text-based scripts; however, Blue Ocean allows users to visualize the CD pipeline process and results in real time.

An example of the UI from the Blue Ocean plugin documentation, displaying which tests have failed or passed, and the part of the test script that failed.

Spinnaker

Netflix open sourced Spinnaker on GitHub in 2015.  Major cloud providers (Google, Amazon, Microsoft) quickly joined to support its development. Netflix and Google were the primary managers and drivers of development until creating a community governance system in 2018. Here is a selection of some key Spinnaker features:

  • CI Integrations – Not surprisingly, as a CD specific tool, CI integrations with Jenkins or TravisCI are available and can trigger the CD process.
  • Multi-Cloud – As a cloud-native tool, Spinnaker was designed to easily deploy across multiple cloud providers, including AWS (EC2), OpenStack, Kubernetes, Google (Compute Engine, Kubernetes Engine, and App Engine), Microsoft Azure, and DC/OS. 
  • Automated releases – This is a core function, and it is possible to run both integration and system tests in your deployment pipelines. A range of events can trigger a pipeline (e.g., a CRON schedule, a git event, Jenkins, Travis CI, or another Spinnaker pipeline).
  • Deployment strategies – Built-in deployment strategies define how an application rolls out. These include highlander and red/black (a Netflixism for what the rest of the planet would call blue/green), rolling red/black, and canary. Of course, you can also define your own custom strategy.
  • Integration monitoring – Monitoring service integrations (Prometheus, Datadog, Stackdriver) can trigger pipeline events, such as rollbacks, and provide the necessary canary analysis measurements.
  • Manual judgments – If you prefer continuous delivery over continuous deployment, you can use a manual judgment stage to require a human’s approval before deployment.
  • Chaos Monkey integration – An excellent value add that tests application resilience during instance failure. 

One of Spinnaker’s strengths is a robust set of deployment models.

Spinnaker vs. Jenkins

As Spinnaker is explicitly a CD tool, and Jenkins is the primary CI tool, it is not unusual to see both used together.  Spinnaker even has a native Jenkins API that makes the integration simple.  Still, the question we are delving into here is which application provides the more performant CD solution. Here are some of the key differences between them:

  • Spinnaker is a deployment tool, not a build tool. Building is Jenkins' strength, which is why integrating the two (or integrating with another CI tool) is necessary to build a CI/CD pipeline with Spinnaker.
  • Jenkins is built for CI, and CD needs to be enabled through native plugins, scripting (Ansible, Puppets, etc.), or integration with another tool, like Spinnaker.
  • Jenkins was not designed for cloud deployments (though Jenkins X is a cloud-native, Kubernetes-focused solution). Jenkins plugins or external scripts are required to enable cloud integrations. While the extensive plugin catalog does address many use cases, this does not guarantee your cloud-specific use case has a plugin available.
  • Spinnaker is a cloud-native, cloud-agnostic application with built-in functionality that allows on-demand infrastructure and application changes. It delivers highly-available, multi-account, multi-cloud artifact (code) deployment reliably and predictably. Spinnaker can create deployments, load balancers, automate version rollouts and rollbacks, resize clusters, etc. It also provides a web UI to both manage deployment processes and the creation of primary infrastructure resources.
  • Spinnaker provides the ability to manage deployments across cloud providers with a unified dashboard to handle process status and deployments across multiple cloud environments. 
  • Spinnaker provides a richer set of deployment models than Jenkins.
  • Jenkins is well known to be a temperamental application, in part due to its reliance on scripts for functionality. Spinnaker is built to provide a simple, reliable user experience.
  • The tradeoff for Spinnaker’s simplicity and ease of use is that it provides less fine-grained, resource-level access controls when compared to the functionality provided by Jenkins’ native configs and plugin extensions.

Conclusion

So, do you feel like there is a clear winner? I find that as more companies either start as cloud-native or move to the cloud, the simplicity and reliability, along with powerful integrated deployment models, give the edge to Spinnaker. This may be why it is not uncommon to see Jenkins used as the CI solution with Spinnaker providing the CD tooling.  If you are already using Jenkins as your CI/CD solution, I am not sure that rebuilding your CD processes with Spinnaker would provide much of a benefit. If you are moving from a traditional or private cloud environment, switching to Spinnaker will likely speed up and simplify the process.


Understanding Amazon Web Services EC2 Pricing Options

[ARTICLE]

Understanding Amazon Web Services EC2 Pricing Options

Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances provide the AWS compute service. As with all things AWS, there are many options available for how to pay for this service. Let's look at the options and when they might provide the best value.

Free Tier

For completeness, I will mention the AWS Free Tier. This gives you free access to up to 750 hours a month of Linux and Windows instances for up to one year.  These are the smallest EC2 instances available – t2.micro in most cases, though you can get t3.micro instances in regions that do not have t2.micro instances. If you use any other instance type, you will be charged for it at the normal rate.  The Free Tier is a great way to try out AWS, but it is unlikely to satisfy a production system, so let's consider what you get when you start paying for your EC2 services. (You can learn more about AWS' free services here.)

On-Demand

On-Demand instances provide the greatest flexibility and control over spending, though, as we will see, they may be more expensive than other payment options, depending on your use case. 

Depending on the specific instance type, AWS will bill you for your EC2 instances by the hour or the second. You only pay for the time per instance that you actually use, and you can freely change the type of instance(s) you are running.

On-demand is the right choice when an upfront payment commitment for a certain amount of compute resources does not make sense. This payment model is useful in application development/testing cases where you are not certain about the best EC2 instance to use for an application. If your workloads cannot be interrupted (see Spot Instance pricing for comparison) and are short-term, highly variable, or unpredictable, on-demand instances are the way to go.

Spot instances

Spot instances use “spare” Amazon EC2 computing capacity at up to 90% less than the same On-Demand EC2 instance. This allows Amazon to put resources to work that are currently idle but reserved for other uses. It may or may not happen, but if that reserved capacity is needed while your application is using it, your instance is reclaimed and your workload interrupted.

If you are running an application that is tolerant of potential interruption/failure, using spot instances has the potential for providing substantial cost savings. They can also be a way to scale up an application that suddenly needs a great deal more compute capacity. If your budget is currently limited and you wish to try some of the more expensive EC2 instance types or use more capacity than your budget would allow using on-demand instances, spot instances can also provide a solution. 
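
For a sense of how little changes mechanically, the sketch below launches a Spot instance with the AWS CLI; the AMI, subnet, and key pair are placeholders, and dropping the --instance-market-options flag would launch the same instance as regular On-Demand capacity:

# Request a one-time Spot instance (AMI, subnet, and key name are placeholder values)
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type t3.micro \
  --count 1 \
  --subnet-id subnet-aaaa \
  --key-name my-key \
  --instance-market-options 'MarketType=spot,SpotOptions={SpotInstanceType=one-time}'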

Savings Plans

Savings Plans can provide up to a 72% discount on EC2 instances compared to equivalent On-Demand instances. The requirement is a commitment to a consistent amount of hourly spend (a dollar amount per hour) for a period of 1 or 3 years.  This can provide great cost savings if you know your applications' workload profiles and are willing to commit to a longer-term contract.  Using a Savings Plan to cover your baseline workload along with On-Demand or Spot Instances for periods of increased compute demand can provide a balance between cost savings and workload responsiveness.

AWS has a tool, the AWS Cost Explorer, that will help you figure out your actual resource usage and understand where you might benefit from a Savings Plan to purchase your EC2 resources. In the example below, from the AWS Cost Explorer overview page, it is possible to see that most resources for this example system have a fairly constant workload and could see substantial cost savings if a 1 or 3-year commitment makes sense.

Reserved Instances

Reserved Instances are similar to a Savings Plan but can be scoped to a specific Availability Zone, in which case they also guarantee you the resource access you require in the AZ you have chosen. They can cost up to 75% less than the equivalent On-Demand instance. Much like the Savings Plan, if you have fairly steady workloads and can commit to a 1 or 3-year contract, this model may save you money.

Dedicated Hosts

A Dedicated Host gives you dedicated access to a physical EC2 server. This means that you are the only “tenant” that will have access to that particular machine. If you are using software that requires its licenses to be tied to a single machine (e.g., Windows Server, SQL Server), a Dedicated Host can provide the necessary single-server compliance and save on cost.  The service comes integrated with AWS License Manager, which ensures that the Dedicated Host instances are compliant with license terms and launch on appropriate instance types. Dedicated Hosts can be purchased using the hourly On-Demand model or as a reservation for up to 70% less than the On-Demand cost.

Summary

While on-demand instances are typically what people start out using, substantial cost savings can be realized using other available EC2 purchase models. Spot instances can provide on-demand-like flexibility at substantial savings as long as your application can tolerate being preempted.  Both Savings Plan and Reserved Instances can save money while providing you the necessary resource coverage for consistent workloads over time. Dedicated Hosts provide a way to launch EC2 instances with software that has licensing restrictions in an On-Demand or Reserved pricing model. Amazon’s integrated Cost Explorer can be used to provide the necessary insight into your system’s resource use to make decisions about which purchase model or combination of models can help you optimize your AWS cloud spend. 


Here's Why Manual Workload Tuning is Obsolete

[WEBINAR]

Here's Why Manual Tuning is Obsolete

Manual workload tuning is reactive and can take weeks to complete. Enterprises over-provision their applications because they cannot tune at the scale that is needed, which results in massive waste, sub-optimal performance, and lower application availability. To learn more, watch our recent webinar, Using Machine Learning to Optimize All Applications Across the Delivery Platform.


Adaptive Tuning for Load Profile Optimization

[ARTICLE]

Adaptive Tuning for Load Profile Optimization

Initial application settings are generally derived from experience with similar systems’ performance, or are overprovisioned to head off anticipated performance bottlenecks. Once an application is running and actual performance metrics are available, it becomes possible to tune parameters to assign resources more appropriately, balancing performance requirements and cost. In simple, stable systems, this cycle of measurement, evaluation, and improvement is relatively straightforward to apply.

To break these basic tuning steps out more explicitly (a minimal code sketch follows the list):

  1. Establish the values that define minimum acceptable performance (a Service Level Objective, or SLO)
  2. Collect metrics on system performance
  3. Identify the part of the system that is limiting performance (e.g., CPU, memory)
  4. Adjust the part of the system causing the bottleneck
  5. Collect metrics on system performance again
  6. If performance improves, keep the modification; if it degrades, revert or try a different adjustment
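
The Python sketch below captures that loop in code. The `measure_latency`, `find_bottleneck`, `propose_adjustment`, and `apply_setting` functions are hypothetical hooks into your own monitoring and configuration tooling; only the control flow is meant to be illustrative.

```python
# Minimal sketch of the measure -> adjust -> verify tuning cycle.
# The callables passed in are hypothetical hooks into your own
# monitoring and configuration tooling.

SLO_LATENCY_MS = 250  # step 1: minimum acceptable performance


def tune_once(measure_latency, find_bottleneck, propose_adjustment, apply_setting):
    baseline = measure_latency()                 # step 2: collect metrics
    if baseline <= SLO_LATENCY_MS:
        return baseline                          # already meeting the SLO

    resource = find_bottleneck()                 # step 3: e.g., "cpu" or "memory"
    old_value, new_value = propose_adjustment(resource)

    apply_setting(resource, new_value)           # step 4: adjust the bottleneck
    result = measure_latency()                   # step 5: re-measure

    if result < baseline:                        # step 6: keep the improvement...
        return result
    apply_setting(resource, old_value)           # ...or revert the change
    return baseline
```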

While this is relatively simple to apply in stable systems, as system complexity increases the number of potential performance-impairing bottlenecks grows, because overall performance depends on inter-service interactions. Process automation becomes important: relying on human intervention to maintain SLOs quickly becomes overwhelming, and humans may not adjust the system fast enough to meet SLOs in a dynamic environment.

Cloud computing systems and the microservice architectures common to cloud-native applications can automatically scale to maintain performance SLOs in the face of variable loads. Increasing load can trigger the system to scale up resources to maintain performance; decreasing load can trigger a scale-down to levels that still maintain performance while removing the cost of idle resources.
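A minimal rule-based autoscaler illustrates this threshold-driven behavior. In the Python sketch below, `get_avg_cpu_utilization` and `set_replica_count` are hypothetical stand-ins for your metrics source and orchestration API, and the thresholds are arbitrary examples.

```python
# Toy rule-based autoscaler: add or remove replicas based on average
# CPU utilization thresholds. The callables are hypothetical hooks
# into your metrics and orchestration APIs.

SCALE_UP_THRESHOLD = 0.75    # scale up above 75% average CPU
SCALE_DOWN_THRESHOLD = 0.30  # scale down below 30% average CPU
MIN_REPLICAS, MAX_REPLICAS = 2, 20


def autoscale(current_replicas, get_avg_cpu_utilization, set_replica_count):
    utilization = get_avg_cpu_utilization()
    desired = current_replicas

    if utilization > SCALE_UP_THRESHOLD:
        desired = min(current_replicas + 1, MAX_REPLICAS)
    elif utilization < SCALE_DOWN_THRESHOLD:
        desired = max(current_replicas - 1, MIN_REPLICAS)

    if desired != current_replicas:
        set_replica_count(desired)
    return desired
```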

Database and big data applications have been at the forefront of understanding and automating the process of adaptive tuning. Herodotou and collaborators identify six approaches to performance optimization: 

  • Rule-based – An approach that assumes the system will behave as similar systems have in the past. It does not rely on metrics/logs or a performance model; it can provide initial settings to get started but is unlikely to yield optimal ones.
  • Cost modeling – An analytical (white-box) approach based on known cost functions and an understanding of the system’s internal behavior. Some form of performance metric is required to develop the predictive model.
  • Simulation-based – A predictive performance model is generated from a set of experimental runs that simulate load scenarios (e.g., using a load generator), and the results are used to evaluate optimal parameter settings.
  • Experiment-based – Experiments guided by a search algorithm vary parameter settings to identify optimal ones.
  • Machine learning-based – A black-box approach that generates predictive performance models without requiring knowledge of internal system functionality.
  • Adaptive – Configuration parameters are tuned on a running application using any number of the methods listed above.

While any one of the above approaches can be used to tune a system’s performance, doing so effectively will likely combine several of them, as the “Adaptive” category suggests. Rule-based methods can be a quick and dirty way to provide initial conditions, and if those rules include the ability to adjust application resources in response to workload changes (e.g., autoscaling thresholds), the result is an adaptive system. Combining AI methods with rule-based methods can improve adaptability further by adding a degree of predictive ability (e.g., AWS predictive scaling for EC2 Auto Scaling). Indeed, combining rule-based with ML-based approaches best addresses the need to adapt to both changing workloads and system changes.

While rule-based auto-scaling can adapt to workload changes, the next question you may wish to ask is whether the application profile itself is configured optimally. As the application, and possibly the supporting infrastructure, scales in response to load, are resource settings such as CPU, memory, and network configured to perform optimally as load requirements change? The challenge is that adaptively tuning these configurations becomes exponentially more complex as you add tunable parameters.
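A quick back-of-the-envelope calculation shows why. The Python snippet below uses illustrative numbers (eight settings per parameter) rather than figures from any particular system:

```python
# Illustrative only: the size of a configuration search space grows
# exponentially with the number of tunable parameters.
settings_per_parameter = 8

for num_parameters in (2, 5, 10, 22):
    combinations = settings_per_parameter ** num_parameters
    print(f"{num_parameters:>2} parameters -> {combinations:,} combinations")

#  2 parameters -> 64 combinations
#  5 parameters -> 32,768 combinations
# 10 parameters -> 1,073,741,824 combinations
# 22 parameters -> 73,786,976,294,838,206,464 combinations (~74 quintillion)
```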

Increasingly, the ML approach to adaptive tuning is not only possible to apply but is almost a prerequisite to achieving true optimization. Peter Nikolov, Opsani’s CTO and co-founder, gave a presentation, “Optimizing at Scale: Using ML to Optimize All Applications Across the Service Delivery Platform (Las Vegas 2020),” in which he pointed out that one application with eight settings for two tunable parameters across twenty-two component services would have 8²² (roughly 74 quintillion) possible tuning permutations. That is far beyond what a human could search for a true optimum, but in this case Opsani’s machine learning algorithm was able to rapidly search the space and identify the settings that provided both optimal performance and lowest cost.

If we now add variations in the workload itself, effective adaptive tuning with machine learning starts to require not just an adaptive but an autonomic approach. Oracle’s Autonomous Database and Opsani’s Continuous Optimization service are examples of continuous adaptive tuning in action. The ability to respond appropriately to system changes without human intervention removes the drudgery of searching for and implementing (hopefully) optimal configuration settings; it also greatly reduces the time it takes to apply the appropriate optimum.

The six categories of performance optimization can be viewed as an evolutionary path toward adaptive tuning. Rule-based approaches will get you up and running and can be applied without any actual performance data. As the ability to collect performance metrics and apply appropriate modeling techniques improves, discerning and applying performance-improving changes becomes more rigorous and complex. Eventually, applying machine learning to evaluate the data and automate tuning allows rapid discovery of optimal settings even in the face of changing workload, application, and system requirements.

If you would like to experience AI-driven application optimization for yourself, Opsani offers a free trial.