True Cloud Performance

Cloud performance optimization is a no-brainer. Pulling it off means simultaneously saving money and boosting performance. At Opsani, we have leveraged neural network and deep reinforcement learning technologies to construct an AI that continually optimizes cloud app performance. It is the perfect tool for any enterprise that runs medium-to-large cloud-based applications, spends less than $5M/yr on cloud, and plans to move toward DevOps/CloudOps.

However, cloud performance optimization for apps isn’t a cookie-cutter process. How you treat your apps should depend on their maturity, lifecycle stage, and scale. Moreover, an app must have several key elements in place before it is “mature enough” for full-blown cloud performance optimization. Let’s call that final, CO-ready point of maturity Stage Five. Shepherding all of your apps toward Stage Five maturity should be your standard trajectory.

Here are the four preceding stages, and how to move through them to a point of cloud optimization readiness.

Stage One: Don’t Optimize, Scale Out

You are early in development, and the key elements for cloud optimization are not entirely there yet. Here, especially with new products, cloud performance optimization is a bad use of engineering time. 

According to a Startup Genome research report, 74% of startups fail because of premature scaling. Getting scalability right matters: scalable applications allow rapid advancement, higher ROI, and business continuity. Moreover, a scalable app can handle a sudden increase in traffic and still provide a positive user experience.

Be agile and design your services to scale out. You can throw resources at the problem if you need to; at this stage, resources are cheaper than engineers’ time. Invest in automatic scale-out solutions, and keep headroom. Continue this way until it starts to get expensive, and/or the response time of a single request gets too high.
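To make "scale out and keep headroom" concrete, here is a minimal sketch of the sizing arithmetic. The function name, the 30% default headroom, and the minimum of two replicas are illustrative assumptions, not values from any particular autoscaler.

```python
import math


def desired_replicas(load_rps: float,
                     capacity_per_replica_rps: float,
                     headroom: float = 0.3,
                     min_replicas: int = 2) -> int:
    """Replicas needed to serve load_rps while keeping a spare fraction of capacity.

    headroom=0.3 means each replica is planned to run at no more than 70% of
    its measured capacity. All defaults are hypothetical.
    """
    if capacity_per_replica_rps <= 0:
        raise ValueError("capacity_per_replica_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = capacity_per_replica_rps * (1 - headroom)
    return max(min_replicas, math.ceil(load_rps / usable_rps))
```

With these assumptions, 900 requests per second against replicas that each handle 100 works out to 13 replicas rather than 9. The extra four are the headroom.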

Stage Two: Monitor Production Performance

Your apps are a little more mature now. At this stage, you need to identify your production performance metric(s), whether throughput, response time, error rate, or something else. You need to pinpoint how these affect your business as accurately as you can (involve customers in this if it feels right). 

You should be monitoring cloud app performance and production environments using easy and affordable SaaS-based monitoring services. Don’t overdo the monitoring: start small while focusing on the big things. Pay attention around deployments, and pinpoint bottlenecks once they appear. You can identify and fix those issues yourself at first, but have a plan for adding triggers and notifications. Eventually, there will just be too many services to monitor manually.
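As a rough illustration of "start small, then add triggers", the sketch below computes the three metrics named above from a window of request records and flags any that exceed a limit. The `Request` record, the metric names, and the limits are hypothetical; a SaaS monitoring service would normally do this for you.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    ok: bool


def summarize(requests, window_seconds):
    """Throughput, 95th-percentile latency (nearest rank), and error rate."""
    if not requests or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    latencies = sorted(r.latency_ms for r in requests)
    rank = (95 * len(latencies) + 99) // 100  # integer ceiling of 0.95 * n
    errors = sum(1 for r in requests if not r.ok)
    return {
        "throughput_rps": len(requests) / window_seconds,
        "p95_latency_ms": latencies[rank - 1],
        "error_rate": errors / len(requests),
    }


def breached(summary, limits):
    """Names of the metrics whose value exceeds its configured upper limit."""
    return sorted(name for name, limit in limits.items() if summary[name] > limit)
```

A notification hook would then fire whenever `breached(...)` returns a non-empty list, for example with limits such as `{"p95_latency_ms": 500.0, "error_rate": 0.01}`.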

Stage Three: Add Performance Testing to Your CI/CD Pipeline

Your scale and maturity are now such that you should define and implement a performance test suite. You have three options here: use an existing load generator, capture and replay production traffic, or build a custom load generator.

You should make performance regression tests a part of your CI/CD pipeline. Report results and overlay them with other development process metrics. Pay attention to measurement precision and repeatability. When a performance regression is caught, roll back the deployment and return it to a developer to fix, or allow for an executive override.
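A regression gate of this kind can be sketched in a few lines. The version below is one possible shape, not a prescribed method: it compares median response times against a stored baseline, checks repeatability with the coefficient of variation, and honors an override. The function names, the 5% tolerance, and the 10% variation limit are assumptions for illustration.

```python
import statistics


def is_repeatable(samples_ms, max_cv=0.10):
    """True if the run-to-run spread is small enough to trust the comparison."""
    mean = statistics.mean(samples_ms)
    return statistics.stdev(samples_ms) / mean <= max_cv


def is_regression(baseline_ms, candidate_ms, tolerance=0.05):
    """True if the candidate's median is more than `tolerance` slower than baseline."""
    base = statistics.median(baseline_ms)
    cand = statistics.median(candidate_ms)
    return cand > base * (1 + tolerance)


def gate(baseline_ms, candidate_ms, tolerance=0.05, override=False):
    """Pipeline decision: 'rollback' on a regression unless it is overridden."""
    if is_regression(baseline_ms, candidate_ms, tolerance) and not override:
        return "rollback"
    return "promote"
```

Medians are used here because a single slow run skews them less than it skews a mean. If `is_repeatable` fails for either sample set, it is usually better to rerun the test than to act on the result.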

It’s also important that you do not “surprise” your systems with a sudden “big bang” test against the production environment. A well-known retailer learned this lesson the hard way when a production outage on Black Friday hurt its share price and potentially cost it millions in revenue. Shane Evans, Senior Product Manager at HP, wrote an article in TechBeacon explaining what happened and how it could have been avoided with “a series of tests, starting with internal systems and using usage parameters pulled from the previous year’s event, and starting several weeks prior to a big bang test against the production environment.”

You can press on this way until runtime resources become too expensive, and/or until your developers begin to seriously grumble about frequent rollbacks.

Stage Four: Application Performance Management

You are nearly ready for cloud performance optimization for your apps. At this stage, you need to instrument your code. (This process is usually language- and framework-specific.) Alongside this, you need to dedicate a team to application performance management (APM). The team can be small at first. They should learn the tools of APM and look for consistent improvements over a six-month window. 
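Instrumentation is language- and framework-specific, and APM tools typically ship their own agents. Purely to show the idea, here is a hand-rolled Python sketch that records how long each call to a function takes. `TIMINGS`, `timed`, and `handle_request` are hypothetical names, not part of any APM product.

```python
import functools
import time
from collections import defaultdict

# Call durations in seconds, keyed by function name.
TIMINGS = defaultdict(list)


def timed(func):
    """Decorator that records the wall-clock duration of every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even when the call raises, so failures are timed too.
            TIMINGS[func.__name__].append(time.perf_counter() - start)
    return wrapper


@timed
def handle_request(n):
    """Stand-in for real application work."""
    return sum(range(n))
```

After a few calls, `TIMINGS["handle_request"]` holds one duration per call, which is the raw material for the targeted code improvements this stage is about.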

However, it’s important to note that using your APM tools is not true cloud performance optimization yet. That comes later. 

Your aim here should be targeted code improvements and suggestions for where improvements can be made when nothing else is working. Evangelise your early successes in order to engage other developers. 

Keep your APM contained, but continue until the codebase becomes too volatile to work with, i.e., until there are major architectural changes, a disruptive migration to microservices, and so on.

Stage Five: Cloud Optimization

Once you’ve passed through these prior four stages, you are ready for cloud optimization.

To prepare for the integration of a CO tool like Opsani, you should identify, at a high level, the key performance tuning dials that you want to turn. These could be resources: CPU and memory (reserve or limit), VM instance type, or I/O throughput. They could be middleware configuration settings: JVM GC type and parameters, worker threads, pool sizes, or write delays. They could be kernel parameters: page sizes, jumbo packet sizes, even scheduler tweaks. Or they could be an individual app’s parameters: thread pools, cache timeouts, and memory-vs-CPU tradeoffs.
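One way to write the dials down is as an explicit search space. The sketch below is only an illustration of that bookkeeping: the dial names and values are hypothetical, and the exhaustive enumeration is a stand-in, not a description of how Opsani or any other CO tool explores configurations.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Dial:
    name: str
    values: tuple


# Hypothetical dials drawn from the categories above.
DIALS = [
    Dial("cpu_cores", (1, 2, 4)),
    Dial("memory_gib", (2, 4, 8)),
    Dial("jvm_gc", ("G1", "Parallel")),
    Dial("worker_threads", (8, 16, 32)),
]


def candidate_configs(dials):
    """Yield every combination of dial settings as a dict."""
    names = [d.name for d in dials]
    for combo in product(*(d.values for d in dials)):
        yield dict(zip(names, combo))
```

Even these four small dials produce 54 combinations, which hints at why testing settings by hand stops being practical as the number of dials grows.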


Once you have an idea of your key performance dials, it’s time to choose where in the CI/CD pipeline you want to add your tuning. You have two primary options. You can tune in staging only, using a performance test suite, and propagate results to production only programmatically. Or you can tune in production (perhaps in a canary environment), which feels riskier and requires zero-downtime deployments.
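For the staging option, "propagate results programmatically" could look something like the following sketch: given measured results for several tested configurations, pick the cheapest one that still meets a latency target. The result fields (`p95_latency_ms`, `cost_per_hour`) and the target are assumptions for illustration.

```python
def pick_config(results, latency_slo_ms):
    """Cheapest tested configuration whose p95 latency meets the target.

    Returns None when nothing qualifies, so the caller can keep the
    current production settings.
    """
    eligible = [r for r in results if r["p95_latency_ms"] <= latency_slo_ms]
    if not eligible:
        return None
    # Ties on cost are broken in favor of the faster configuration.
    return min(eligible, key=lambda r: (r["cost_per_hour"], r["p95_latency_ms"]))
```

The same selection rule applies to a canary environment. The difference is that the measurements come from live traffic rather than a test suite.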

And once you have all of this figured out, you are ready for cloud optimization. That means reduced cost and improved performance for the rest of your app’s lifetime.

Many apps aren’t ready for full cloud performance optimization just yet. But if you mature them in the correct way, and follow these best practices, you can make sure that they are always evolving toward a state where they can easily be optimized. As you scale, this will deliver major benefits in the long run.

Contact us today for a free demo. Or read our “What is Cloud Optimization?” whitepaper here for more information.