At Opsani, we cut our users’ cloud costs, often by as much as 70%. Right now, to help companies deal with the economic disruptions of the COVID-19 pandemic, we’re offering this service free for three months.

Why is there so much cloud overspend? Because people tend to spin up a lot of services and keep them warm and running, just in case some sort of request comes in. Companies have thousands of services in play, but often they’re 90% idle. 

Real cloud optimization cuts out the need for this overprovisioning. How? As with other tricky concepts, an analogy can help. 

Differentiating Between Idle Time and Production Time

Here’s a useful way to think about how cloud optimization (and the consequent cost-cutting) takes place. Imagine you’re driving a car. When your car is idle – when it’s sitting in your garage, or sitting in traffic, or sitting at a red light – it’s not being productive. It’s not getting you anywhere. It’s not serving its intended purpose. All it’s doing is gathering dust and slowly depreciating in value – or, often, it’s consuming gasoline. Which means it’s costing you money. The car is only being productive when it’s fulfilling its purpose: when it’s transporting you from place to place.

Now, imagine you’re not just a driver, but an Uber driver. When you’re transporting people, you’re making money. But when you’re idle, or it’s just you in the car, you’re not making money. Here’s what would be ideal: for the overall costs associated with your car to drop whenever it is idle or not in productive use. As the operating costs come down, your profit per transaction goes up.

With physical assets like cars, this sort of dynamic right-sizing is very difficult. Ideally, when the car isn’t performing an Uber trip, the engine would shrink. The back seats would vanish, bringing the vehicle weight down. Gasoline would stop being consumed. The operating costs – the cost per hour of owning the car – would instantly become minimal. 

But to borrow the old coding lingo: here’s the difference between atoms and bits. With a car, ideas like these are fanciful and comical. But with cloud assets, an analogous reconfiguring is easy. Like an Uber vehicle, a VM works within transactions: discrete, observable, measurable units of activity that take place in a fixed amount of time. But a VM is made of bits, not atoms. This makes its transactions far more malleable. When it’s idle, we can shrink down that unit of production into a much smaller, more streamlined unit. While it is not performing any transaction, its cost per hour is slashed. 

This is exactly what the Opsani engine does. To begin with, it observes what a particular VM or workload or container is doing. It establishes what its transaction type is: database commits, API reads per second, or whatever else. When we have the transaction type and its baseline performance, we can calculate system cost. 

With this information, we can begin the optimization routine, using the flexibility of bits. We set a performance goal of desired transactions per hour. And then – just as an Uber driver would ideally start shrinking his car when he doesn’t have a passenger – the Opsani engine begins shaving off resources that are not needed to meet the performance goal. As we reduce resources, the cost comes down. And profit per transaction goes up.
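To make the shaving step concrete, here is a minimal sketch in Python. It is not the Opsani engine's actual algorithm: the Config class, the toy_throughput workload model, the linear per-core pricing, and all of the numbers are assumptions made up for illustration. It simply removes CPU in small steps for as long as the measured transactions per hour still meet the goal.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical resource configuration for one workload."""
    cpu_cores: float
    hourly_cost_per_core: float  # assumption: cost scales linearly with cores

    @property
    def hourly_cost(self) -> float:
        return self.cpu_cores * self.hourly_cost_per_core


def toy_throughput(cpu_cores: float) -> float:
    """Made-up workload model: 4,000 transactions/hour per core, capped at 10,000."""
    return min(10_000.0, 4_000.0 * cpu_cores)


def shave(config, goal_tph, measure=toy_throughput, step=0.5, min_cores=0.5):
    """Remove cores step by step while measured throughput still meets the goal."""
    current = config
    while current.cpu_cores - step >= min_cores:
        candidate = Config(current.cpu_cores - step, current.hourly_cost_per_core)
        if measure(candidate.cpu_cores) < goal_tph:
            break  # the next cut would miss the performance goal, so stop here
        current = candidate
    return current


start = Config(cpu_cores=8.0, hourly_cost_per_core=0.05)
tuned = shave(start, goal_tph=8_000)
print(f"cores: {start.cpu_cores} -> {tuned.cpu_cores}")
print(f"hourly cost: {start.hourly_cost:.2f} -> {tuned.hourly_cost:.2f}")
```

In a real system, the measure function would come from observing the live workload rather than from a formula, and more than one resource (memory, replicas, instance type) would be tuned at once.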

With every VM or workload or container, you constantly reconfigure it down to the smallest footprint possible. And it stays alive, so that when a request comes in, it scales up immediately and starts consuming more resources in order to satisfy your performance goals. You aren’t stuck with a big, bloated VM draining resources. Instead, you just have a small container that lies in wait. When the request comes in, it morphs into something bigger that can handle the workload.
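The idea of a small container that lies in wait and then grows can be sketched as a simple sizing rule. This is an illustrative assumption, not a description of how Opsani sizes workloads: the function name cores_needed, the per-core capacity, the idle footprint, and the ceiling are all hypothetical values.

```python
import math


def cores_needed(demand_tph, tph_per_core=4_000.0, idle_cores=0.25, max_cores=8.0):
    """Return a CPU allocation for the current demand (transactions per hour).

    With no demand, the workload sits at its small idle footprint. Otherwise it
    gets just enough capacity, rounded up to the nearest quarter core, within
    the allowed range.
    """
    if demand_tph <= 0:
        return idle_cores
    needed = math.ceil(demand_tph / tph_per_core * 4) / 4
    return min(max_cores, max(idle_cores, needed))


for demand in (0, 500, 9_000, 100_000):
    print(demand, cores_needed(demand))
```

The floor keeps the service alive and responsive while idle, and the ceiling caps spend when demand spikes.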

The equivalent here would be the Uber driver cruising in a smart car waiting for a customer, and then transforming into a Chevrolet Suburban when a large family needs a ride. If an Uber driver could do this, their profit per transaction would soar. Unfortunately, they are constrained by atoms.

But with the freedom of bits, the Opsani engine discovers the optimal level of operations, where we’re delivering the performance the customer requires at the lowest possible cost. Multiply the savings per transaction by the number of transactions per time period, and we can see the total savings for the operational state, or what we call the scaled state (the state that is required for operations). Take the difference between the idle time cost and the production time cost, and multiply that across transactions, and we can see the macro cost savings.
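The arithmetic can be written out as a short sketch. The function names, the prices, and the choice to count idle savings per idle hour are assumptions made for illustration; they are one plausible reading of the calculation described above, not Opsani's published formula.

```python
def scaled_state_savings(cost_per_txn_before, cost_per_txn_after, txns_per_period):
    """Savings while serving traffic: per-transaction saving times transaction count."""
    return (cost_per_txn_before - cost_per_txn_after) * txns_per_period


def idle_savings(production_hourly_cost, idle_hourly_cost, idle_hours):
    """Savings while idle: the gap between production and idle hourly cost, per idle hour."""
    return (production_hourly_cost - idle_hourly_cost) * idle_hours


# Hypothetical month: 720 hours, 90% of them idle, one million transactions.
active = scaled_state_savings(0.004, 0.001, 1_000_000)
idle = idle_savings(0.40, 0.10, idle_hours=648)
print(f"scaled-state savings: {active:.2f}")
print(f"idle-time savings:    {idle:.2f}")
print(f"total:                {active + idle:.2f}")
```

Swapping in your own transaction counts and hourly prices shows how much of the total comes from idle time versus production time.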

Unfortunately, Uber drivers cannot shrink and expand and shrink and expand the asset behind their transactions. (Yet. Elon Musk probably has ideas.) But in the cloud, with bits, not atoms, this sort of dynamism is easy. Once you understand how often things are in production time versus how often they’re idle, you can make informed decisions on how best to optimize your infrastructure. You can pay only what you need to, and still have a very responsive system.