How is Opsani Different?



We are the only solution on the market with the ability to autonomously tune applications at scale, whether for a single application or across an entire service delivery platform, simultaneously and continuously. Another differentiator is that we can auto-discover workloads as they come on board, automatically deploy SLOs to those applications, and start the automated tuning process. Tune in to learn more! Also, check out our last video What are Opsani’s On-boarding Requirements!


Cloud Cost Monitoring is Key To Effective Kubernetes Resource Management



If you are running Kubernetes clusters, then you need cloud cost monitoring in place to make the most of your Kubernetes resources, optimize your platform, and drive down expenses.

To truly appreciate the complexity of managing Kubernetes resources and costs, just look at the variables that impact cost: abandoned resources to recover, instance sizes to optimize, platform choice (EKS vs. GKE vs. others), and more.

Visibility with Opsani

Before you can initiate any optimization effort, your first priority is to gain better visibility into the prevailing usage and costs of resources. Opsani has the technology to automatically help you achieve visibility into your Kubernetes resource usage and spending.

With Opsani, you can get a dashboard that allows for real-time tracking and reporting. This makes it easy for your organization to closely follow the costs.

Opsani Cloud Cost Monitoring Dashboard Overview

Opsani’s cloud cost monitoring dashboard gives you a clear and comprehensive picture of your application performance and spending on Kubernetes resources. Opsani’s custom cloud cost monitoring dashboards are designed and built to be seamlessly compatible with both Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS) clusters.

Cluster-level metrics enable users to pinpoint high-level cost trends and follow spend across production versus development clusters. Metrics at the node level help you see and compare hardware costs, which is quite useful if you are running node pools using various instance types. Lastly, namespace metrics aid with the comparison and allocation of costs across disparate departments and/or applications.


The screenshot above shows metrics that DevOps and SRE (Site Reliability Engineering) teams deal with on a regular basis. The dashboard provides you with not just the visibility, but also the staging ground where you can find opportunities for cost optimization. Every piece of information delivers insights that you and your DevOps/infrastructure teams can use to delve deep into workloads, traffic patterns, resource constraints, and other factors that impact your cluster costs. Optimization options in this situation range from vertical pod autoscaling to moving a section of compute to preemptible spot instances.

All metrics and graphs in your dashboard are crucial to managing cluster resources and optimizing cloud costs. With our guidance and technology, you have everything you need to get cost dashboards configured and running.


Prior to any cloud cost monitoring and cost optimization endeavor, there are three things you need to take care of first. One, you will require a Kubernetes cluster. Two, you need to configure the kubectl command-line tool so it can communicate with your cluster. And three, you need to install Git.
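As a quick sanity check, the following standard CLI invocations confirm each of those prerequisites is in place (output will of course depend on your environment):

```shell
# Verify kubectl is installed and can reach your cluster
kubectl version
kubectl cluster-info

# Verify Git is installed
git --version
```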

Know Kubernetes First and Foremost

Opsani can help you automatically unleash the full potential of your Kubernetes platform while keeping your costs to a manageable level. But before you perform any cloud cost monitoring and optimization, it is essential that you have a deep and solid understanding of what Kubernetes is all about.

There are many optimization actions that you can perform to reduce Kubernetes costs. But if you don’t have sufficient Kubernetes knowledge, discovering the best way to optimize your Kubernetes clusters manually to bring down spend can be an extensive exercise.  

For example, knowing when to tune your infrastructure with the Cluster Autoscaler versus tuning the application with the Horizontal Pod Autoscaler (HPA) or the Vertical Pod Autoscaler (VPA) already presents a complex set of options. And, as there are others, the number of possible parameters to consider quickly grows to a point where improving efficiency and decreasing cost is quite possible, but knowing that your configuration is truly optimal is hard. The figure below shows various combinations of CPU and memory settings for a two-tier application tested over a five-hour period as Opsani’s AI homes in on an optimal configuration.

Opsani leverages ML algorithms to provide continuous optimization for Kubernetes. Where finding optimal configurations is challenging or impossible for a human, the Opsani AI handily finds and applies them to the environment. Further, Opsani continually refines its understanding of the optimum across time and through load variations.

Contact Opsani to learn more about how we can help you optimize your infrastructure and cut your costs with the power of AI and ML. You can also sign up for a free trial and experience how Opsani can take your business to new heights.

Artificial Intelligence and the Enterprise Stack



In case you were unable to attend, this month we are streaming the presentation Jocelyn Goldfein, Managing Director at Zetta Venture, gave at our recent MeetUp. She discussed how the pace of change has been dizzying for enterprises and how they now have to turn on a dime. How can they do this? By looking into AI, because it can help solve the problems of keeping up with this rapidly evolving industry. Tune into last month’s webinar How Opsani Delivers Value to Enterprise SaaS Platforms to learn more about Opsani!


CI/CD: What Does It Mean and When to Use It?



You may have heard the stories of companies that have abandoned the ‘release cycle’ and push code to production multiple times a day. This might be a new feature, a small update, or a bug fix. Companies like Google, Netflix, and Amazon push code to production hundreds if not thousands of times a day. While the cultural processes of DevOps and SRE (Site Reliability Engineering) play into why this is possible, Continuous Integration and Continuous Deployment (CI/CD) are what make it possible to rapidly and continuously improve code.

If you’ve spent any time in the world of software development, then you have certainly come across CI/CD. While the CI, continuous integration, aspect is fairly well understood by software developers as a part of the development process, CD is more mysterious. This may be because it crosses over into the realm of “operations.” The fact that CD may refer to continuous delivery or continuous deployment also adds some confusion. The image below provides an overview of a typical CI/CD workflow, and you can see where the Dev-centric CI process hands off to the Ops-centric CD process.

Continuous integration 

Focused on automating the building of software with testing built into the process, CI is primarily the developer side of the CI/CD equation. In the graphic above, the flow is to create some initial code (or modify the existing, stored code), run tests on the code, and if the tests pass, update the codebase. This can cycle through multiple times. The stored code – aka “artifact” – could be manually uploaded somewhere to run, but in a world where automation is increasingly the goal, such as in the world of DevOps and SRE, both CI and CD accelerate the deployment process. There are many tools that support setting up a CI pipeline. Jenkins, Travis, GitLab, CircleCI, and Azure Pipelines are examples.
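As a sketch of what such a pipeline definition can look like, here is a minimal GitHub Actions-style workflow; the repository layout and the pytest test command are assumptions for illustration:

```yaml
# .github/workflows/ci.yml - run the test suite on every push
name: ci
on: [push]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4        # fetch the code
      - uses: actions/setup-python@v5    # install a Python toolchain
        with:
          python-version: "3.12"
      - name: Run tests
        run: pytest                      # fail the build if any test fails
```

If the tests pass, the pipeline can publish an artifact for the CD stage to pick up; if they fail, the codebase is never updated.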

Continuous Delivery vs. Continuous Deployment

Although these two terms are different, their differences really come at the end of the process, so we’ll come back to that in a minute. The overall CD process is much like the CI process, but now the idea is that testing is going on in an environment that includes any other interactions that will be needed for the final presentation of the code. Especially in cloud systems with API-driven automation, the process looks much like the CI process. If all tests pass, the new code is assigned a version and…. Now we need to consider the difference between CD and … CD. Continuous delivery simply means that there is a manual step – a final check – that releases the code into the wild. Continuous deployment automates this last step with the assumption (recognition) that if the code passes the CD stage tests, it is good to go and release into the wild. Like CI, there are many solutions to help you get your CD on – Jenkins, GitLab, and Azure Pipelines will be familiar from the CI service examples. Netflix’s open source Spinnaker is also growing in popularity.
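The delivery-vs-deployment distinction often comes down to a single line in the pipeline config. In GitLab CI syntax, for example, a deploy job gated with when: manual is continuous delivery; removing that line makes it continuous deployment (the deploy script here is a hypothetical placeholder):

```yaml
deploy-production:
  stage: deploy
  script:
    - ./deploy.sh production   # hypothetical deploy script
  environment: production
  when: manual   # delete this line and the job runs automatically: continuous deployment
```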

When and Why Does CI/CD Work?

It is possible to build a pipeline like the one in the diagram and cause a critical failure to occur. The reason that so many companies are confident that CI/CD is the way to go, even in the face of such potential failure, is that those companies’ DevOps/SRE teams are doing things correctly.

  1. Push small code changes, frequently. This goes for CI and CD.  Version control (and branching dev processes) is certainly going to be part of how code is developed, but those changes should be small.  Many will recommend that whatever you are changing should be ready to push to the master branch at the end of the day. This makes sure that if that change does break something, it should be easy to find and fix.  It also means that what is being pushed (and released) is relevant and you don’t spend weeks on code that suddenly becomes irrelevant.
  2. Build tests. Let me say it again – build tests. Again, a CI and CD principle. Testing the code that is being pushed against automated builds is probably the one thing that gets overlooked in the rush to build something shiny. Not having tests comes with a barrel of regret when something deep in the code breaks far into the development process and it is time to go back and find it. Although strange things will happen, building tests to make sure that the new code behaves as advertised is a first principle of CI processes.
  3. Use both unit and integration tests. More popular on the CI side, but has some relevance to the CD side, especially if you follow the Infrastructure-as-Code model.  You are building tests for your code? They are passing? Good. Now you need to make sure to both build tests that make sure your fancy new feature does what it should, the unit test, and that it plays nice with the rest of the code base, integration test.  
  4. Fix the build before pushing new code. Applies to CI and CD processes. Hopefully this means that tests are failing rather than an actual critical system failure, but the idea here is that you don’t want to pile stuff that could or should work on a system that is broken. This could mean rolling back to a previous version in production, but the better principle is to version forward with a quick fix.  This is the point where you see the value of principles 1, 2, and 3 (you were following these, right?) as small changes should equal small, quick fixes.
  5. Automate all the things. Automation, with testing, will produce systems that are reliably reproducible and avoid the toil of repeating processes manually and potentially adding human error into the system each time. Although this does not imply a static system (see the next point) it does mean that when the code is pushed and passes its tests in the CI system, a functioning and tested CD system should push that code straight through to production with confidence.
  6. Continuously improve the process. Our code lives in a constantly changing environment, and as business, security, infrastructure, and other demands change, your system is likely to need to respond. Because CI/CD is a Dev & Ops process, there are always likely to be places where the process can be improved, simplified, and automated further. Having communication across concerns, a core DevOps principle, will certainly shine a light on where opportunities for improvement lie.


Hopefully all this sounds like the way that applications should be built and deployed. This is increasingly a consensus view. However, while the processes can be laid out and there are multiple tools to support CI and CD processes, it is important to remember that a mindset/cultural shift is required. Going from a waterfall model with a big release every six or 12 months to pushing code to production multiple times every day does not happen overnight. It is not something that really works if it is only partly adopted. Even within a single team, having a single engineer not following the necessary testing protocols can derail things. As when considering adopting any new technology, it is good to start small: build teams that get the process down, validate how things work for their specific situation, and share their knowledge. From there it can expand to cover the rest of the company. The CI/CD journey is not one that ends, but rather one that improves, continuously. Contact Opsani to learn more about how our technology and products can help.

AWS EC2 Instances: Optimizing Your Application Performance



Amazon Web Services provides an ever-increasing menu of instance types optimized to fit different use cases. AWS EC2 Instances provide different combinations of CPU, memory, storage, and networking capacity and allow you to match instance performance requirements to the application you plan to run on them. Many, but not all, instance types come in multiple sizes, again allowing you to match or scale the instance to your actual workload needs.

What are AWS EC2 Instance Families?

Let’s have a look at what the AWS “Instance Families” are all about.

General Purpose

These instances aim to provide a balance of CPU, memory, and networking resources that can serve a wide variety of typical applications. A typical web server would be an example of a general-purpose application.

Compute Optimized

These instances are designed for applications that are compute-intensive and would thus benefit from high-performance processors. Applications such as media transcoding, dedicated gaming servers, high-performance computing, and machine learning would all benefit from this type of instance.

Memory Optimized

These instances deliver fast performance for workloads that process large data sets in memory. A big data application that ingests and processes large amounts of real-time data is an example.

Accelerated Computing

These instances use hardware accelerators to perform functions, such as graphics processing or data pattern matching, more efficiently than is possible in software on CPUs. They target applications similar to those of Compute Optimized instances.

Storage Optimized

These instances work best for applications that require high rates of read-write access to large data sets in local storage. Certain big data applications would fall into this category.

Further Application Optimization With Instance Features

Within Instance Families, some Instance Types have additional features that further allow you to optimize how your application will perform.  Some instances can “burst” CPU performance to meet spikes in workload to prevent application performance impacts. Several storage options are available to balance performance, data durability, and cost. Elastic Block Storage optimization and cluster networking capabilities can further improve performance in applications that benefit from speedy access to data from storage or applications on other instances. Let’s take a closer look at some of these options.

Burstable Performance Instances

AWS has two general types of EC2 instances, Fixed Performance Instances and Burstable Performance Instances. Because many apps (e.g. small databases, web servers) don’t use a lot of CPU all the time, burstable instances, designated as T instances, provide high CPU performance above a baseline only when needed. The benefit is that you effectively pay for a smaller instance size but get the CPU performance of a larger instance when load spikes and it is needed. These burstable instances are only efficient, however, with variable workloads. If your application process needs high CPU performance all the time (think HPC apps, graphics processing, …), a Fixed Performance Instance will provide better ROI.

Storage Options

Amazon provides several storage options that can interface with EC2 instances. Amazon Elastic Block Storage (EBS) is a durable storage volume that attaches to a single, running Amazon EC2 instance. An EBS volume behaves like a physical hard drive and persists independently from the life of an EC2 instance. It can be detached from one EC2 instance and attached to another with the data persisting.

There are three EBS volume types that can be used to balance the cost and performance of your application: two SSD types (General Purpose and Provisioned IOPS) and Magnetic. General Purpose is the default choice for most uses. Provisioned IOPS provides consistent and low-latency performance, appropriate for use with very large databases. Magnetic volumes provide the lowest performance but also the lowest storage cost. If the stored data is accessed infrequently and cost is a greater concern than performance, this may be an appropriate storage choice.

Another type of block storage is called “instance storage.”  An instance store is on disks that are physically attached to the host computer. This provides temporary storage as, unlike the persistent EBS, instance store data is deleted when the associated EC2 instance is deleted.

Object storage is available to EC2 instances through Amazon S3, a highly available and highly durable storage option. Data is stored as a “blob” or object and unlike EBS stored data cannot be altered in part. Objects in S3 storage need to be retrieved to allow any modification. The modified file can then be returned to S3 storage as a new object. 

EBS-optimized Instances

Some EC2 instances are EBS-optimized to fully use the IOPS provisioned on an EBS volume for an additional hourly fee. Several instance types are EBS-optimized by default  (M6g, M5, M4, C6g, C5, C4, R6g, P3, P2, G3, and D2) and do not incur an additional cost. The EBS-optimized instances are designed for use with all EBS volume types. Amazon recommends using Provisioned IOPS volumes with these types of instances to achieve the best performance. 

Cluster Networking

A small number of EC2 instances have the ability to support cluster networking when deployed in a cluster placement group, which provides low-latency networking between all instances in the cluster. Depending on the specific capabilities of the instances, instances in the same placement group can utilize up to 5 Gbps for single-flow and up to 100 Gbps for multi-flow traffic in each direction.

Measuring Instance Performance

Why measure instance performance?

With so many EC2 instance choices, it is possible to narrow down the ‘right’ instance or set of instance types for an application, but without measuring the actual performance of the application, it is not possible to be sure that you have chosen correctly. 

Amazon recommends measuring your application performance, even running applications on different instance types in parallel, to make sure that you are using the right instance type. Testing the application’s ability to scale under load is also important to be sure that performance remains consistent as workloads vary in production environments.

Optimizing Amazon EC2 Performance 

While you could start out deploying an application on a General Purpose instance and haphazardly pick an instance size, it is unlikely that you will have chosen the best option for optimal application performance. The point of Amazon’s specific instance type families, instance size options, and added instance features is to make it easier to match your application’s performance needs to the particular capabilities provided by a specific instance type.  Matching your application to an appropriate EC2 instance type and size is a good first step in optimizing app performance.

To go to the next level, you will want, nay need, to monitor your application’s actual performance.  Even if you choose what seems to be the ‘right’ instance, it is possible that limitations in the software, infrastructure, or overall system architecture cause performance issues.  When performance issues are identified, it is then possible to reconsider instance choice and go through the testing process again to validate the new choice. There are many load testing and application profiling tools that can be used to gain the necessary insights.

While this iterative process of testing and adjusting instance type to optimize application performance is critical, it is also slow and laborious. Further, given possible variations in application performance in relation to production loads and the large number of possible EC2 instance configurations available, achieving truly optimal performance becomes a daunting and potentially never-ending process. Opsani ML/AI intelligently automates the optimization process and eliminates the toil involved. In addition to being able to evaluate all available instance options against performance metrics, Opsani can integrate to push the recommended configuration back to your system and validate performance improvements from the changes. Take continuous optimization for a spin with a free trial.

What Are Opsani's On-boarding Requirements and When Will You See Results?



In order for Opsani’s machine learning backend to work, we need to read real-time metrics. These metrics describe the desired performance goal and enable us to make runtime changes to configurations and parameters. The Opsani servo is easily deployed into a customer’s environment, and it’s also really safe. Opsani works across any kind of application structure. With our Kubernetes offering, we integrate easily into our customers’ environments, observe the performance metrics, and make the configuration changes. Basically, we need to see real-time metrics from the application and have the ability to configure workloads. With our machine learning, we are able to get our customers results within a day or less. More sophisticated application tuning takes around a week, although through our machine learning, that time is decreasing. Check out our last video Black Magic: What Does It Mean and Why Is It So Important? for more Opsani knowledge!


Kubernetes: Nodes, Taints, and Tolerations Best Practices



What are Taints and Tolerations?

In Kubernetes, node taints and tolerations function in a manner similar to node affinity rules, though they take the almost opposite approach. Affinity rules are set for Pods to attract them to specific nodes. A tainted node repels Pods that do not have tolerations set for that node. Together, taints and tolerations make sure that Pods are not scheduled onto inappropriate nodes. A taint produces one of three possible effects: NoSchedule (Kubernetes will only schedule Pods that tolerate the node’s taint), PreferNoSchedule (Kubernetes will avoid scheduling non-tolerant Pods on the node, but may still do so), or NoExecute (Kubernetes will evict any non-tolerant Pods already running on a tainted node).

Why Use Taints and Tolerations?

Kubernetes taints and tolerations allow you to create special nodes that are reserved for specific uses or only run specific processes (Pods) that match the node. You may wish to keep workloads off of your Kubernetes management nodes; tainting those nodes so that no workload Pod has matching tolerations keeps workloads from being scheduled onto them. You may have nodes with specialized hardware for specific jobs (e.g. GPUs), and tainting such nodes reserves them so that the Pods that specifically need that resource type can be scheduled to those nodes when needed.

Node Taints and Pod Tolerations

Applying Taints and Tolerations

Taints are applied to a node using kubectl, for example:

kubectl taint nodes machineLearningNode1 workload=computer-vision:NoSchedule

You can then verify that this taint has been applied with the kubectl describe nodes machineLearningNode1 command; any applied taints (there can be more than one) will be listed in the Taints: section. In this example, any Pods already running on the tainted node would keep running, but no further Pods would be scheduled unless they have the following tolerations fields in their Podspec:
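A minimal sketch of such a tolerations block (the key name workload is illustrative, paired with the computer-vision value used in the taint):

```yaml
tolerations:
- key: "workload"
  operator: "Equal"
  value: "computer-vision"
  effect: "NoSchedule"
```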

Since this toleration matches the tainted node, any Pod with that spec could be deployed on the node machineLearningNode1. If you later wish to remove the taint on this node, the command kubectl taint nodes machineLearningNode1 workload=computer-vision:NoSchedule- (note the trailing hyphen) will remove it.

Using Multiple Taints

It is possible to apply more than one taint to a single node and more than one toleration to a single Pod. Kubernetes processes multiple taints and tolerations like a filter: taints matching a Pod’s tolerations are ignored, and the remaining taints have the following effects on the Pod:

  • Kubernetes will not schedule the Pod if at least one non-tolerated taint has a NoSchedule effect.
  • Kubernetes will try not to schedule the Pod on the node if at least one non-tolerated taint has a PreferNoSchedule effect.
  • A NoExecute taint will cause Kubernetes to evict the Pod if it is already running on the node, and will not schedule the Pod onto the node otherwise.

As an example, if you have a Node to which you’ve applied the following taints:
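For instance (the key and value names here are illustrative):

```shell
kubectl taint nodes node1 key1=value1:NoSchedule
kubectl taint nodes node1 key1=value1:NoExecute
kubectl taint nodes node1 key2=value2:NoSchedule
```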

And you have a Pod with the following tolerations:
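A Podspec fragment along these lines, with key names again illustrative:

```yaml
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
```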

The Pod would not be scheduled to the node because it tolerates the first two taints but is affected by the last, non-tolerated NoSchedule taint. However, if the Pod was already running on the node when that last NoSchedule taint was added, it would continue running on the node. In the case of the running Pod, if we then also added a NoExecute taint:
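For example (continuing with illustrative key names):

```shell
kubectl taint nodes node1 key2=value2:NoExecute
```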

The running Pod would now be evicted from the Node.

You may have noticed another modifier in the toleration Podspec example, namely the operator value, which allows further modification of how Kubernetes evaluates a toleration against a taint. The default operator is Equal: if the value in the taint and the toleration are indeed equal, the taint is tolerated. If the operator is Exists, no value should be specified in the toleration. Two behaviors to be aware of are that an empty key with operator Exists will match all keys, values, and effects and thus tolerate everything, and an empty effect matches all effects with the given key.
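As a sketch, these two behaviors look like the following (key name illustrative; the two YAML documents are separate alternatives):

```yaml
# Tolerate any taint with key "key1", regardless of value or effect
tolerations:
- key: "key1"
  operator: "Exists"
---
# An empty key with operator Exists tolerates everything
tolerations:
- operator: "Exists"
```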

AWS S3 Cloud Cost Tips and Best Practices



AWS S3 cloud costs can get tricky, as you need to understand the different types of storage and the operations that impact cloud storage costs. Understanding what drives these costs gives you insight into the charges beyond simply storing digital objects and helps enhance cloud efficiency.

What are AWS S3 Storage Classes? 

Enterprises need large amounts of data to function, and that data falls on a continuum of how often it is accessed. At one end, data is continually being accessed, altered, or deleted. At the other end, there is compliance, regulatory, or other archival data that will not be touched for months or years.

Amazon S3 provides six storage classes, each with its own availability, performance, and durability characteristics.

  • S3 Standard
  • S3 Intelligent-Tiering
  • S3 Standard Infrequent Access (IA)
  • S3 One Zone – Infrequent Access (S3 One Zone-IA)
  • S3 Glacier
  • S3 Glacier Deep Archive

S3 Standard

S3 Standard is usually the go-to option because it is designed for data that users access often. Its low latency and high throughput make it a flexible fit for a wide variety of workloads.

S3 Intelligent Tiering

Intelligent Tiering uses monitoring and automation to optimize the movement of data between a frequent-access (FA) and an infrequent-access (IA) tier. S3 Intelligent-Tiering monitors your access patterns so you do not pay frequent-access rates for data that is not being used, and costs stay predictable if your data access patterns change. You pay a small monthly monitoring and auto-tiering fee, with no data retrieval fees.

S3 Standard-Infrequent Access (IA)

S3 Standard-IA is great for data that is accessed less frequently than data in S3 Standard but still needs fast access when required. It is a good fit for long-term storage of backups and for disaster recovery data. It is cheaper than S3 Standard, but adds data retrieval expenses.

S3 One Zone-Infrequent Access (S3 One Zone-IA)

S3 One Zone-Infrequent Access (S3 One Zone-IA) stores data in a single AWS Availability Zone (AZ). It differs from the other S3 classes in that it is not resilient to the physical loss of an AZ, which can be caused by a hurricane, earthquake, etc. It is a good fit if your data does not require that extra protection, and this S3 class is 20% cheaper than S3 Standard-IA.

S3 Glacier

S3 Glacier is for long-term, rarely accessed data – typically end-of-lifecycle data that cannot be deleted due to compliance and regulatory requirements. You can retrieve your data at different speeds, with faster retrievals costing more.

S3 Glacier Deep Archive

Glacier Deep Archive is the cheapest S3 class; its purpose is long-term retention and digital preservation of data that is not regularly accessed. It is most popular in highly regulated industries that have to preserve data sets for regulatory compliance.

Amazon S3 Storage class characteristics

AWS S3 Basic Costs

Amazon S3 Data Storage Costs

S3 data storage costs differ depending on the region in which the data is located. The table below shows the costs of the S3 classes in the US West (Northern California) region:

The key to managing storage costs is matching data to the correct storage class based on how the data is used. Start by examining your data and determining how frequently it is accessed to decide whether you need S3 Standard. If you don’t, choose between S3 Standard-IA, One Zone-IA, and Glacier. If your data can easily be regenerated or is duplicated elsewhere, One Zone-IA could be the best choice. Glacier is a great option if you are looking for long-term storage. The best way to decide is to look at how you use your data currently. If you need help managing your data’s placement between Standard and IA storage, S3 Intelligent-Tiering could be for you.
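One common way to automate this matching is an S3 lifecycle rule that transitions objects to cheaper classes as they age. A sketch in CloudFormation YAML (the bucket name, rule ID, and day thresholds are illustrative assumptions):

```yaml
Resources:
  ArchiveBucket:
    Type: AWS::S3::Bucket
    Properties:
      LifecycleConfiguration:
        Rules:
          - Id: TierDownWithAge
            Status: Enabled
            Transitions:
              - TransitionInDays: 30    # move to Standard-IA after a month
                StorageClass: STANDARD_IA
              - TransitionInDays: 365   # archive to Glacier after a year
                StorageClass: GLACIER
```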

Amazon S3 Data Relocation Expenses

AWS data transfer expenses are determined by how much data is moved from S3 to the internet (Data Transfer Out), plus data moved between AWS regions (Inter-Region Data Transfer Out). Moving data between S3 services within the same region is free. In the US West (Northern California) region, internet data transfer costs look like this:

Transferring data to a different region also incurs charges, but they are much lower than transfers out to the internet. Transferring data to CloudFront is free.

Amazon Transfer Acceleration 

Amazon S3 Transfer Acceleration speeds up file transfers over long distances between your users and your Amazon S3 bucket. It is simple to use and requires no customization: AWS's intelligent routing provides the faster data transfers. Transfer Acceleration also includes a safeguard: if AWS cannot move your data faster than a regular transfer, you don't pay the premium. You can preview the effect of Transfer Acceleration on AWS's Speed Comparison page.

Transfer Acceleration pricing is added on top of data transfer fees and is not specific to a region.

Transfer Acceleration operates through Edge Locations, which are part of CloudFront. Your cost depends on the Edge Location used.

AWS Snowball 

AWS Snowball is a petabyte-scale data transport service that uses physical devices designed to move massive quantities of data into and out of the cloud. To use it, create a job in the AWS Management Console, and a Snowball device will be shipped to you. Each Snowball job includes 10 days of onsite use at no charge. Transferring data into AWS is free, but the price of transferring data out depends on the region. You also need to account for the per-job and shipping fees.

AWS S3 Query in Place Pricing

Amazon S3 enables users to analyze and process massive quantities of data in place in the cloud. Query in Place removes the need to transfer data out, analyze and process it, and then move it back into S3. To do this you can use S3 Select, Amazon Athena, or Amazon Redshift Spectrum.

Amazon S3 Select Request Expenses

Expenses for S3 Select are based on requests. Request types are PUT, COPY, POST, LIST, GET, and SELECT. Expenses differ depending on which region you are in.

Amazon Redshift Spectrum S3 Expenses

Redshift Spectrum is designed to run SQL queries against exabytes of data in S3. As with Amazon Athena, your spending is based on the quantity of data scanned, at $5 per TB. Also as with Athena, you can manage your costs by compressing your data and converting it to columnar formats.
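At $5 per TB scanned, the savings from compression and columnar formats are easy to quantify. A minimal sketch (the 10x reduction factor is an illustrative assumption, not a guarantee):

```python
def scan_cost_usd(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Cost of a Spectrum/Athena-style query billed per TB scanned."""
    tb = bytes_scanned / (1024 ** 4)
    return tb * usd_per_tb

# Scanning 512 GB of uncompressed row-oriented data:
full = scan_cost_usd(512 * 1024 ** 3)            # $2.50
# Columnar formats plus compression often cut bytes scanned
# dramatically; assume a 10x reduction for illustration:
compressed = scan_cost_usd(512 * 1024 ** 3 // 10)
print(round(full, 2), round(compressed, 2))      # 2.5 0.25
```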

Since Redshift Spectrum and Athena are comparable and share similar functions, the deciding factor is usually whether you would also use Redshift as a data warehouse.

AWS S3 Storage Management 

AWS gives users multiple services for managing storage: AWS Object Lifecycle Management, inventory, analytics, and object tagging.

Take Advantage of Lifecycle Management 

This tool lets users schedule data deletion or migration between S3 storage classes to reduce storage expenses over the long run.

Object Lifecycle actions are divided into Transition Actions and Expiration Actions. Expiration Actions are free, but Transition Actions incur request charges.
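As a sketch, a lifecycle rule combining both action types might look like the following. The dict matches the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the prefix and day counts are hypothetical:

```python
# Lifecycle configuration in the shape accepted by boto3's
# s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...).
# The "logs/" prefix and the day counts are illustrative assumptions.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Transition Actions (each transition is billed per request):
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Expiration Action (free):
            "Expiration": {"Days": 365},
        }
    ]
}

rule = lifecycle_configuration["Rules"][0]
print(rule["Status"], len(rule["Transitions"]))  # Enabled 2
```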

How to Handle Glacier Requests with Lifecycle Management

Lifecycle Management can be used to tell Amazon S3 to move objects to a different S3 class. If you transition objects to Glacier, additional storage is consumed by each object's metadata, and you need to keep in mind the fees for data retrieval and the retrieval speed you choose.


S3 Inventory

Amazon S3 Inventory delivers files listing your objects and their metadata. The cost is based on the number of objects listed, at $0.0025 per million objects.

S3 Analytics Storage Class Examination

This service lets you monitor how frequently objects in an S3 storage class are accessed so you can move less frequently accessed data to a storage class that costs less.

S3 Analytics costs $0.10 per million objects provisioned per month.

S3 Object Tagging

Object tagging lets you provide appropriate access to S3 objects. S3 object tags are applied to individual objects and can be created, updated, or removed at any point in an object's lifetime.

S3 Object Tagging costs are based on the number of tags at $0.01 per 10,000 tags per month.
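The inventory and tagging rates quoted above are easy to roll up into a monthly estimate. A minimal sketch using those published rates (the object and tag counts are hypothetical):

```python
def monthly_management_cost(objects_listed: int, tags: int) -> float:
    """Approximate monthly S3 Inventory + Object Tagging cost,
    using the rates quoted above."""
    inventory = (objects_listed / 1_000_000) * 0.0025  # $0.0025 per M objects
    tagging = (tags / 10_000) * 0.01                   # $0.01 per 10k tags
    return inventory + tagging

# 4M objects listed by inventory and 100k tags in a month:
print(round(monthly_management_cost(4_000_000, 100_000), 2))  # 0.11
```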

AWS Services paired with S3

If you need more options for transferring data, Amazon has a few other services. These services have their own costs, which are intertwined with other Amazon services.

AWS Direct Connect 

AWS Direct Connect provides a dedicated, high-speed network connection to ports in AWS data centers. Direct Connect can be used with EC2, DynamoDB, VPC, and S3. The expense for Direct Connect is a cost per port-hour plus the amount of data moved.

Amazon Kinesis Data Firehose

Amazon Kinesis Data Firehose delivers streaming data into data stores. The cost is based on the amount of data ingested, determined by the number of data records you send to the service multiplied by the size of each record rounded up to the nearest 5KB.
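The 5KB round-up means small records can be billed for more bytes than they contain. A sketch that computes billable volume only, since the per-GB rate varies by region:

```python
import math

def firehose_ingest_gb(records: int, record_bytes: int) -> float:
    """Billable GB for Kinesis Data Firehose ingestion: each record's
    size is rounded up to the nearest 5 KB before summing."""
    five_kb = 5 * 1024
    billed_per_record = math.ceil(record_bytes / five_kb) * five_kb
    return records * billed_per_record / (1024 ** 3)

# One million 7 KB records are each billed as 10 KB:
gb = firehose_ingest_gb(1_000_000, 7 * 1024)
print(round(gb, 2))  # 9.54
```

Multiply the result by your region's per-GB rate to estimate the monthly charge.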

Managing your AWS S3 spend

Amazon S3 is a complex and intricate system. Understanding the use cases of the different storage types allows you to map your storage size and access needs to the correct S3 offering. You need to stay in control of your services to avoid incurring unnecessary expenses. Going through the process of cloud cost optimization is essential to determine the best cloud storage management plan for you.

Kubernetes Node Affinities Best Practices


Kubernetes Node Affinities Best Practices

One of Kubernetes’ primary functions is deciding the “best” place in a cluster to deploy a new pod. While the default settings are going to be functional, they may very well not be optimal in regard to how your application performs or how efficiently your infrastructure is utilized. If the scheduler is faced with application configurations that drive poor resource use, best may, in fact, not be all that good. One of the great things about Kubernetes is the many ways you can optimize your system’s performance. How the scheduler decides to place pods can be tuned using node taints and tolerations along with node or pod affinities. In this article, we’ll take a closer look at affinities.

Node Affinity

Depending on how your cluster is configured, not all nodes will have similar resource (hardware) capabilities to start with. Also, in a running cluster, nodes will have different resources available as pods are deployed to them or deleted. In most cases, the Kubernetes scheduler does a great job of making sure the best node is selected by checking the node’s available capacity for CPU and RAM and comparing it to the Pod’s resource requirements if these have been defined through limit and request settings. 

However, if you have an application that runs CPU- or memory-intensive operations, think Big Data, Machine Learning, or graphics processing apps, and you have included nodes that will best support those applications, you’d probably like Kubernetes to schedule the pods running those apps on the appropriate nodes. Setting a node affinity in your Podspec lets you specify this. For example, you might have instances with high frequency and CPU core count labeled as cpuType=xenone5. To ensure that Kubernetes schedules your CPU-intensive pods only on those nodes, you would use the nodeSelector with the xenone5 label in the spec section:



nodeSelector:
  cpuType: xenone5


An even more fine-grained way to control pod-to-node placement is the use of node affinity/anti-affinity in the affinity field in the spec section. There are two primary options here:

  • requiredDuringSchedulingIgnoredDuringExecution
  • preferredDuringSchedulingIgnoredDuringExecution

The difference between these two instructions is that the required… specification is absolute and the preferred… specification is not. In the first case, the scheduler decides it either can or cannot deploy a pod; in the other, the scheduler will attempt deployment to the specified node type if available, or it will schedule deployment on the next available node that meets resource request and limit settings. These affinity rules can be further modified with a selection of operators (e.g. In, NotIn, Exists, DoesNotExist).
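As a sketch, a preferred node affinity targeting the cpuType=xenone5 label might look like this, written as the Python-dict equivalent of a Podspec's affinity field (in a real manifest this would typically be YAML):

```python
# Python-dict equivalent of a Podspec "affinity" field that prefers,
# but does not require, nodes labeled cpuType=xenone5.
affinity = {
    "nodeAffinity": {
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {
                # Relative weight when multiple preferences exist.
                "weight": 1,
                "preference": {
                    "matchExpressions": [
                        {"key": "cpuType",
                         "operator": "In",
                         "values": ["xenone5"]}
                    ]
                },
            }
        ]
    }
}

term = affinity["nodeAffinity"][
    "preferredDuringSchedulingIgnoredDuringExecution"][0]
print(term["preference"]["matchExpressions"][0]["key"])  # cpuType
```

Swapping `preferred…` for `required…` (and `preference` for `nodeSelectorTerms`) turns this into the absolute form.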

Pod Affinity

Pod affinity functions very much like node affinity, with the same required…/preferred… options. In this case, the intent is to place pods near each other or keep them separated. Affinity might be useful to keep two microservices that communicate frequently near each other to reduce latency (e.g. an ordering system server and the order and customer backend databases). Anti-affinity might be a way to keep resources, such as the replicas of a distributed database, on different nodes to improve reliability or assure availability.
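The database-replica example can be sketched as a pod anti-affinity rule, again as the Python-dict equivalent of the YAML spec. The `app=db` label is a hypothetical name for this illustration:

```python
# Python-dict equivalent of a podAntiAffinity rule that keeps replicas
# of a (hypothetical) app labeled app=db on different nodes.
pod_anti_affinity = {
    "podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                # Match other pods carrying the app=db label.
                "labelSelector": {
                    "matchExpressions": [
                        {"key": "app", "operator": "In", "values": ["db"]}
                    ]
                },
                # Spread matching pods across distinct nodes.
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
}

rule = pod_anti_affinity["podAntiAffinity"][
    "requiredDuringSchedulingIgnoredDuringExecution"][0]
print(rule["topologyKey"])  # kubernetes.io/hostname
```

Using a zone-level topology key instead would spread the replicas across availability zones rather than nodes.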

What is Prometheus and Why Should You Use It?


What is Prometheus and Why Should You Use It?

Prometheus is one of many open-source projects managed by the Cloud Native Computing Foundation (CNCF). It is monitoring software that integrates with a wide range of systems natively or through the use of plugins. It is considered the default monitoring solution for the popular Kubernetes container orchestration engine, another CNCF hosted project.

Prometheus can collect metrics about your application and infrastructure. Metrics are small, concise descriptions of an event: date, time, and a descriptive value. While Prometheus does store or ‘log’ metrics, metrics should not be confused with logs, which can include reams of data. Rather than gathering a great deal of data about one thing, Prometheus uses the approach of gathering a little bit of data about many things to help you understand the state and trajectory of your system. It has become very popular in the industry because it has many powerful features for monitoring metrics and providing alerts that can, with orchestration systems (e.g. Kubernetes), automate responses to changing conditions.

Prometheus Architecture

It is useful to have an understanding of the pieces that make up Prometheus prior to talking about how it works. 

Prometheus is a multi-component system. While the following components integrate into a Prometheus deployment, there is flexibility in which of them are actually implemented.

  • Prometheus server (scrapes and stores metrics as time series data)
  • client libraries for instrumenting application code
  • push gateway (supports metrics collection from short-lived jobs)
  • special-purpose exporters (support tools like HAProxy, StatsD, Graphite, etc.)
  • alertmanager (sends alerts based on triggers)
  • additional support tools

Prometheus can scrape metrics from jobs directly or, for short-lived jobs, via a push gateway when the job exits. The scraped samples are stored locally, and rules are applied to the data to aggregate it, generate new time series from existing data, or raise alerts based on user-defined triggers. Prometheus comes with a functional web dashboard, and other API consumers can be used to visualize the collected data, with Grafana being the de facto default.

How Prometheus works 

Prometheus gets metrics from an exposed HTTP endpoint. A number of client libraries are available to provide this application integration when building software. With an available endpoint, Prometheus can scrape numerical data and store it as a time series in a local time series database. It can also integrate with remote storage options.

In addition to the stored time series, queries can produce temporary, derived time series from the source data. Series are identified by metric name and key-value pairs known as labels. Queries are written in PromQL (Prometheus Query Language), which enables users to select and aggregate time series data in real time. PromQL is also used to establish alert conditions that can then transmit notifications to outside services such as PagerDuty, Slack, or email. The data can be displayed in graph or tabular form in Prometheus’s web UI. Alternatively, and commonly, API integrations with alternative display solutions such as Grafana may be used.
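Those same PromQL queries can be issued programmatically against Prometheus's HTTP API, which is what tools like Grafana do under the hood. A minimal sketch of building an instant-query request; the server address and metric name are hypothetical:

```python
from urllib.parse import urlencode

# Instant query against Prometheus's HTTP API endpoint /api/v1/query.
# The host and the http_requests_total metric are illustrative assumptions.
base = "http://prometheus.example:9090/api/v1/query"
promql = 'rate(http_requests_total{job="api"}[5m])'

# URL-encode the PromQL expression as the "query" parameter.
url = base + "?" + urlencode({"query": promql})
print(url)
```

Issuing a GET against this URL returns a JSON body whose `data` field holds the resulting series, which a dashboard can then render.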

When is Prometheus the Correct Monitoring Solution?

Prometheus’ primary focus is on reliability rather than accuracy. For this reason, it is ideal in highly dynamic systems such as microservices running in a cloud environment. It is probably not a good fit for a system that requires high accuracy, such as a billing application. In this case the specific billing function should be addressed with an alternative, but Prometheus may still be the right tool for monitoring the other application and infrastructure functions.

The focus on reliability is built in by making each Prometheus server standalone with local time series database storage to avoid reliance on any remote service. This design makes Prometheus an ideal tool to rely on for rapidly identifying issues and getting real-time feedback on system performance. 

Opsani provides seamless integration with tools like Prometheus and Kubernetes to inform and automate its AI-driven performance optimization. The end results include benefits such as increased productivity, more stable applications, and more agile development processes. Contact Opsani to learn more about how our technology and products can put your metrics to work for you, automatically.