We would like to introduce you to Peter Nickolov, our CTO and co-founder, who has developed a new open-source tool that identifies Kubernetes applications deployed with inefficient configurations or reliability risks. When moving applications to production, following best practices is critical to operational success. We released this tool as open source to expose the analysis modeling and to allow the community to contribute further insights.

You can find Ignite at https://github.com/opsani/opsani-ignite, where you can grab a ready-to-run binary for macOS, Windows, or Linux. The repository also contains the full source code, in case you want to contribute to the project or build it yourself.

To gain a deeper understanding of how the Ignite tool works, its operation can be broken into three phases: discovery, analysis, and recommendation.

Phase 1: Discovery

On startup, Ignite discovers all applications running on the Kubernetes cluster. By tying into your Prometheus monitoring system, Ignite looks up all non-system namespaces and the deployment workloads running in them to obtain their key settings and metrics. By default, Ignite looks at the last 7 days of metrics for each application to capture most daily and weekly load and performance variations.

Phase 2: Analysis 

Ignite analyzes each application, examining the technical makeup of its components to uncover specific omissions of best practices for reliable production deployments. It looks at important characteristics such as pod quality of service (QoS), replica count, resource allocation, usage, limits, and processed load, and identifies areas requiring attention that either are causing or could cause performance and reliability issues.

In addition, Ignite determines whether the application is overprovisioned, resulting in higher-than-necessary cloud spend. In such cases, it also estimates the likely savings that optimization could achieve.
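The characteristics above are read from each workload's Kubernetes configuration and metrics. As a point of reference, here is a minimal, hypothetical Deployment fragment showing where these settings live (the names and values are illustrative, not taken from any real application):

```yaml
# Hypothetical Deployment fragment showing the settings Ignite inspects:
# replica count and per-container resource requests/limits, which together
# determine the pod's quality of service (QoS) class.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app            # illustrative name
spec:
  replicas: 2                  # replica count: a single replica is a reliability risk
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: app
        image: example/app:1.0 # illustrative image
        resources:
          requests:            # what the scheduler reserves for the container
            cpu: 250m
            memory: 256Mi
          limits:              # hard cap; requests below limits => "Burstable" QoS
            cpu: 500m
            memory: 512Mi
```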

Phase 3: Recommendation 

Ignite produces a set of actionable recommendations for improving the efficiency, performance, and reliability of the application. The recommendations fall into several categories, including production best practices (for example, setting resource requests and limits) as well as recommendations for optimal and resilient operation. Applying these recommendations improves performance and efficiency and increases an application's resilience under load.

Report Format Flexibility

Ignite displays a list of applications, their analysis, and recommendations in a convenient, text-based interactive interface. This allows examination of each application, the identified risks, and recommended actions, alongside additional useful characteristics.

Ignite can display its report in a simple table format or as detailed text. It can also produce machine-readable output for incorporation into an analysis pipeline, a capability we will cover in our next blog post on Opsani Ignite.

Example Output

To illustrate the operation of Opsani Ignite, we ran it against a Kubernetes cluster running the Bank of Anthos microservice application. This application is a sample HTTP-based web app that simulates a bank’s payment processing network and is a fairly typical Kubernetes workload. Opsani has extended the application to add HPA-based autoscaling and to run it under a heavier variable load profile that is more consistent with what applications experience in production environments.
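For context, HPA-based autoscaling of the kind mentioned above is configured through a HorizontalPodAutoscaler object. The sketch below shows what such a configuration typically looks like; the target name and thresholds are illustrative assumptions, not Opsani's actual Bank of Anthos manifests:

```yaml
# Illustrative HPA configuration (autoscaling/v2 API). Scales the target
# Deployment between 2 and 10 replicas based on average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend               # illustrative; the real manifests may differ
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60 # scale out when average CPU exceeds 60% of requests
```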

Figure 1 below shows the list of applications collected from the Kubernetes cluster.


Alongside each application's namespace and deployment name, Ignite shows a summary of the analysis, including the application's resource efficiency and the performance/reliability risk it represents. Pressing Enter on any row opens a popup box with more details and recommendations for improving the application.

Figure 2 below shows the details of the frontend microservice.


The Bank of Anthos’s frontend microservice shows very low resource utilization and can be optimized to rightsize its resources, potentially resulting in over 2x cost reduction. 

Ignite highlights that the application is configured in the “burstable” quality of service class, which has an increased risk of performance degradation under load and a higher probability of the application’s pods being evicted. For mission-critical applications, Kubernetes provides a “guaranteed” quality of service class that ensures that the requested application resources are available to it when required.
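Concretely, Kubernetes assigns the "guaranteed" QoS class only when every container in the pod sets resource requests equal to its limits for both CPU and memory; requests that are set but lower than the limits yield the "burstable" class. A minimal container spec sketch (the values are placeholders):

```yaml
# A pod qualifies for the "Guaranteed" QoS class only when every container
# sets requests and limits, and they are equal, for both CPU and memory.
containers:
- name: app                    # illustrative container name
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 500m                # equal to the request => Guaranteed QoS
      memory: 512Mi
```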

Figure 3 below shows an application with missing resource definitions.

This application, part of the Prometheus tools suite, is missing essential resource definitions resulting in it being placed in the “best effort” quality of service class. This class is unsuitable for most production applications. Even if the application appears to be working OK at a particular moment, it still has a very high risk of failing when additional resources are consumed by neighboring applications on the cluster.

Ignite identifies this increased risk and recommends setting resource requests and limits to appropriate values to achieve efficient resource usage.
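Such a fix amounts to adding a resources block to the affected container. A sketch with placeholder values (the right numbers depend on the application's observed usage, which is exactly what Ignite's analysis helps determine):

```yaml
# Adding any resource request moves the pod out of the "BestEffort" QoS class.
containers:
- name: exporter               # illustrative; substitute the affected container
  resources:
    requests:
      cpu: 100m                # placeholder values: tune to observed usage
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
```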

Optimization Recommendations

Opsani Ignite provides analysis and a number of additional recommendations to improve performance, reliability, and efficiency.

Some of these best practices require setting resource requirements so that they meet the performance and reliability requirements of an application (e.g., latency and error-rate service level objectives) while using the assigned resources efficiently to control cloud costs. These values can be discovered by hand, through an often onerous and repetitive tuning process, or identified automatically using optimization services such as the Opsani optimization-as-a-service tool. Those interested in how continuous optimization can remediate these issues can go to the Opsani website, set up a free trial account, and attach the optimizer to their application.