
Instrumenting Kubernetes with Envoy for Application Performance Metrics

Opsani COaaS (Continuous Optimization as a Service) optimizes runtime settings such as CPU, memory, and autoscaling, as well as in-application settings such as Java garbage collection time and database commit times.  Opsani performs this optimization by learning from application performance metrics (APM).

Envoy (https://www.envoyproxy.io/) is a self-contained layer 7 proxy process that is designed to run alongside an application server.  One of its proxy functions is to provide performance metrics.  In Kubernetes, Envoy allows you to instrument applications to obtain performance metrics without changing application code or disrupting your application in production.

While there are a variety of methods and tools for gathering application performance metrics, in this step-by-step guide we’ll walk through instrumenting your Kubernetes application for performance metrics with Envoy.  For this exercise, we’ll assume you have access to a fresh Kubernetes cluster (e.g. AWS EKS), and for simplicity we’ll work in the default Kubernetes namespace.  Note: Opsani has packaged Envoy with configurations that support Opsani application optimization.  The source code is publicly available and documented on GitHub at https://github.com/opsani/envoy-proxy.

Deploy an Application to Monitor and Optimize

While Opsani can optimize applications across a variety of operating systems, clouds, programming languages, and continuous deployment platforms, we’ll use a very simple Kubernetes example.  Any server application will generally suffice, but for learning about Opsani application optimization, it’s helpful to be able to control the resources that matter, including CPU, memory, and response times.

fiber-http (https://github.com/opsani/fiber-http) is an open source Opsani tool that lets you do just that.  fiber-http is a webserver with endpoints to control CPU and memory consumption as well as server response times.  With these controls we can simulate a loaded server in a simple, controlled manner.

For this exercise, since fiber-http is already available on Docker Hub, let’s create a minimal Kubernetes YAML manifest that stands up a Deployment of a fiber-http container and a LoadBalancer Service for ingress traffic.  Note: If you are not 100% comfortable editing YAML files, we suggest using an editor that helps with lining up columns of text.  VS Code is a good editor for that (https://code.visualstudio.com/).

fiber-http-deployment.yaml:


apiVersion: apps/v1
kind: Deployment
metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
spec:
 replicas: 1
 selector:
   matchLabels:
     app.kubernetes.io/name: fiber-http
 template:
   metadata:
     labels:
       app.kubernetes.io/name: fiber-http
   spec:
     containers:
     - name: fiber-http
       image: opsani/fiber-http:latest
       env:
       - name: HTTP_PORT
         value: "0.0.0.0:8480"
       ports:
       - containerPort: 8480
       resources:
         limits:
           cpu: "1"
           memory: "1Gi"
         requests:
           cpu: "1"
           memory: "1Gi"
---

apiVersion: v1
kind: Service

metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
 #annotations:
 #  service.beta.kubernetes.io/aws-load-balancer-internal: "true"

spec:
 type: LoadBalancer
 #externalTrafficPolicy: Cluster
 #sessionAffinity: None
 selector:
   app.kubernetes.io/name: fiber-http
 ports:
 - name: http
   protocol: TCP
   port: 80
   targetPort: 8480

Apply this manifest to the cluster with:

% kubectl apply -f fiber-http-deployment.yaml

This results in a pod with a single fiber-http container, with inbound traffic brought in by the LoadBalancer Service.
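
Before moving on, it’s worth confirming that the Deployment rolled out and the pod is running.  A quick check (READY should show 1/1 at this point; it will become 2/2 once we add the Envoy sidecar later in this guide):

% kubectl rollout status deployment/fiber-http
% kubectl get pods -l app.kubernetes.io/name=fiber-http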

fiber-http (without Envoy) inbound traffic flow:

You can send HTTP requests through the Service to the pod from a web browser, but for testing and automation purposes, let’s use the curl command-line tool.

First, obtain the address of the service.

% kubectl get service
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
fiber-http   LoadBalancer   10.100.125.89   a0564378f112548d5b11cbc806d5f34e-1268639300.us-west-2.elb.amazonaws.com   80:31961/TCP   25h
kubernetes   ClusterIP      10.100.0.1      <none>                                                                     443/TCP        6d
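
If you’d like to capture the external address in a shell variable for reuse (we’ll refer to it as SERVICE_HOST in later examples), a jsonpath query against the Service status works; note that some load balancer integrations populate an ip field rather than hostname:

% SERVICE_HOST=$(kubectl get service fiber-http -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
% echo $SERVICE_HOST
a0564378f112548d5b11cbc806d5f34e-1268639300.us-west-2.elb.amazonaws.com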

Use curl to start an HTTP connection to the application.

% curl a0564378f112548d5b11cbc806d5f34e-1268639300.us-west-2.elb.amazonaws.com

move along, nothing to see here% 

Refer to the fiber-http GitHub repository for instructions on how to communicate with fiber-http to control CPU load, memory consumption, and HTTP response times.
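
As a quick preview of those controls, the /time endpoint accepts a duration parameter and delays the response accordingly; we’ll use it later in this guide to generate measurable latency.  Treat this as an illustration and consult the repository README for the full set of endpoints and parameters:

% curl "<k8s service>/time?duration=800ms"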

Instrumenting a Kubernetes Deployment with Envoy

Now it’s time to re-deploy the application with metrics instrumentation, towards the goal of autonomous optimization! 

We’ll insert Envoy as a sidecar proxy between the Kubernetes Service and the fiber-http application container, so that all inbound traffic passes through Envoy on its way to the application.

Let’s copy the original YAML into a new file so that it’s easy to compare the “before” and “after” configurations.
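
For example (file names as used in this guide):

% cp fiber-http-deployment.yaml fiber-http-envoy-deployment.yaml

Once you’ve made the edits shown below, running diff -u fiber-http-deployment.yaml fiber-http-envoy-deployment.yaml is a quick way to review exactly what changed between the two configurations.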

fiber-http-envoy-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
spec:
 replicas: 1
 selector:
   matchLabels:
     app.kubernetes.io/name: fiber-http
 template:
   metadata:
     labels:
       app.kubernetes.io/name: fiber-http
       # *** ADD FOR OPSANI ***
       # Attach a label for identifying Pods that have been augmented with
       # an Opsani Envoy sidecar.
       sidecar.opsani.com/type: "envoy"
     annotations:
       # *** ADD FOR OPSANI ***
       # These annotations are scraped by the Prometheus sidecar
       # running alongside the servo Pod. The port must match the
       # `METRICS_PORT` defined in the Envoy container definition
       # below. The metrics are provided by the Envoy administration
       # module. It should not be necessary to change the path or port
       # unless the proxied service happens to have a namespace collision.
       # Any divergence from the defaults will require corresponding
       # changes to the container ports, service definition, and/or the
       # Envoy proxy configuration file.
       prometheus.opsani.com/scrape: "true"
       prometheus.opsani.com/scheme: http
       prometheus.opsani.com/path: /stats/prometheus
       prometheus.opsani.com/port: "9901"
   spec:
     containers:
     - name: fiber-http
       image: opsani/fiber-http:latest
       env:
       - name: HTTP_PORT
         value: "0.0.0.0:8480"
       ports:
       - containerPort: 8480
       resources:
         limits:
           cpu: "1"
           memory: "1Gi"
         requests:
           cpu: "1"
           memory: "1Gi"
     # *** ADD FOR OPSANI ***
     # Opsani Envoy Sidecar
     # Provides metrics for consumption by the Opsani Servo
     - name: envoy
       image: opsani/envoy-proxy:latest
       resources:
         requests:
           cpu: 125m
           memory: 128Mi
         limits:
           cpu: 250m
           memory: 256Mi
       env:
       # The container port of Pods in the target Deployment responsible for
       # handling requests. This port is equal to the original port value of
       # the Kubernetes Service prior to injection of the Envoy sidecar. This
       # port is the destination for inbound traffic that Envoy will proxy from
       # the `OPSANI_ENVOY_PROXY_SERVICE_PORT` value configured below.
       - name: OPSANI_ENVOY_PROXIED_CONTAINER_PORT
         value: "8480"


       # Uncomment if the upstream is serving TLS traffic
       # - name: OPSANI_ENVOY_PROXIED_CONTAINER_TLS_ENABLED
       #   value: "true"


       # The ingress port accepting traffic from the Kubernetes Service destined
       # for Pods that are part of the target Deployment (Default: 9980).
       # The Envoy proxy listens on this port and reverse proxies traffic back
       # to `OPSANI_ENVOY_PROXIED_CONTAINER_PORT` for handling. This port must
       # be equal to the newly assigned port in the updated Kubernetes Service
       # and must be configured in the `ports` section below.
       - name: OPSANI_ENVOY_PROXY_SERVICE_PORT
         value: "9980"


       # The port that exposes the metrics produced by Envoy while it proxies
       # traffic (Default: 9901). The corresponding entry in the `ports` stanza
       # below must match the value configured here.
       - name: OPSANI_ENVOY_PROXY_METRICS_PORT
         value: "9901"

       ports:
       # Traffic ingress from the Service endpoint. Must match the
       # `OPSANI_ENVOY_PROXY_SERVICE_PORT` env above and the `targetPort` of
       # the Service routing traffic into the Pod.
       - containerPort: 9980
         name: service


       # Metrics port exposed by the Envoy proxy that will be scraped by the
       # Prometheus sidecar running alongside the Servo. Must match the
       # `OPSANI_ENVOY_PROXY_METRICS_PORT` env and `prometheus.opsani.com/port`
       # annotation entries above.
       - containerPort: 9901
         name: metrics
---

apiVersion: v1
kind: Service

metadata:
 name: fiber-http
 labels:
   app.kubernetes.io/name: fiber-http
 #annotations:
 #  service.beta.kubernetes.io/aws-load-balancer-internal: "true"

spec:
 type: LoadBalancer
 #externalTrafficPolicy: Cluster
 #sessionAffinity: None
 selector:
   app.kubernetes.io/name: fiber-http

 ports:
 # Send ingress traffic from the service to Envoy listening on port 9980.
 # Envoy will reverse proxy back to localhost:8480 for the real service
 # to handle the request. Must match `OPSANI_ENVOY_PROXY_SERVICE_PORT` above
 # and be exposed as a `containerPort`.
 - name: http
   protocol: TCP
   port: 80
   targetPort: 9980

You can use kubectl to apply these changes, even to a live, running application.

Example:

% kubectl apply -f fiber-http-envoy-deployment.yaml
deployment.apps/fiber-http configured
service/fiber-http configured
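
To spot-check that the Service now routes inbound traffic to the Envoy listener rather than directly to the application container, inspect its target port; it should read 9980, matching the manifest above (exact output formatting varies slightly between kubectl versions):

% kubectl describe service fiber-http | grep TargetPort
TargetPort:               9980/TCP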

We’ve “shimmed in” the Envoy proxy just in front of our application.  

fiber-http (WITH Envoy) inbound traffic flow:


Verifying Envoy is gathering metrics from your container

You can scrape Envoy’s metrics with curl by launching a shell in a Linux/busybox pod in the same namespace and issuing an HTTP request to Envoy’s administration port.  But there’s a better way (see below).

Example:

(first, from the machine running kubectl, identify the fiber-http pod):

% kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
fiber-http-6ccc567bf8-4psqz   2/2     Running   0          26h
% kubectl describe pod fiber-http-6ccc567bf8-4psqz

(obtain the IP address of the pod via “kubectl describe pod <fiber-http pod name>”, then shell into a Linux/busybox container in the same k8s namespace):

% kubectl run -i --tty busybox --image=busybox --restart=Never -- sh
If you don't see a command prompt, try pressing enter.
/ # wget -qO- http://192.168.15.160:9901/stats
cluster.opsani_proxied_container.Enable upstream TLS with SNI validation.total_match_count: 0
cluster.opsani_proxied_container.Enable upstream TLS with validation.total_match_count: 0
cluster.opsani_proxied_container.Enable upstream TLS.total_match_count: 0
...
[more metrics follow] 

Kubernetes port-forward is a powerful test tool for communications debugging

Instead of creating a Linux container inside the cluster to access Envoy, a less system-intrusive method is the Kubernetes port-forward functionality.  Let’s forward TCP port 9901 on the machine running kubectl to port 9901 in the pod, which is the listening port of the Envoy administration interface.

Syntax:

kubectl port-forward pod/{pod-name-of-an-injected-pod} local-port:destination-port

Example:

% kubectl port-forward pod/fiber-http-6ccc567bf8-4psqz 9901:9901
Forwarding from 127.0.0.1:9901 -> 9901
Forwarding from [::1]:9901 -> 9901

(This command keeps running until you exit with Ctrl-C.)
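
If you’d rather not look up the generated pod name, kubectl port-forward can also target the Deployment and will pick a matching pod for you:

% kubectl port-forward deployment/fiber-http 9901:9901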

Now, instead of running a container in the cluster to access Envoy, we can query it directly from the machine running kubectl.

% curl http://localhost:9901/stats/prometheus
# TYPE envoy_listener_manager_listener_modified counter
envoy_listener_manager_listener_modified{} 0
# TYPE envoy_listener_manager_listener_removed counter
envoy_listener_manager_listener_removed{} 0
# TYPE envoy_listener_manager_listener_stopped counter
...
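
The full dump is long.  To focus on the connection and request statistics discussed in the next section, a simple grep filter helps (the envoy_http_downstream prefix reflects how the HTTP connection manager statistics typically appear in Prometheus format; adjust the pattern to match what your scrape actually returns):

% curl -s http://localhost:9901/stats/prometheus | grep envoy_http_downstream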

Look for Metrics that Matter to your Application Performance

Envoy gathers many metrics about web server and application performance.  You can use either of the above methods to dump metrics while running load against the test application (fiber-http in this tutorial).  Here are some notable sample metrics from Envoy for application performance:

  • http.ingress_http.downstream_cx_total: 722  – This is the total number of client-to-server (downstream) connections observed by Envoy.
  • http.ingress_http.downstream_cx_length_ms: P0(1.0,1.0) P25(1.025,1.02601) P50(1.05,1.05203) P75(1.075,1.07804) P90(1.09,1.09365) P95(1.095,1.09886) P99(1.099,3.09743) P99.5(1.0995,5.041) P99.9(1.0999,432.82) P100(1.1,440.0)
    • Each Pxx quantile entry reports two values, (interval, cumulative): the connection duration in milliseconds at that percentile, computed over the most recent stats interval and over the lifetime of the process, respectively.
    • This sample output was obtained with a simple shell while loop to fiber-http with no parameters
      • while true; do curl <k8s service>; sleep 1; done
    • fiber-http can simulate CPU load, memory consumption, and response times by specifying URL parameters (see the end-to-end example after this list)
      • while true; do curl "<k8s service>/time?duration=800ms"; sleep 1; done
      • Sample output with 800ms duration: http.ingress_http.downstream_cx_length_ms: P0(800.0,1.0) P25(802.5,1.02646) P50(805.0,1.05292) P75(807.5,1.07938) P90(809.0,1.09525) P95(809.5,2.02464) P99(809.9,805.493) P99.5(809.95,807.817) P99.9(809.99,809.676) P100(810.0,1100.0)
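
Putting the pieces together, here is one way to generate load in one terminal and watch the connection-duration histogram in another.  This sketch assumes the port-forward from the previous section is still running and that the Service address was captured in a SERVICE_HOST shell variable as shown earlier; adjust to your setup.

% while true; do curl -s "$SERVICE_HOST/time?duration=800ms" > /dev/null; sleep 1; done

Then, in a second terminal:

% curl -s http://localhost:9901/stats | grep downstream_cx_length_ms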

Visit https://www.envoyproxy.io/ for a detailed description of the various Envoy metrics and processes. 

Congratulations!  You’ve instrumented a simple Kubernetes application for Envoy metrics.  With these metrics, we can understand how our application is performing under load.  In our next exercise, you’ll utilize Envoy metrics with Opsani to optimize CPU and memory limits for the best application performance, at the lowest cost.

Once you’ve become familiar with Envoy, it’s time to consider another tool, Prometheus, which helps manage and aggregate Envoy data across multiple services and multiple instances of those services.  For an introduction to Prometheus, check out our post on What is Prometheus and Why Should You Use it?