Kubernetes autoscaling was supposed to be easy. Even though scaling is one of Kubernetes' selling points, the built-in autoscaling support is basic at best: out of the box you can only scale on CPU or memory consumption, and anything more advanced requires additional tooling that is often not trivial to set up.

The Gimlet.io team put together this blog post to show common autoscaling use cases: scaling based on CPU, on custom Prometheus metrics, and on RabbitMQ queue length. Furthermore, we aim to clear up the differences between the Horizontal Pod Autoscaler (HPA), the Prometheus Adapter and KEDA.

Let's get into it, shall we?

## First, about the Horizontal Pod Autoscaler (HPA)

The Horizontal Pod Autoscaler, or HPA in short, is a Kubernetes resource that allows you to scale your application based on resource utilization such as CPU and memory. To be more precise, HPA is a general purpose autoscaler, but by default only CPU and memory metrics are available for it to scale on.

Its data source is the Kubernetes Metrics API, which by the way also powers the `kubectl top` command, and it is backed by data provided by the `metrics-server` component. This component runs on your cluster and is installed by default on GKE, AKS, CIVO and k3s clusters, but it needs to be manually installed on many others, like Digital Ocean, EKS and Linode.

The HPA resource is moderately well documented in the Kubernetes documentation. Some confusion arises from the fact that there are blog posts out there showcasing different Kubernetes API versions: keep in mind that `autoscaling/v2` is not backwards compatible with `autoscaling/v1`!

More headaches arise when you try to scale on metrics other than CPU and memory. In order to scale pods based on - let's say - the number of HTTP requests or a queue length, you need to make the Kubernetes API aware of these metrics first. Luckily, open-source metrics backends exist for this, and the best known is the Prometheus Adapter.

## Prometheus Adapter

Prometheus Adapter is a Kubernetes Custom Metrics API implementation which exposes selected Prometheus metrics through the Kubernetes API for the Horizontal Pod Autoscaler (HPA) to scale on. Essentially, you configure the Prometheus Adapter to read your desired metric from Prometheus, and it serves that metric to the HPA to scale on. This can be an HTTP request rate, a RabbitMQ queue length, or any other metric from Prometheus.

Prometheus Adapter does the job, but in our experience its configuration is cryptic. While there are several blog posts out there explaining its configuration syntax, we could not make it work sufficiently reliably for our custom metrics scaling needs. That is essentially why we have brought you here today: to share our experience with a Prometheus Adapter alternative, called KEDA.

So, what exactly is KEDA, and why do we prefer it?

## KEDA

KEDA is a Kubernetes operator that handles a user-friendly custom yaml resource, the `ScaledObject`, where you can define your scaling needs. In KEDA, you create this custom resource with the necessary information about the deployment you want to scale, then define the trigger event, which can be based on CPU and memory usage or on custom metrics. It has premade triggers for most anything you may want to scale on, with a yaml structure that is, we think, how the Kubernetes API should have been designed in the first place.

KEDA does two things:

- it exposes the selected metrics to the Kubernetes Custom Metrics API - just like Prometheus Adapter,
- and it creates the Horizontal Pod Autoscaler resource.

Ultimately, this HPA does the scaling.
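Before the examples: KEDA ships an official Helm chart, which is the install path we like best. A minimal install sketch looks like this (the `keda` namespace is our own choice here, the chart repository is from the official docs):

```bash
# Add the official KEDA chart repository and fetch the latest chart index
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Install the KEDA operator and its metrics API server into their own namespace
helm install keda kedacore/keda --namespace keda --create-namespace
```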
Now that you have an overview, let's take a step further and show how you can autoscale with KEDA!

## Autoscaling example based on CPU usage

In order to autoscale your application with KEDA, you need to define a `ScaledObject` resource.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-based-scaledobject
  namespace: default
spec:
  minReplicaCount: 1
  maxReplicaCount: 10
  scaleTargetRef:
    kind: Deployment
    name: test-app-deployment
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "50"
```

`scaleTargetRef` is where you refer to your deployment, and `triggers` is where you define the metrics and the threshold that will trigger the scaling.

In this sample we trigger based on CPU usage: the `ScaledObject` will manage the number of replicas automatically for you and maintain a maximum 50% CPU usage per pod.

As usual with Kubernetes custom resources, you can `kubectl get` and `kubectl describe` the resource once you have deployed it on the cluster.

```
$ kubectl get scaledobject
NAME                     SCALETARGETKIND      SCALETARGETNAME       MIN   MAX   TRIGGERS   READY   ACTIVE
cpu-based-scaledobject   apps/v1.Deployment   test-app-deployment   1     10    cpu        True    True
```

To get an in-depth understanding of what is happening in the background, you can check the logs of the KEDA operator pod, and you can also `kubectl describe` the HPA resource that KEDA created.

## Autoscaling example based on custom metrics

To use custom metrics, you need to make changes to the `triggers` section.

Scaling example based on custom Prometheus metrics:

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://<prometheus-host>:9090
      metricName: http_requests_total # Note: name to identify the metric, generated value would be `prometheus-http_requests_total`
      query: sum(rate(http_requests_total{deployment="my-deployment"}[2m])) # Note: query must return a vector/scalar single element response
      threshold: '100.50'
      activationThreshold: '5.5'
```

Scaling example based on RabbitMQ queue length (see the sketch at the end of this post for keeping the connection string out of the yaml):

```yaml
triggers:
  - type: rabbitmq
    metadata:
      host: amqp://localhost:5672/vhost
      mode: QueueLength # QueueLength or MessageRate
      value: "100" # message backlog or publish/sec. target per instance
      queueName: testqueue
```

Check the official KEDA website to see all the scalers.

## Closing words

When we found KEDA, our pains with the Prometheus Adapter were solved instantly. KEDA's simple install experience and ready-made scalers allowed us to cover our autoscaling needs, while its straightforward yaml syntax communicates the scaling intent well. We not only use KEDA ourselves, but also recommend it to our clients and friends. So much so that we integrated KEDA into our preferred stack at Gimlet.

Onwards!

Originally published by Youcef Guichi and Laszlo Fogas at https://gimlet.io.
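As promised above, one more sketch for the RabbitMQ example: in a real cluster you would rather not inline the connection string in the `ScaledObject`. KEDA's `TriggerAuthentication` resource can pull the `host` parameter from a Secret instead; the names `rabbitmq-secret` and `keda-rabbitmq-auth` below are placeholders of our own choosing:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-secret
  namespace: default
type: Opaque
stringData:
  host: amqp://user:password@rabbitmq.default.svc:5672/vhost # hypothetical connection string
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-rabbitmq-auth
  namespace: default
spec:
  secretTargetRef:
    - parameter: host      # fills the rabbitmq trigger's host parameter
      name: rabbitmq-secret
      key: host
```

With this in place, the trigger drops its inline `host` and points at the auth object instead:

```yaml
triggers:
  - type: rabbitmq
    authenticationRef:
      name: keda-rabbitmq-auth
    metadata:
      mode: QueueLength
      value: "100"
      queueName: testqueue
```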