
Efficient Autoscaling for EKS Node Groups with Karpenter

by Anadi Misra, September 22nd, 2023

Too Long; Didn't Read

Karpenter is an open-source tool designed to automate node provisioning in Kubernetes clusters. It enhances the efficiency and cost-effectiveness of managing workloads within Kubernetes by monitoring and provisioning nodes based on the resource requirements of pods. The tool allows users to define provisioners that set constraints for node creation, enabling fine-grained control over resource allocation. This article provides a step-by-step guide on how to deploy an Amazon Elastic Kubernetes Service (EKS) cluster with Karpenter, including setting up the Virtual Private Cloud (VPC), defining provisioners for different instance types (spot and on-demand), and associating pods with provisioners. Karpenter simplifies and optimizes resource management in Kubernetes clusters, making it a valuable addition for organizations seeking efficient workload orchestration.


Karpenter is an open-source tool for automating node provisioning in Kubernetes. It aims to enhance both the efficiency and cost-effectiveness of managing workloads within a Kubernetes cluster.


The core mechanics of Karpenter involve:


  • Monitoring ‘unschedulable’ pods identified by the Kubernetes scheduler (see the example pod spec after this list).
  • Scrutinizing the scheduling constraints, including resource requests, node selectors, affinities, tolerations, and topology spread constraints, as stipulated by the pods.
  • Provisioning nodes that precisely align with the pods’ requirements.
  • Streamlining cluster resource usage by removing nodes once their services are no longer required.
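
As an illustration (not part of the original setup), a pending pod like the following hypothetical one, whose resource requests and node selector cannot be satisfied by any existing node, is what triggers Karpenter to provision a matching node:


apiVersion: v1
kind: Pod
metadata:
  name: batch-job                 # hypothetical pod name
spec:
  nodeSelector:
    kubernetes.io/arch: amd64     # a scheduling constraint Karpenter must honour
  containers:
    - name: worker
      image: busybox              # placeholder image
      command: ["sleep", "3600"]
      resources:
        requests:
          cpu: "4"                # requests larger than any free node can satisfy,
          memory: 8Gi             # leaving the pod Pending until a node is provisioned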


Why did we move to Karpenter?

At NimbleWork, we used AWS Fargate in the past for running on-demand, short-lived, or one-off workloads; one example was running Jenkins slaves in Fargate while the master ran on worker nodes. Fargate is good in the sense that it takes care of managing the node infrastructure, but it can come at a premium if you use it for long-running workloads. For this reason, an EKS deployment with worker nodes is the preferred path. But that brings a new problem: unlike Fargate, we not only have to create and manage nodes and node groups, we also have to ensure that our EC2 nodes' utilization is optimal. It comes back to hurt us especially when we realize there's an entire VM running at just 10% of its CPU/memory capacity because it hosts two active pods, which we could have moved to another node so this one could be reclaimed. In the past, we relied on a cocktail of Prometheus alerts and Fluent Bit monitoring data to decide when we could reschedule pods and clean up unused nodes. But any self-respecting engineering manager would tell you they'd jump to a better alternative as soon as they found one. For us, Karpenter was that alternative.


How Does it Work?

Karpenter allows you to define Provisioners, which are the heart of its cluster management capability. When initially installing Karpenter, you establish a default Provisioner, which imparts specific constraints on the nodes Karpenter creates and the pods eligible to run on them. These constraints include defining taints to restrict which pods may be deployed on Karpenter-created nodes, establishing startup taints to indicate temporary node tainting during initialization, narrowing node creation down to preferred zones, instance types, and compute architectures, and configuring default settings for node expiry. The Provisioner, in essence, gives you fine-grained control over resource allocation within your Kubernetes cluster. You can read more on Provisioners here.
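
To make these constraints concrete, here is a minimal, hypothetical Provisioner sketch (the names and values are illustrative only, and it uses the same v1alpha5 API as the examples later in this article):


apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: example                        # hypothetical provisioner, for illustration only
spec:
  taints:                              # pods must tolerate this taint to land on these nodes
    - key: workload-type
      value: batch
      effect: NoSchedule
  startupTaints:                       # temporary taint, expected to be removed (e.g. by a daemonset) once node initialization completes
    - key: example.com/not-ready
      effect: NoSchedule
  requirements:
    - key: topology.kubernetes.io/zone
      operator: In
      values: ["us-east-1a", "us-east-1c"]   # restrict node creation to preferred zones
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]                      # restrict compute architecture
  ttlSecondsUntilExpired: 604800             # expire (and replace) nodes after 7 days
  providerRef:
    name: default                      # references an AWSNodeTemplate, like the one defined later in this article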


Deploying EKS Cluster

Here’s how to deploy the EKS cluster with Karpenter.

Setting Up the VPC

Before we begin, let's deploy the AWS VPC that will run our EKS cluster. We'll be using Terraform for provisioning on the AWS Cloud.


module "vpc" {
  source               = "terraform-aws-modules/vpc/aws"
  version              = "3.19.0"
  name                 = "mycluster-vpc"
  cidr                 = var.vpc_cidr
  azs                  = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets      = var.private_subnets_cidr
  public_subnets       = var.public_subnets_cidr
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  public_subnet_tags = {
    "kubernetes.io/cluster/mycluster" = "shared"
    "kubernetes.io/role/elb"          = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/mycluster  = "shared"
    "kubernetes.io/role/internal-elb" = "1"
    "karpenter.sh/discovery"          = "mycluster"
  }
  tags = {
    "kubernetes.io/cluster/mycluster" = "shared"
  }
}
module "vpc-security-group" {
  source  = "terraform-aws-modules/security-group/aws"
  version = "4.17.1"
  create  = true
  name        = "mycluster-security-group"
  description = "Security group for VPC"
  vpc_id      = module.vpc.vpc_id
  ingress_with_cidr_blocks = var.ingress_rules
  ingress_with_self = [
    {
      from_port   = 0
      to_port     = 0
      protocol    = -1
      description = "Ingress with Self"
    }
  ]
  egress_with_cidr_blocks = [{
    cidr_blocks = "0.0.0.0/0"
    from_port   = 0
    to_port     = 0
    protocol    = -1
  }]
  tags = {
    Name                      = "mycluster-security-group"
    "karpenter.sh/discovery"  = "mycluster"
  }
}


We’re using community-contributed modules here to spin up a VPC with public and private subnets, along with ingress rules. For those interested in more detail, here’s a simple example of what could go in the ingress rules:


variable "ingress_rules" {
  type        = list(map(string))
  description = "VPC Default Security Group Ingress Rules"
  default = [
    {
      cidr_blocks = "0.0.0.0/0"
      from_port   = 443
      to_port     = 443
      protocol    = "tcp"
      description = "Karpenter ingress allow"
    },
    { # other CIDR blocks to which you might want to restrict access (for example, if this were your dev cluster)
      cidr_blocks = "XX.XX.XX.XXX/XX"
      from_port   = 0
      to_port     = 0
      protocol    = -1
      description = "MyCLuster-NAT"
    }
  ]
}


The "karpenter.sh/discovery" = "mycluster" tag in the VPC module and the in the security group tags is our hint to AWS about using aws-karpenter for autoscaling nodes and pods in this cluster. You can get the VPC up and running via the


terraform plan
terraform apply


commands. It’s a good practice to define key values that you’ll need in other modules as outputs of this module run; we also save the state in an S3 bucket, since our Terraform builds run from a Jenkins slave on Fargate with ephemeral storage. You’d see the following values in the console output of the terraform apply command if you’ve included the VPC and security group IDs in the outputs.tf of your VPC module.


security_group_id = "sg-dkfjksdhf83983c883"
vpc_id = "vpc-2l4jc2lj4l2cbj42"


With this, we have our VPC ready. Let’s deploy the EKS cluster with Node Groups and Karpenter.

Deploying EKS Cluster with Node Group Workers and Karpenter

Add the following code to your Terraform module to include EKS:


module "eks-cluster" {
  source          = "terraform-aws-modules/eks/aws"
  version         = "19.12.0"
  cluster_name    = "mycluster"
  cluster_version = "1.26"
  subnet_ids      = ["subnet-XX", "subnet-YY", "subnet-ZZ"]
  create_cloudwatch_log_group = false
  tags = {
    Name                      = "mycluster"
    "karpenter.sh/discovery"  = "mycluster"
  }

  vpc_id = "vpc-2l4jc2lj4l2cbj42"

  cluster_endpoint_public_access_cidrs = ["XX.XX.XX.XXX/YY"] #important if the cluster_endpoint_public_access is set to true
  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access       = true
  cluster_security_group_id            = "sg-dkfjksdhf83983c883"
}

module "mycluster-workernodes" {
  source  = "terraform-aws-modules/eks/aws//modules/eks-managed-node-group"
  version = "19.12.0"

  name            = "${var.eks_cluster_name}-services"
  cluster_name    = module.eks-cluster.cluster_name
  cluster_version = module.eks-cluster.cluster_version
  create_iam_role = false
  iam_role_arn    = aws_iam_role.nodegroup_role.arn

  subnet_ids = flatten([data.terraform_remote_state.db.outputs.private_subnets])

  cluster_primary_security_group_id = "sg-dkfjksdhf83983c883"
  vpc_security_group_ids            = [module.eks-cluster.cluster_security_group_id]

  min_size     = 1
  max_size     = 5
  desired_size = 2

  instance_types     = ["t3.large"]
  capacity_type      = "ON_DEMAND"
  labels = {
    NodeGroups = "mycluster-workernodes"
  }

  tags = {
    Name                      = "mycluster-workernodes"
    "karpenter.sh/discovery"  = module.eks-cluster.cluster_name
  }
}


It’s the same "karpenter.sh/discovery" tag at play here too, and that’s it as far as the infrastructure goes. The one remaining piece is installing Karpenter itself into the cluster, after which you have an EKS cluster with Karpenter-managed provisioning ready!
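
Karpenter is typically installed via its Helm chart. Below is a rough, hypothetical sketch of the chart values for a pre-v0.32 chart release (the versions that use the v1alpha5 API shown below); key names vary by chart version, and the role ARN, cluster endpoint, and instance profile are placeholders you would substitute from your own IAM and cluster setup.


# values.yaml for the karpenter Helm chart (illustrative; check your chart version's documented values)
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/KarpenterControllerRole-mycluster  # placeholder IRSA role
settings:
  aws:
    clusterName: mycluster
    clusterEndpoint: https://XXXXXXXX.gr7.us-east-1.eks.amazonaws.com    # placeholder API endpoint
    defaultInstanceProfile: KarpenterNodeInstanceProfile-mycluster       # placeholder node instance profile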

Configuring Karpenter Provisioners

Now that we have a cluster ready, let’s have a look at using Karpenter to manage the pods. We’ll define provisioners for different purposes and then associate pods with each of them.


Provisioner for Nodes running Spot Instances

This is a good alternative to Fargate, especially for running one-off workloads that do not live beyond job completion. Here’s an example of a Karpenter provisioner using spot instances.


# spot default
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m", "r"]
    - key: "karpenter.k8s.aws/instance-cpu"
      operator: In
      values: ["4", "8", "16", "32"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  consolidation:
    enabled: true
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: mycluster
  securityGroupSelector:
    karpenter.sh/discovery: mycluster
---


To use this provisioner, add the following label to the nodeSelector in your Kubernetes Deployment:


nodeSelector:
  karpenter.sh/provisioner-name: default


This will schedule the pods onto nodes backed by spot instances.
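
For context, here is a minimal, hypothetical Deployment showing where that nodeSelector sits (the names and image are placeholders):


apiVersion: apps/v1
kind: Deployment
metadata:
  name: one-off-worker                            # hypothetical workload name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: one-off-worker
  template:
    metadata:
      labels:
        app: one-off-worker
    spec:
      nodeSelector:
        karpenter.sh/provisioner-name: default    # target nodes created by the spot provisioner
      containers:
        - name: worker
          image: busybox                          # placeholder image
          command: ["sleep", "3600"]
          resources:
            requests:
              cpu: "1"
              memory: 1Gi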


Provisioner for Nodes running On-Demand Instances

Here’s a sample of how to use on-demand instances for worker nodes and schedule pods on them. The following file defines a provisioner for on-demand instances:


# on-demand
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: on-demand
spec:
  # taints:
  #   - key: "name"
  #     value: "on-demand"
  #     effect: "NoSchedule"
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m", "r"]
    - key: "karpenter.k8s.aws/instance-cpu"
      operator: In
      values: ["2","4","8", "16", "32"]
    - key: "topology.kubernetes.io/zone"
      operator: NotIn
      values: ["us-east-1b"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: on-demand
  # consolidation:
  #   enabled: true
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: on-demand
spec:
  subnetSelector:
    karpenter.sh/discovery: mycluster
  securityGroupSelector:
    karpenter.sh/discovery: mycluster
---


Once again, we can use the nodeSelector in the Kubernetes Deployment YAML to schedule pods on these nodes:


nodeSelector:
  karpenter.sh/provisioner-name: on-demand
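
Note that the taints block in the on-demand provisioner above is commented out; if you enable it, pods also need a matching toleration alongside the nodeSelector. A minimal sketch, assuming the taint key, value, and effect shown in that commented block:


tolerations:
  - key: "name"
    operator: "Equal"
    value: "on-demand"
    effect: "NoSchedule"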

Conclusion

This is a simplified example of how to get started with Karpenter on AWS EKS. Production-grade deployments require more nuanced provisioner definitions, including, but not limited to, resource limits and eviction policies.


Also published here.