ALB Controller Logs With Vector and OpenSearch: A Guide

by Ivan Drago, December 12th, 2023

Greetings Comrades!

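One of the strangest logging setups I have ever had to build is the one for AWS ALB controller logs. Really, look at the scheme: the ALB writes its access logs to an S3 bucket, S3 sends an event to an SQS queue for every new object, Vector polls the queue, downloads and parses the log files, and ships the result to OpenSearch.

[Figure: AWS ALB Logs Pipeline]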


1. S3 Bucket for ALB Logs

I assume that you already have an ALB Controller in your EKS Cluster.


So, we should create an S3 bucket and enable the SQS trigger with Terraform:


###
# ALB Controller Logs S3 Bucket
###

resource "aws_s3_bucket" "alb_logs" {
  bucket = "alb-logs"
}

resource "aws_s3_bucket_lifecycle_configuration" "alb_logs" {
  bucket = aws_s3_bucket.alb_logs.id

  rule {
    id = "log"

    expiration {
      days = 3
    }

    status = "Enabled"
  }
}

resource "aws_s3_bucket_policy" "alb_logs" {
  bucket = aws_s3_bucket.alb_logs.id
  policy = data.aws_iam_policy_document.alb_logs_bucket_policy.json
}

data "aws_iam_policy_document" "alb_logs_bucket_policy" {
  statement {
    principals {
      type        = "AWS"
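      # Yes, it should be exactly this: the Elastic Load Balancing account for us-east-1, not your own account ID
      identifiers = ["arn:aws:iam::127311923021:root"]
    }
    # I'm so lazy: ALB log delivery only needs s3:PutObject
    actions   = ["s3:*"]
    resources = ["${aws_s3_bucket.alb_logs.arn}/*"]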
  }
}
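
The `s3:*` grant and the hardcoded account ID above work, but both can be tightened. Here is a minimal sketch, assuming the load balancer lives in the same AWS account as the bucket and that no log prefix is configured, so objects land under `AWSLogs/<account-id>/`. The data source names and the `_strict` suffix are my own and are not part of the original setup:

```hcl
# Looks up the Elastic Load Balancing account for the provider's region,
# so the ID does not have to be hardcoded.
data "aws_elb_service_account" "current" {}

# The account that owns the load balancer (assumed to be the same one
# that owns the bucket).
data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "alb_logs_bucket_policy_strict" {
  statement {
    effect = "Allow"

    principals {
      type        = "AWS"
      identifiers = [data.aws_elb_service_account.current.arn]
    }

    # Writing objects is all that log delivery needs.
    actions = ["s3:PutObject"]

    # Only the path the ALB actually writes to.
    resources = [
      "${aws_s3_bucket.alb_logs.arn}/AWSLogs/${data.aws_caller_identity.current.account_id}/*",
    ]
  }
}
```

To try it, point the `policy` argument of `aws_s3_bucket_policy.alb_logs` at `data.aws_iam_policy_document.alb_logs_bucket_policy_strict.json`.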

2. S3 Trigger to SQS Queue

###
# ALB Logs SQS Queue
###

resource "aws_sqs_queue" "alb_logs" {
  name                    = "alb-logs"
  sqs_managed_sse_enabled = false
}

data "aws_iam_policy_document" "alb_logs_policy" {
  statement {
    effect = "Allow"

    principals {
      # Wow again, a wildcard principal! The aws:SourceArn condition below is what limits senders to our bucket
      type        = "*"
      identifiers = ["*"]
    }

    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.alb_logs.arn]

    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = [aws_s3_bucket.alb_logs.arn]
    }
  }
}

resource "aws_sqs_queue_policy" "alb_logs" {
  queue_url = aws_sqs_queue.alb_logs.id
  policy    = data.aws_iam_policy_document.alb_logs_policy.json
}

resource "aws_s3_bucket_notification" "alb_logs" {
  bucket = aws_s3_bucket.alb_logs.id

  queue {
    id        = "alb-logs"
    queue_arn = aws_sqs_queue.alb_logs.arn
    events    = ["s3:ObjectCreated:*"]
  }
}
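
If the wildcard principal in the queue policy makes you nervous, the S3 service principal can be named explicitly. This is a sketch of an alternative policy document, not what the setup above uses; the name `alb_logs_policy_strict` is hypothetical:

```hcl
data "aws_iam_policy_document" "alb_logs_policy_strict" {
  statement {
    effect = "Allow"

    # Only the S3 service may send messages, instead of "*".
    principals {
      type        = "Service"
      identifiers = ["s3.amazonaws.com"]
    }

    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.alb_logs.arn]

    # And only on behalf of our log bucket.
    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = [aws_s3_bucket.alb_logs.arn]
    }
  }
}
```

To use it, reference `data.aws_iam_policy_document.alb_logs_policy_strict.json` in `aws_sqs_queue_policy.alb_logs`.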


3. Installing and Configuring Vector

I always use ArgoCD to install things into a Kubernetes cluster, so here is the Application manifest:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vector
  namespace: argocd
spec:
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 2
      backoff:
        duration: 2s
        factor: 2
        maxDuration: 1m
  destination:
    namespace: logging
    server: 'https://kubernetes.default.svc'
  source:
    repoURL: https://helm.vector.dev
    chart: vector
    targetRevision: 0.25.0
    helm:
      values: |
        service:
          enabled: false
        serviceAccount:
          annotations:
            eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ID:role/eks-alb-ingress-controller
        customConfig:
          transforms:
            alb:
              type: remap
              inputs:
                - s3
              source: |-
                structured =
                  parse_regex!(.message, r'^(?P<Type>[^ ]*) (?P<Time>[^ ]*) (?P<Alb>[^ ]*) (?P<ClientIP>[^ ]*):(?P<ClientPort>[0-9]*) (?P<TargetIP>[^ ]*)[:-](?P<TargetPort>[0-9]*) (?P<RequestProcessingTime>[-.0-9]*) (?P<TargetProcessingTime>[-.0-9]*) (?P<ResponseProcessingTime>[-.0-9]*) (?P<StatusCode>|[-0-9]*) (?P<TargetStatusCode>-|[-0-9]*) (?P<ReceivedBytes>[-0-9]*) (?P<SentBytes>[-0-9]*) \"(?P<RequestVerb>[^ ]*) (?P<RequestUrl>.*) (?P<RequestProto>- |[^ ]*)\" \"(?P<UserAgent>[^\"]*)\" (?P<SslCipher>[A-Z0-9-_]+) (?P<SslProtocol>[A-Za-z0-9.-]*) (?P<TargetGroupArn>[^ ]*) \"(?P<TraceId>[^\"]*)\" \"(?P<DomainName>[^\"]*)\" \"(?P<CertArn>[^\"]*)\" (?P<MatchedRulePriority>[-.0-9]*) (?P<RequestCreationTime>[^ ]*) \"(?P<ActionExecuted>[^\"]*)\" \"(?P<RedirectUrl>[^\"]*)\" \"(?P<LambdaErrorReason>[^ ]*)\" \"(?P<TargetPortList>[^\s]+?)\" \"(?P<TargetStatusCodeList>[^\s]+)\" \"(?P<Classification>[^ ]*)\" \"(?P<Classificationreason>[^ ]*)\"')
                . = merge(., structured)

                .ReceivedBytes = to_int!(.ReceivedBytes)
                .RequestProcessingTime = to_float!(.RequestProcessingTime)
                .ResponseProcessingTime = to_float!(.ResponseProcessingTime)
                .StatusCode = to_int!(.StatusCode)
                .SentBytes = to_int!(.SentBytes)
                .TargetProcessingTime = to_float!(.TargetProcessingTime)

                parsed_url, err = parse_url(.RequestUrl)
                if err == null {
                  .Path = parsed_url.path
                  .RequestDomainName = parsed_url.host
                }
          sources:
            s3:
              type: aws_s3
              compression: auto
              strategy: sqs
              region: "us-east-1"
              sqs:
                delete_message: true
                poll_secs: 5
                visibility_timeout_secs: 300
                queue_url: "https://sqs.us-east-1.amazonaws.com/YOUR_ID/alb-logs"
          sinks:
            opensearch:
              type: elasticsearch
              endpoints:
                - "https://YOUR_OPENSEARCH_URL.us-east-1.es.amazonaws.com"
              inputs:
                - alb
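
The manifest above quietly assumes that the IAM role from the service account annotation is allowed to read the queue and the bucket. A minimal sketch of such a policy could look like the following. The action list is my assumption of what the `aws_s3` source needs with `delete_message: true`, the name `vector_alb_logs` is hypothetical, and access to OpenSearch is not covered here:

```hcl
data "aws_iam_policy_document" "vector_alb_logs" {
  # Poll the queue and remove messages after processing.
  statement {
    effect    = "Allow"
    actions   = ["sqs:ReceiveMessage", "sqs:DeleteMessage"]
    resources = [aws_sqs_queue.alb_logs.arn]
  }

  # Download the log objects referenced in the messages.
  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.alb_logs.arn}/*"]
  }
}

resource "aws_iam_policy" "vector_alb_logs" {
  name   = "vector-alb-logs"
  policy = data.aws_iam_policy_document.vector_alb_logs.json
}

# Attach to the role named in the eks.amazonaws.com/role-arn annotation.
resource "aws_iam_role_policy_attachment" "vector_alb_logs" {
  role       = "eks-alb-ingress-controller"
  policy_arn = aws_iam_policy.vector_alb_logs.arn
}
```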

4. Enabling Our Logs

There are three ways to enable access logging:

  • The old-fashioned way, by hand, following the AWS documentation for ALB access logs


  • With the access_logs block of the Terraform aws_lb resource


  • And the easiest way, a Kubernetes Ingress annotation (our way):


alb.ingress.kubernetes.io/load-balancer-attributes: access_logs.s3.enabled=true,access_logs.s3.bucket=alb-logs
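
For comparison, the Terraform route looks roughly like this. It only applies to a load balancer that Terraform itself creates; one created by the ALB controller is managed by the controller, so the annotation is the right tool there. The `example` resource and the `subnet_ids` variable are hypothetical:

```hcl
# Hypothetical input: subnets for the load balancer.
variable "subnet_ids" {
  type = list(string)
}

resource "aws_lb" "example" {
  name               = "example"
  load_balancer_type = "application"
  subnets            = var.subnet_ids

  # Same effect as the access_logs.s3.* attributes in the annotation.
  access_logs {
    bucket  = aws_s3_bucket.alb_logs.id
    enabled = true
  }
}
```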


Now, you should be able to see logs in your OpenSearch console!


That's all!