Aggregate Logs with Elasticsearch, Kibana, Logstash & Docker

“Is that it?”
“No. That’s a wall.”
“It could be disguised.”
“You’re not very good at looking for things, are you?”
“I’m good at looking for walls. Look, I found another one.”
― Derek Landy, Kingdom of the Wicked

When a system was monolithic we had access to the full execution stack trace. However, in Microservices architecture, any single operation in any service can trigger a chain of downstream microservice calls, as all are isolated among themselves which leads to a challenging task to debug an actual flow.

And let’s be honest, we all hate those “

something went wrong

” or “

unknown

” system errors.

Well, this situation can be smoothly handled if we externalize and centralize the storage of our logs. Which I believe will increase our chances of tracking down and fixing issues.

Complete project can be found here :

https://github.com/vrushofficial/Utter-Microservices-Territory

Checklist

E-L-K Stack

ElasticSearch
Kibana
Beats
LogStash

Logs as Streams of events

Logging with LogBack & SLF4J
Enhancing log in with tracing and JSONformat

Example

Playing with Docker images
Spinning docker containers
Investigation of logs on Kibana server

E-L-K Stack

ELK stack is basically a combination of 4 open source softs for processing log files and storing them at a centralized place. It helps to identify the issues spanning through multiple servers by correlating their logs within a specific time frame.

ELK Stack: Elasticsearch, Logstash, Kibana | Elastic
What is the ELK Stack? The ELK Stack is an acronym for a combination of three widely used open source projects…www.elastic.co

Let’s have a quick look at each component of Elastic Stack.

Elasticsearch

Elasticsearch is a distributed analytics and search engine built over Apache Lucene, a Java-based search and indexing library. Elasticsearch provides distributed cluster features and sharding for Lucene indexes, advanced search and aggregation functionality, as well as high availability, security, a snapshot/restore module, and other data management features. The platform also provides such features as thread-pooling, node/cluster monitoring API, queues, data monitoring API, cluster management, etc.

Kibana

Kibana is a data visualization and management tool for Elasticsearch that provides real-time histograms, line graphs, pie charts, and maps. Kibana also includes advanced applications such as Canvas, which allows users to create custom dynamic infographics based on their data, and Elastic Maps for visualizing geospatial data.

Logstash

Logstash, one of the core products of the Elastic Stack, is used to aggregate and process data and send it to Elasticsearch. Logstash is an open source, server-side data processing pipeline that enables you to ingest data from multiple sources simultaneously and enrich and transform it before it is indexed into Elasticsearch.

Place All In Pipeline

The following illustration shows how the components of Elastic Stack interact with each other

In few Words:

FileBeats will collect logs from .log file and sends to buffering or one can directly send to Logstash.
Logstash enhance data within particular formate and sends to Elasticsearch.
Elasticsearch saves data and create inverted index.
Kibana works for visualization and analytics. It simplifies the huge volumes of data and reflects the real-time changes in the Elasticsearch queries.

Logs as Streams of events

Logs are the continuous events of aggregated, time-ordered events collected from the output streams of all running processes and backing services. Logs in their raw form are typically a text format with one event per line (though backtraces from exceptions may span multiple lines). Logs have no fixed beginning or end, but flow continuously as long as the app is operating.

By following this methodology logs can be continuously routed to a file or can be watched in real-time on terminal.

Logging with LogBack & SLF4J

Playing with spring boot applications Logback dependency will be pulled by default and will be used as a default logging mechanism. Logback preferably used with

SLF4J

just like following,

@Slf4j
public class Vrush{
   log.trace("Logging at TRACE level");
   log.debug("Logging at DEBUG level");
   log.info("Logging at INFO level");
   log.warn("Logging at WARN level");
   log.error("Logging at ERROR level");
//or any Parametrized logs
  log.debug("Found {} results", list.size());
}

Logback can be configured in the

logback-spring.xml

file, located under the

resources

folder. In this configuration file, we can take advantage of Spring profiles and the templating features provided by Spring Boot.

Enhancing log in with tracing and Json format

A simple user action might trigger a chain of downstream microservice calls. It would be tedious to trace the logs related to a particular user action across microservices. In addition to that, we might want to track down why a certain microservice call is taking so much time.

One solution to this is at the beginning of the call chain we can create a CORRELATION_ID and add it to all log statements. Along with it, send

CORRELATION_ID

as a header to all the downstream services as well so that those downstream services also use

CORRELATION_ID

in logs. This way we can identify all the log statements related to a particular action across services.

We can implement this solution using MDC feature of Logging frameworks. Typically we will have a WebRequest Interceptor where you can check whether there is a

CORRELATION_ID

header. If there is no

CORRELATION_ID

in the header then create a new one and set it in MDC. The logging frameworks include the information set in MDC with all log statements.

But, instead of doing all this work, we can use Spring Cloud Sleuth which will do all this and much more for us.

<dependency>
   <groupId>org.springframework.cloud</groupId>
   <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>

By adding Spring Cloud Sleuth dependency in pom.xml file, interactions with the downstream services will be instrumented automatically and the trace and span ids will be added to the SLF4J’s Mapped Diagnostic Context (MDC), which will be included in the logs.

Sleuth includes the pattern [appname,traceId,spanId,exportable] in logs from the MDC.

booking-service

logs you can find log statements something like:

2021–01–01 15:31:29.189 WARN [booking-service,903c472a08e5cda0] returning BAD REQUEST response for [email protected] request due Room 7513 is not available

consumer-service

logs you can find log statements something like:

2021–01–01 15:31:29.189 WARN[consumer service,903c472a08e5cda0,af68249ac3a6902] 64712 [io-8181-exec-12]

Observe that TraceID

903c472a08e5cda0

is the same in both booking-service and consumer-service for the same REST API call. This way we can easily correlate the logs across services.

Export logs in JSON

By default, logback stores logs in plain text format. And we are going to store logs in Elasticsearch which operates on JSON index. To accomplish it, we can use the Logstash Logback Encoder or LoggingEventCompositeJsonEncoder.

In general, the microservices will run in Docker containers so that we can leave the responsibility of writing the log files to Docker.

For a simple and quick configuration, we could use LogstashEncoder, which comes with a pre-defined set of providers:

<configuration>
    <springProperty scope="context" name="application_name"    source="spring.application.name"/>
    <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>
    <root level="INFO">
        <appender-ref ref="jsonConsoleAppender"/>
    </root>
    
</configuration>

The above configuration will produce the following log output (just bear in mind that the actual output is a single line, but it’s been formatted below for better visualization):

{
   "@timestamp": "2021-01-01T05:01:38.967+01:00",
   "@version": "1",
   "message": "Finding details of room with id 7052",
   "logger_name": "com.vrush.microservices.booking.service",
   "thread_name": "http-nio-8001-exec-3",
   "level": "WARN",
   "level_value": 20000,
   "application_name": "booking-service",
   "traceId": "c52d9ff782fa8f6e",
   "spanId": "c52d9ff782fa8f6e",
   "spanExportable": "false",
   "X-Span-Export": "false",
   "X-B3-SpanId": "c52d9ff782fa8f6e",
   "X-B3-TraceId": "c52d9ff782fa8f6e"
}

The composite encoder has no providers configured by default, so we must add the providers we want to customize the output. Which you can find on my github page.

It’s time to get your hands dirty :)

Running on Docker

As we will have multiple containers, we will use Docker Compose to manage them. Application’s all services are configured in a

YAML

file. Then, by running a shell script, we create and start all the services from our configuration. All you need to do is hit a

build.sh

And take a Coffee or two!!

In the given example have a look at how the services are defined and configured in the

docker-compose-backend-services.yml

. What’s important to highlight is the fact that labels have been added to some services. Labels are simply metadata that only have meaning for who’s using them. Let’s have a quick looks at the labels that have been defined for the services:

Filebeat will collect the logs produced by the Docker container by adding
```
collect_logs_with_filebeat=true
```
and will autodiscover the Docker containers that have this property -
```
decode_log_event_to_json_object=true.
```
Filebeat collects and stores the log event as a string in the message property of a JSON document.
If the events are logged as JSON, as described in above point that Filebeat should decode the JSON string stored in the message property to an actual JSON object.
Send the log events to Logstash which runs on the port
```
5044
```

When applications run on containers, they become moving targets to the monitoring system. So we’ll use the autodiscover feature from Filebeat, which allows it to track the containers and adapt settings as changes happen. By defining configuration templates, the autodiscover subsystem can monitor services as they start running.

Booking, Financial, Searching, Consumer services will produce logs to the standard output (

stdout

). By default, Docker captures the standard output (and standard error) of all your containers, and writes them to files in JSON format, using the json-file driver. The logs files are stored in the

/var/lib/docker/containers

directory and each log file contains information about only one container. In

filebeat.docker.yml

file, Filebeat is configured as following:

filebeat.autodiscover:
  providers:
    - type: docker
      labels.dedot: true
      templates:
        - condition:
            contains:
              container.labels.collect_logs_with_filebeat: "true"
          config:
            - type: container
              format: docker
              paths:
                -  "/var/lib/docker/containers/${data.docker.container.id}/*.log"
              processors:
                - decode_json_fields:
                    when.equals:
                      docker.container.labels.decode_log_event_to_json_object: "true"
                    fields: ["message"]
                    target: ""
                    overwrite_keys: true
output.logstash:
  hosts: "logstash:5044"

The above configuration uses a single processor. If we need, we could add more processors, which will be chained and executed in the order they are defined in the configuration file. Each processor receives an event, applies a defined action to the event, and the processed event is the input of the next processor until the end of the chain.

Once the log event is collected and processed by Filebeat, it is sent to Logstash, which provides a rich set of plugins for further processing the events.

The Logstash pipeline has two required elements, input and output, and one optional element, filter. The input plugins consume data from a source, the filter plugins modify the data as we specify, and the output plugins write the data to a destination.

In the

logstash.conf

file, Logstash is configured to:

Receive events coming from Beats in the port

. Process the events by adding the tag

logstash_filter_applied

sends the processed events to Elasticsearch which runs on the port

input {
  beats {
    port => 5044
  }
}
filter {
  mutate {
    add_tag => [ "logstash_filter_applied" ]
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}

Elasticsearch will store and index the log events and, finally, we will be able to visualize the logs in Kibana, which exposes a UI in the port

All over but the shouting : Example

Before starting, ensure you at least

Java 11

Maven 3.x

and

Docker

set up. Then clone the repository from GitHub:

git clone https://github.com/vrushofficial/Utter-Microservices-Territory

In given above example there are multiple service — booking, financial, searching, consumer. Each are producing external logs file in specified locations in docker container.

There are multiple

.yml

files created for creating docker images and all these YMLs are configured with docker compose from

build.sh

file.

Visualizing logs in KibanaOpen Kibana in your favourite browser:

http://localhost:5601

. Kibana comes with sample data in case we want to play with it. To explore the data generate by our applications, click the Explore on my own link. On the left hand side, click the Discover icon.

Then pick a field for filtering the data by time. Choose

@timestamp

and click the Create index pattern button.

The index pattern will be created. Click again in the Discover icon and the log events of booking service start up will be shown:

To filter log events from the booking service, for example, enter application_name : "booking-service" in the search box. Click the Update button and now you’ll see log events from the booking service only.

In the left-hand side, there’s a list of fields available. Hover over the list of fields and an Add button will be shown for each field. Add a few fields such as

application_name, trace.trace_id, trace.span_id

and message. Now let’s see how to trace a request. Pick a trace id from the logs and, in the filter box, input

trace.trace_id: "<value>"

where

<value>

is the trace id you want to use as filter criteria. Then click the Update button and you will able to see logs that match that trace id.

Further Notes

Configurations! and hopefully been able to understand and get your basics clear on all the terms and technologies that shall be used to implement the logs tracing application.

If you found this story helpful, please share to help others find it!

Come back here and tell me about the before-and-after. I bet you’ll have something to say!