“Is that it?” “No. That’s a wall.” “It could be disguised.” “You’re not very good at looking for things, are you?” “I’m good at looking for walls. Look, I found another one.” ― Derek Landy, Kingdom of the Wicked hen a system was monolithic we had access to the full execution stack trace. However, in Microservices architecture, any single operation in any service can trigger a chain of downstream microservice calls, as all are isolated among themselves which leads to a challenging task to debug an actual flow. W And let’s be honest, we all hate those “ ” or “ ” system errors. something went wrong unknown Well, this situation can be smoothly handled if we externalize and centralize the storage of our logs. Which I believe will increase our chances of tracking down and fixing issues. Complete project can be found here : https: //github.com/vrushofficial/Utter-Microservices-Territory Checklist E-L-K Stack ElasticSearch Kibana Beats LogStash Logs as Streams of events Logging with LogBack & SLF4J Enhancing log in with tracing and JSONformat Example Playing with Docker images Spinning docker containers Investigation of logs on Kibana server E-L-K Stack ELK stack is basically a combination of 4 open source softs for processing log files and storing them at a centralized place. It helps to identify the issues spanning through multiple servers by correlating their logs within a specific time frame. www.elastic.co ELK Stack: Elasticsearch, Logstash, Kibana | Elastic What is the ELK Stack? The ELK Stack is an acronym for a combination of three widely used open source projects… Let’s have a quick look at each component of Elastic Stack. Elasticsearch Elasticsearch is a distributed analytics and search engine built over , a Java-based search and indexing library. Elasticsearch provides distributed cluster features and sharding for Lucene indexes, advanced search and aggregation functionality, as well as high availability, security, a snapshot/restore module, and other data management features. The platform also provides such features as thread-pooling, node/cluster monitoring API, queues, data monitoring API, cluster management, etc. Apache Lucene Kibana Kibana is a data visualization and management tool for Elasticsearch that provides real-time histograms, line graphs, pie charts, and maps. Kibana also includes advanced applications such as Canvas, which allows users to create custom dynamic infographics based on their data, and Elastic Maps for visualizing geospatial data. Logstash Logstash, one of the core products of the Elastic Stack, is used to aggregate and process data and send it to Elasticsearch. Logstash is an open source, server-side data processing pipeline that enables you to ingest data from multiple sources simultaneously and enrich and transform it before it is indexed into Elasticsearch. Place All In Pipeline The following illustration shows how the components of Elastic Stack interact with each other In few Words: FileBeats will collect logs from .log file and sends to buffering or one can directly send to Logstash. Logstash enhance data within particular formate and sends to Elasticsearch. Elasticsearch saves data and create inverted index. Kibana works for visualization and analytics. It simplifies the huge volumes of data and reflects the real-time changes in the Elasticsearch queries. Logs as Streams of events Logs are the continuous events of aggregated, time-ordered events collected from the output streams of all running processes and backing services. Logs in their raw form are typically a text format with one event per line (though backtraces from exceptions may span multiple lines). Logs have no fixed beginning or end, but flow continuously as long as the app is operating. By following this methodology logs can be continuously routed to a file or can be watched in real-time on terminal. Logging with LogBack & SLF4J Playing with spring boot applications Logback dependency will be pulled by default and will be used as a default logging mechanism. Logback preferably used with just like following, SLF4J @Slf4j
public {
   log.trace( );
   log.debug( );
   log.info( );
   log.warn( );
   log.error( ); log.debug( , list.size());
} class Vrush "Logging at TRACE level" "Logging at DEBUG level" "Logging at INFO level" "Logging at WARN level" "Logging at ERROR level" //or any Parametrized logs "Found {} results" Logback can be in the file, located under the folder. In this configuration file, we can take advantage of Spring profiles and the templating features provided by Spring Boot. configured logback-spring.xml resources Enhancing log in with tracing and Json format A simple user action might trigger a chain of downstream microservice calls. It would be tedious to trace the logs related to a particular user action across microservices. In addition to that, we might want to track down why a certain microservice call is taking so much time. One solution to this is at the beginning of the call chain we can create a and add it to all log statements. Along with it, send as a header to all the downstream services as well so that those downstream services also use in logs. This way we can identify all the log statements related to a particular action across services. CORRELATION_ID CORRELATION_ID CORRELATION_ID We can implement this solution using feature of Logging frameworks. Typically we will have a WebRequest Interceptor where you can check whether there is a header. If there is no in the header then create a new one and set it in MDC. The logging frameworks include the information set in MDC with all log statements. MDC CORRELATION_ID CORRELATION_ID But, instead of doing all this work, we can use which will do all this and much more for us. Spring Cloud Sleuth <dependency> <artifactId>spring-cloud-starter-sleuth< dependency> org.springframework.cloud < > groupId </ > groupId /artifactId>
</ By adding Spring Cloud Sleuth dependency in pom.xml file, interactions with the downstream services will be instrumented automatically and the trace and span ids will be added to the SLF4J’s (MDC), which will be included in the logs. Mapped Diagnostic Context Sleuth includes the pattern in logs from the MDC. [appname,traceId,spanId,exportable] In logs you can find log statements something like: booking-service – – : : WARN [booking-service, c472a08e5cda0] returning BAD REQUEST response vrushofficial@gmail.com request due Room is not available 2021 01 01 15 31 29.189 903 for 7513 In logs you can find log statements something like: consumer-service – – : : WARN[consumer service, c472a08e5cda0,af68249ac3a6902] [io -exec ] 2021 01 01 15 31 29.189 903 64712 -8181 -12 Observe that TraceID is the same in both booking-service and consumer-service for the same REST API call. This way we can easily correlate the logs across services. 903c472a08e5cda0 Export logs in JSON By default, logback stores logs in plain text format. And we are going to store logs in Elasticsearch which operates on JSON index. To accomplish it, we can use the or LoggingEventCompositeJsonEncoder. Logstash Logback Encoder In general, the microservices will run in Docker containers so that we can leave the responsibility of writing the log files to Docker. For a simple and quick configuration, we could use LogstashEncoder, which comes with a : pre-defined set of providers <configuration> <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>
    <root level="INFO">
        <appender-ref ref="jsonConsoleAppender"/>
    </root>
    
</configuration> < = = = /> springProperty scope "context" name "application_name" source "spring.application.name" The above configuration will produce the following log output (just bear in mind that the actual output is a single line, but it’s been formatted below for better visualization): { : , : , : , : , : , : , : , : , : , : , : , : , : , : } "@timestamp" "2021-01-01T05:01:38.967+01:00" "@version" "1" "message" "Finding details of room with id 7052" "logger_name" "com.vrush.microservices.booking.service" "thread_name" "http-nio-8001-exec-3" "level" "WARN" "level_value" 20000 "application_name" "booking-service" "traceId" "c52d9ff782fa8f6e" "spanId" "c52d9ff782fa8f6e" "spanExportable" "false" "X-Span-Export" "false" "X-B3-SpanId" "c52d9ff782fa8f6e" "X-B3-TraceId" "c52d9ff782fa8f6e" The composite encoder has no providers configured by default, so we must add the we want to customize the output. Which you can find on my page. providers github It’s time to get your hands dirty :) Running on Docker As we will have multiple containers, we will use to manage them. Application’s all services are configured in a file. Then, by running a shell script, we create and start all the services from our configuration. All you need to do is hit a Docker Compose YAML build.sh . And take a Coffee or two!! In the given example have a look at how the services are defined and configured in the . What’s important to highlight is the fact that have been added to some services. Labels are simply metadata that only have meaning for who’s using them. Let’s have a quick looks at the labels that have been defined for the services: docker-compose-backend-services.yml labels Filebeat will collect the logs produced by the Docker container by adding and will autodiscover the Docker containers that have this property - collect_logs_with_filebeat=true decode_log_event_to_json_object=true. Filebeat collects and stores the log event as a string in the message property of a JSON document. If the events are logged as JSON, as described in above point that Filebeat should decode the JSON string stored in the message property to an actual JSON object. Send the log events to Logstash which runs on the port 5044 When applications run on containers, they become moving targets to the monitoring system. So we’ll use the feature from Filebeat, which allows it to track the containers and adapt settings as changes happen. By defining configuration templates, the autodiscover subsystem can monitor services as they start running. autodiscover Booking, Financial, Searching, Consumer services will produce logs to the standard output ( ). By default, Docker captures the standard output (and standard error) of all your containers, and writes them to files in JSON format, using the json-file driver. The logs files are stored in the directory and each log file contains information about only one container. In file, Filebeat is configured as following: stdout /var/lib/docker/containers filebeat.docker.yml filebeat.autodiscover:
  providers:
    - type: docker
      labels.dedot: templates:
        - condition:
            contains:
              container.labels.collect_logs_with_filebeat: config:
            - type: container
              format: docker
              paths:
                - processors:
                - decode_json_fields:
                    when.equals:
                      docker.container.labels.decode_log_event_to_json_object: fields: [ ]
                    target: overwrite_keys: output.logstash:
  hosts: true "true" "/var/lib/docker/containers/${data.docker.container.id}/*.log" "true" "message" "" true "logstash:5044" The above configuration uses a single processor. If we need, we could add more processors, which will be chained and executed in the order they are defined in the configuration file. Each processor receives an event, applies a defined action to the event, and the processed event is the input of the next processor until the end of the chain. Once the log event is collected and processed by Filebeat, it is sent to Logstash, which provides a rich set of plugins for further processing the events. The Logstash pipeline has two required elements, input and output, and one optional element, filter. The consume data from a source, the modify the data as we specify, and the write the data to a destination. input plugins filter plugins output plugins In the file, Logstash is configured to: logstash.conf Receive events coming from Beats in the port . Process the events by adding the tag sends the processed events to Elasticsearch which runs on the port 5044 logstash_filter_applied 9200 input {
  beats {
    port => }
}
filter {
  mutate {
    add_tag => [ ]
  }
}
output {
  elasticsearch {
    hosts => }
} 5044 "logstash_filter_applied" "elasticsearch:9200" Elasticsearch will store and index the log events and, finally, we will be able to visualize the logs in Kibana, which exposes a UI in the port . 5601 All over but the shouting : Example Before starting, ensure you at least , and set up. Then clone the repository from GitHub: Java 11 Maven 3.x Docker git clone https: //github.com/vrushofficial/Utter-Microservices-Territory In given above example there are multiple service — booking, financial, searching, consumer. Each are producing external logs file in specified locations in docker container. There are multiple files created for creating docker images and all these YMLs are configured with docker compose from file. .yml build.sh Visualizing logs in KibanaOpen Kibana in your favourite browser: . Kibana comes with sample data in case we want to play with it. To explore the data generate by our applications, click the link. On the left hand side, click the icon. http://localhost:5601 Explore on my own Discover Then pick a field for filtering the data by time. Choose and click the button. @timestamp Create index pattern The index pattern will be created. Click again in the icon and the log events of booking service start up will be shown: Discover To filter log events from the booking service, for example, enter application_name : "booking-service" in the search box. Click the button and now you’ll see log events from the booking service only. Update In the left-hand side, there’s a list of fields available. Hover over the list of fields and an button will be shown for each field. Add a few fields such as and message. Now let’s see how to trace a request. Pick a trace id from the logs and, in the filter box, input where is the trace id you want to use as filter criteria. Then click the button and you will able to see logs that match that trace id. Add application_name, trace.trace_id, trace.span_id trace.trace_id: "<value>" <value> Update Further Notes Configurations! and hopefully been able to understand and get your basics clear on all the terms and technologies that shall be used to implement the logs tracing application. If you found this story helpful, please share to help others find it! Come back here and tell me about the before-and-after. I bet you’ll have something to say! Hasta La Vista Baby!

Flow

Chain

Apache

Aggregate Logs with Elasticsearch, Kibana, Logstash & Docker

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

Untitled Story

Why Dockerizing Applications is the Key to Building Scalable Software

10 Free Courses to Learn Docker for Programmers and DevOps Engineers

10 Free Courses to learn Docker and DevOps for Programmers and Software Developers

10 DevOps Conferences You Should Be Attending in 2020

10 disruptive DevOps trends of 2018

Why Dockerizing Applications is the Key to Building Scalable Software

10 Free Courses to Learn Docker for Programmers and DevOps Engineers

10 Free Courses to learn Docker and DevOps for Programmers and Software Developers

10 DevOps Conferences You Should Be Attending in 2020

10 disruptive DevOps trends of 2018

Light-Mode

Classic

Newspaper

Dark-Mode

Neon Noir

Minty

HN StartUps