“Is that it?”
“No. That’s a wall.”
“It could be disguised.”
“You’re not very good at looking for things, are you?”
“I’m good at looking for walls. Look, I found another one.”
― Derek Landy, Kingdom of the Wicked
When a system was monolithic we had access to the full execution stack trace. However, in Microservices architecture, any single operation in any service can trigger a chain of downstream microservice calls, as all are isolated among themselves which leads to a challenging task to debug an actual flow.
And let’s be honest, we all hate those “
something went wrong
” or “unknown
” system errors.Well, this situation can be smoothly handled if we externalize and centralize the storage of our logs. Which I believe will increase our chances of tracking down and fixing issues.
Complete project can be found here :
https://github.com/vrushofficial/Utter-Microservices-Territory
E-L-K Stack
Logs as Streams of events
Example
ELK stack is basically a combination of 4 open source softs for processing log files and storing them at a centralized place. It helps to identify the issues spanning through multiple servers by correlating their logs within a specific time frame.
ELK Stack: Elasticsearch, Logstash, Kibana | Elastic
What is the ELK Stack? The ELK Stack is an acronym for a combination of three widely used open source projects…www.elastic.co
Let’s have a quick look at each component of Elastic Stack.
Elasticsearch is a distributed analytics and search engine built over Apache Lucene, a Java-based search and indexing library. Elasticsearch provides distributed cluster features and sharding for Lucene indexes, advanced search and aggregation functionality, as well as high availability, security, a snapshot/restore module, and other data management features. The platform also provides such features as thread-pooling, node/cluster monitoring API, queues, data monitoring API, cluster management, etc.
Kibana is a data visualization and management tool for Elasticsearch that provides real-time histograms, line graphs, pie charts, and maps. Kibana also includes advanced applications such as Canvas, which allows users to create custom dynamic infographics based on their data, and Elastic Maps for visualizing geospatial data.
Logstash, one of the core products of the Elastic Stack, is used to aggregate and process data and send it to Elasticsearch. Logstash is an open source, server-side data processing pipeline that enables you to ingest data from multiple sources simultaneously and enrich and transform it before it is indexed into Elasticsearch.
The following illustration shows how the components of Elastic Stack interact with each other
In few Words:
Logs are the continuous events of aggregated, time-ordered events collected from the output streams of all running processes and backing services. Logs in their raw form are typically a text format with one event per line (though backtraces from exceptions may span multiple lines). Logs have no fixed beginning or end, but flow continuously as long as the app is operating.
By following this methodology logs can be continuously routed to a file or can be watched in real-time on terminal.
Playing with spring boot applications Logback dependency will be pulled by default and will be used as a default logging mechanism. Logback preferably used with
just like following,SLF4J
@Slf4j
public class Vrush{
log.trace("Logging at TRACE level");
log.debug("Logging at DEBUG level");
log.info("Logging at INFO level");
log.warn("Logging at WARN level");
log.error("Logging at ERROR level");
//or any Parametrized logs
log.debug("Found {} results", list.size());
}
Logback can be configured in the
logback-spring.xml
file, located under the resources
folder. In this configuration file, we can take advantage of Spring profiles and the templating features provided by Spring Boot.A simple user action might trigger a chain of downstream microservice calls. It would be tedious to trace the logs related to a particular user action across microservices. In addition to that, we might want to track down why a certain microservice call is taking so much time.
One solution to this is at the beginning of the call chain we can create a CORRELATION_ID and add it to all log statements. Along with it, send
CORRELATION_ID
as a header to all the downstream services as well so that those downstream services also use CORRELATION_ID
in logs. This way we can identify all the log statements related to a particular action across services.We can implement this solution using MDC feature of Logging frameworks. Typically we will have a WebRequest Interceptor where you can check whether there is a
CORRELATION_ID
header. If there is no CORRELATION_ID
in the header then create a new one and set it in MDC. The logging frameworks include the information set in MDC with all log statements.But, instead of doing all this work, we can use Spring Cloud Sleuth which will do all this and much more for us.
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
By adding Spring Cloud Sleuth dependency in pom.xml file, interactions with the downstream services will be instrumented automatically and the trace and span ids will be added to the SLF4J’s Mapped Diagnostic Context (MDC), which will be included in the logs.
Sleuth includes the pattern [appname,traceId,spanId,exportable] in logs from the MDC.
In
booking-service
logs you can find log statements something like:2021–01–01 15:31:29.189 WARN [booking-service,903c472a08e5cda0] returning BAD REQUEST response for [email protected] request due Room 7513 is not available
In
consumer-service
logs you can find log statements something like:2021–01–01 15:31:29.189 WARN[consumer service,903c472a08e5cda0,af68249ac3a6902] 64712 [io-8181-exec-12]
Observe that TraceID
903c472a08e5cda0
is the same in both booking-service and consumer-service for the same REST API call. This way we can easily correlate the logs across services.By default, logback stores logs in plain text format. And we are going to store logs in Elasticsearch which operates on JSON index. To accomplish it, we can use the Logstash Logback Encoder or LoggingEventCompositeJsonEncoder.
In general, the microservices will run in Docker containers so that we can leave the responsibility of writing the log files to Docker.
For a simple and quick configuration, we could use LogstashEncoder, which comes with a pre-defined set of providers:
<configuration>
<springProperty scope="context" name="application_name" source="spring.application.name"/>
<appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
</appender>
<root level="INFO">
<appender-ref ref="jsonConsoleAppender"/>
</root>
</configuration>
The above configuration will produce the following log output (just bear in mind that the actual output is a single line, but it’s been formatted below for better visualization):
{
"@timestamp": "2021-01-01T05:01:38.967+01:00",
"@version": "1",
"message": "Finding details of room with id 7052",
"logger_name": "com.vrush.microservices.booking.service",
"thread_name": "http-nio-8001-exec-3",
"level": "WARN",
"level_value": 20000,
"application_name": "booking-service",
"traceId": "c52d9ff782fa8f6e",
"spanId": "c52d9ff782fa8f6e",
"spanExportable": "false",
"X-Span-Export": "false",
"X-B3-SpanId": "c52d9ff782fa8f6e",
"X-B3-TraceId": "c52d9ff782fa8f6e"
}
The composite encoder has no providers configured by default, so we must add the providers we want to customize the output. Which you can find on my github page.
Running on Docker
As we will have multiple containers, we will use Docker Compose to manage them. Application’s all services are configured in a
YAML
file. Then, by running a shell script, we create and start all the services from our configuration. All you need to do is hit a build.sh
.And take a Coffee or two!!
In the given example have a look at how the services are defined and configured in the
. What’s important to highlight is the fact that labels have been added to some services. Labels are simply metadata that only have meaning for who’s using them. Let’s have a quick looks at the labels that have been defined for the services:docker-compose-backend-services.yml
collect_logs_with_filebeat=true
and will autodiscover the Docker containers that have this property - decode_log_event_to_json_object=true.
5044
When applications run on containers, they become moving targets to the monitoring system. So we’ll use the autodiscover feature from Filebeat, which allows it to track the containers and adapt settings as changes happen. By defining configuration templates, the autodiscover subsystem can monitor services as they start running.
Booking, Financial, Searching, Consumer services will produce logs to the standard output (
stdout
). By default, Docker captures the standard output (and standard error) of all your containers, and writes them to files in JSON format, using the json-file driver. The logs files are stored in the /var/lib/docker/containers
directory and each log file contains information about only one container. In filebeat.docker.yml
file, Filebeat is configured as following:filebeat.autodiscover:
providers:
- type: docker
labels.dedot: true
templates:
- condition:
contains:
container.labels.collect_logs_with_filebeat: "true"
config:
- type: container
format: docker
paths:
- "/var/lib/docker/containers/${data.docker.container.id}/*.log"
processors:
- decode_json_fields:
when.equals:
docker.container.labels.decode_log_event_to_json_object: "true"
fields: ["message"]
target: ""
overwrite_keys: true
output.logstash:
hosts: "logstash:5044"
The above configuration uses a single processor. If we need, we could add more processors, which will be chained and executed in the order they are defined in the configuration file. Each processor receives an event, applies a defined action to the event, and the processed event is the input of the next processor until the end of the chain.
Once the log event is collected and processed by Filebeat, it is sent to Logstash, which provides a rich set of plugins for further processing the events.
The Logstash pipeline has two required elements, input and output, and one optional element, filter. The input plugins consume data from a source, the filter plugins modify the data as we specify, and the output plugins write the data to a destination.
In the
file, Logstash is configured to:logstash.conf
Receive events coming from Beats in the port
5044
. Process the events by adding the tag logstash_filter_applied
sends the processed events to Elasticsearch which runs on the port 9200
input {
beats {
port => 5044
}
}
filter {
mutate {
add_tag => [ "logstash_filter_applied" ]
}
}
output {
elasticsearch {
hosts => "elasticsearch:9200"
}
}
Elasticsearch will store and index the log events and, finally, we will be able to visualize the logs in Kibana, which exposes a UI in the port
5601
.Before starting, ensure you at least
Java 11
, Maven 3.x
and Docker
set up. Then clone the repository from GitHub:git clone https://github.com/vrushofficial/Utter-Microservices-Territory
In given above example there are multiple service — booking, financial, searching, consumer. Each are producing external logs file in specified locations in docker container.
There are multiple
.yml
files created for creating docker images and all these YMLs are configured with docker compose from build.sh
file.Visualizing logs in KibanaOpen Kibana in your favourite browser:
. Kibana comes with sample data in case we want to play with it. To explore the data generate by our applications, click the Explore on my own link. On the left hand side, click the Discover icon.http://localhost:5601
Then pick a field for filtering the data by time. Choose
@timestamp
and click the Create index pattern button.The index pattern will be created. Click again in the Discover icon and the log events of booking service start up will be shown:
To filter log events from the booking service, for example, enter application_name : "booking-service" in the search box. Click the Update button and now you’ll see log events from the booking service only.
In the left-hand side, there’s a list of fields available. Hover over the list of fields and an Add button will be shown for each field. Add a few fields such as
application_name, trace.trace_id, trace.span_id
and message. Now let’s see how to trace a request. Pick a trace id from the logs and, in the filter box, input trace.trace_id: "<value>"
where <value>
is the trace id you want to use as filter criteria. Then click the Update button and you will able to see logs that match that trace id. Configurations! and hopefully been able to understand and get your basics clear on all the terms and technologies that shall be used to implement the logs tracing application.
If you found this story helpful, please share to help others find it!
Come back here and tell me about the before-and-after. I bet you’ll have something to say!