In this blog post, you will learn how to build a complete open-source solution for extracting and shipping traces, metrics, and logs, and for correlating them with one another. The proposed solution uses open-source tools: Grafana, Prometheus, Tempo, and Loki as the observability backend stack, and Odigos as an observability control plane.
If you are new to observability, or just interested in the difference between monitoring and observability, we recommend watching this short video by the creator of OpenTelemetry. In short, distributed traces, metrics, and logs, with the ability to correlate one signal to another, are the best practice for debugging production issues when working with microservices-based applications. This is exactly what we are going to achieve for our demo application.
There is no need to learn any new technologies in order to implement and enjoy observability. With some basic Kubernetes commands, you are ready to get started.
We are going to deploy 3 different systems on our Kubernetes cluster: the demo application (a fork of Google's Bank of Anthos), an observability backend (Grafana, Tempo, Prometheus, and Loki), and Odigos as the observability control plane.
The following tools are required to run this tutorial: kind, kubectl, and Helm.
Create a new local Kubernetes cluster, by running the following command:
kind create cluster
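To verify that the cluster is up, you can run the following (assuming the default cluster name, which gives a kind-kind kubectl context):
kubectl cluster-info --context kind-kind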
We will install a fork of Bank of Anthos, a sample bank application created by Google. We use a modified version with all instrumentation code removed to demonstrate how Odigos automatically collects observability data from the application.
Deploy the application using the following command:
kubectl apply -f https://raw.githubusercontent.com/keyval-dev/bank-of-athnos/main/release/kubernetes-manifests.yaml
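You can optionally confirm that the application workloads were created (it may take a minute or two for all of them to start):
kubectl get pods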
As there is currently no one database that can store traces, logs, and metrics, we will deploy three different databases alongside Grafana as a visualization tool.
The following Helm chart deploys Tempo (traces database), Prometheus (metrics database), and Loki (logs database), as well as a preconfigured Grafana instance with those databases set up as data sources. Install the Helm chart by executing:
helm install --repo https://keyval-dev.github.io/charts observability oss-observability --namespace observability --create-namespace
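Before moving on, you may want to check that the backend components came up:
kubectl get pods -n observability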
Now that our test application is running and our observability databases are deployed and ready to receive data, the last piece of the puzzle is to extract and ship logs, metrics, and traces from our applications to those databases. The simplest way to do this is with Odigos, a control plane for observability data. Install Odigos via the Helm chart by executing the following commands:
helm repo add odigos https://keyval-dev.github.io/odigos-charts/
helm install my-odigos odigos/odigos --namespace odigos-system --create-namespace
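You can verify that the Odigos components were deployed by running:
kubectl get pods -n odigos-system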
After all the pods in the odigos-system namespace are running, open the Odigos UI by running the following command:
kubectl port-forward svc/odigos-ui 3000:3000 -n odigos-system
And navigate to http://localhost:3000 to access the UI.
There are two ways to select which applications Odigos should instrument: opt-in, where you explicitly choose the applications to instrument, and opt-out, where every application is instrumented unless you explicitly exclude it. For this tutorial, we recommend choosing the opt-out mode.
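If you prefer the command line over the UI, Odigos also lets you control instrumentation with a Kubernetes label. The exact label key and values may vary between Odigos versions, so treat this as a sketch; for example, to exclude a specific deployment while running in opt-out mode:
kubectl label deployment <deployment-name> odigos-instrumentation=disabled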
The next step is to tell Odigos how to reach the three databases that we deployed earlier. Add the following three destinations:
To add another destination, select Destinations from the sidebar and click Add New Destination.
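The exact URLs to enter depend on the service names created by the Helm chart; an easy way to find them is to list the services in the observability namespace and use their cluster-internal addresses (for example, service-name.observability:port) as the destination endpoints:
kubectl get svc -n observability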
Wait a few seconds for Odigos to finish deploying the required collectors and instrumenting the target applications. You can monitor the progress by running:
kubectl get pods -w
Wait for all the pods to reach the Running state (note in particular the transaction service, which has a slow startup time).
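Alternatively, instead of watching the pods interactively, you can block until all pods in the current namespace are ready (the timeout below is an arbitrary choice):
kubectl wait --for=condition=Ready pods --all --timeout=300s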
The last step is to explore our observability data in Grafana. We can now see and correlate metrics to traces to logs in order to dive deeply into how our application behaves.
Port forward to your Grafana instance by running:
kubectl port-forward svc/observability-grafana -n observability 3000:80
Then navigate to http://localhost:3000. Retrieve the Grafana admin password by running:
kubectl get secret -n observability observability-grafana -o jsonpath="{.data.admin-password}" | base64 --decode
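Note that the Odigos UI port-forward from earlier also uses local port 3000; if it is still running, either stop it or forward Grafana to a different local port, for example:
kubectl port-forward svc/observability-grafana -n observability 3001:80
and then use http://localhost:3001 instead.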
Let’s start by viewing a service graph of our microservices application:
Now let's view some metrics. Click on the contacts node in the service graph and choose Request rate.
A graph similar to the following should be presented:
There are many more metrics that Odigos collects, and they can easily be queried from the Prometheus data source; check out this document for the full list.
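Beyond the prebuilt panels, you can also query the metrics yourself from Grafana's Explore view using PromQL. The metric and label names below are hypothetical and depend on how the span-metrics pipeline is configured in your installation; a request-rate query for the contacts service might look something like:
sum(rate(calls_total{service_name="contacts"}[5m]))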
Click on the contacts application again in the Service Graph, but this time choose Request Histogram. In order to correlate metrics to traces, we will use a feature called exemplars. To show exemplars, enable the Exemplars toggle in the options of the Prometheus query.
Hover over one of the exemplar points and click Query With Tempo. A trace similar to the following should be presented:
In this trace, you can see exactly how much time each part of the entire request took. Digging into one of the sections will show additional information such as database queries.
To further investigate specific actions, you can query the relevant logs by clicking the small document icon. Click the document icon next to the balancereader span to show the relevant logs:
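If you want to query the logs directly, you can also use LogQL against the Loki data source in Grafana's Explore view. The label name below is hypothetical and depends on how the collector labels the log streams in your installation:
{container="balancereader"}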
We have shown how easy it is to extract and ship logs, traces, and metrics using only open-source solutions, and we were able to generate all three signals from an application within minutes. We can also correlate between the different signals: we correlated metrics to traces and traces to logs. We now have all the data needed to quickly detect and fix production issues in our target applications.
Note that the observability backend we installed is not suited for production use. For high volumes of data, it is recommended to persist those databases to object storage such as S3 or to use a managed offering.