Custom TraceID in Elastic APM

Written by thamizh | Published 2020/08/01

TLDR: Elastic APM is useful for monitoring the lifecycle of a request in a system, especially in a µservices architecture. Golang is used in this article for code snippets, but the concept can be extended to other languages as well. Elastic APM supports distributed tracing and is OpenTracing compliant. The idea is to create a custom trace ID and expose it via a response header. The extracted trace ID can be logged and then used to find the request in the APM dashboard. For example, I have used Traceid as the response header.

Elastic APM is useful for monitoring the lifecycle of an HTTP request in a system, especially in a µservices architecture. It supports a wide variety of web frameworks and databases, which helps in tracking a request all the way down to its DB calls. The documentation is simple and concise, which makes it easy to instrument an application.

This article aims to make it easier to trace the lifecycle of an HTTP request once the application is instrumented. Golang is used in this article for code snippets, but the concept can be extended to other languages as well.
After instrumentation, the APM dashboard of our service shows a transaction distribution similar to the image below.
Suppose there are eleven requests in a given time window. How do we find one particular request among them?
A common technique for tracing a request in a microservices architecture is a correlation ID.
So is there anything similar to a correlation ID in Elastic APM?
Well, sort of.
Elastic APM supports distributed tracing and is OpenTracing compliant. Following the W3C Trace Context standard, an HTTP header named Traceparent is passed with each request, which makes distributed tracing possible. Its format is defined by W3C; the layout and an example are shown below.
"version"-"trace-id"-"span-id"-"trace-flags" 
00-cfcc5fb87332693caae03bcdf41832f8-25d06f33a2fd577a-01    
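To make the four fields concrete, here is a minimal sketch that splits a Traceparent value using only the Go standard library. The names TraceParent and ParseTraceParent are hypothetical helpers written for this illustration, not part of the Elastic APM agent, and the parser only handles the version 00 layout shown above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// TraceParent holds the four dash-separated fields of a Traceparent value.
// This type is a hypothetical helper for illustration only.
type TraceParent struct {
	Version string // 2 hex characters
	TraceID string // 32 hex characters
	SpanID  string // 16 hex characters
	Flags   string // 2 hex characters
}

// isLowerHex reports whether s contains only the characters 0-9 and a-f.
func isLowerHex(s string) bool {
	for _, c := range s {
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// ParseTraceParent splits a header value into its fields and checks
// the length and character set of each one.
func ParseTraceParent(h string) (TraceParent, error) {
	parts := strings.Split(strings.TrimSpace(h), "-")
	if len(parts) != 4 {
		return TraceParent{}, errors.New("traceparent: expected 4 fields")
	}
	want := []int{2, 32, 16, 2}
	for i, p := range parts {
		if len(p) != want[i] || !isLowerHex(p) {
			return TraceParent{}, fmt.Errorf("traceparent: bad field %d", i)
		}
	}
	// An all-zero trace ID or span ID is not valid.
	if strings.Trim(parts[1], "0") == "" || strings.Trim(parts[2], "0") == "" {
		return TraceParent{}, errors.New("traceparent: all-zero ID")
	}
	return TraceParent{
		Version: parts[0],
		TraceID: parts[1],
		SpanID:  parts[2],
		Flags:   parts[3],
	}, nil
}

func main() {
	tp, err := ParseTraceParent("00-cfcc5fb87332693caae03bcdf41832f8-25d06f33a2fd577a-01")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(tp.TraceID) // prints cfcc5fb87332693caae03bcdf41832f8
}
```

In an instrumented application the agent does this parsing itself; the sketch is only meant to show which part of the header is the trace ID.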
The trace ID plays the same role as a correlation ID, and it is unique to a request.
Searching the APM dashboard by trace ID returns that single request, as shown below.
So how do we get this trace ID?
1. Using RUM (Real User Monitoring). This JavaScript agent is used to instrument the frontend. It sends the Traceparent header with every request made from the frontend, so the trace ID can be read from that header in the browser's developer tools.
2. From the application, we can extract the trace ID as shown in the code snippet below. The extracted trace ID can be logged and then used to find the request in the APM dashboard.
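import (
	"fmt"
	"net/http"

	"go.elastic.co/apm"
)

// from new transaction
func Foo() {
	apmTx := apm.DefaultTracer.StartTransaction("transaction_name", "transaction_type")
	defer apmTx.End()
	fmt.Println(apmTx.TraceContext().Trace.String()) // prints TraceID
}

// from incoming http request
func FooHandler(w http.ResponseWriter, r *http.Request) {
	apmTx := apm.TransactionFromContext(r.Context())
	if apmTx == nil {
		return // the request was not instrumented, so there is no transaction
	}
	fmt.Println(apmTx.TraceContext().Trace.String()) // prints TraceID
	// The instrumentation middleware that started this transaction also
	// ends it, so End is not called here.
}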
Then why a custom trace ID?
What if my application doesn't have a frontend? How do I trace requests that reach my API server from third parties? How do I debug an issue in a particular µservice? What if I want to instrument my µservices gradually but still track my requests?
The idea is to create a custom trace ID and expose it via a response header. Using that header, we can search for that particular request in the APM dashboard. I have used Traceid as the response header.
The custom trace ID is generated by a middleware that wraps incoming HTTP requests. If a trace ID is already present in the request, it is written to the outgoing response header. If not, a new one is created.

The new trace ID is created from a UUID. The span ID is the last 8 bytes of the trace ID, and the trace options must be set to recorded so that the Elastic APM server tracks the request. The result is then converted to the Traceparent header format and attached to the request.
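The ID construction on its own can be sketched with only the standard library. This is an illustration under assumptions: it draws 16 random bytes from crypto/rand instead of using a UUID package, and NewTraceParent and FormatTraceParent are hypothetical names, not agent APIs. The trailing 01 is the trace-flags field with the recorded bit set.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// FormatTraceParent renders a 16-byte ID as a version 00 Traceparent value.
// The span ID is taken from the last 8 bytes of the trace ID, and the
// flags field is fixed to 01 (recorded).
func FormatTraceParent(id [16]byte) string {
	traceID := hex.EncodeToString(id[:])
	spanID := hex.EncodeToString(id[8:])
	return fmt.Sprintf("00-%s-%s-01", traceID, spanID)
}

// NewTraceParent generates a random 16-byte ID (an assumption standing in
// for a UUID) and returns the header value together with the trace ID.
func NewTraceParent() (header, traceID string, err error) {
	var id [16]byte
	if _, err = rand.Read(id[:]); err != nil {
		return "", "", err
	}
	return FormatTraceParent(id), hex.EncodeToString(id[:]), nil
}

func main() {
	header, traceID, err := NewTraceParent()
	if err != nil {
		fmt.Println("could not generate trace ID:", err)
		return
	}
	fmt.Println("Traceparent:", header)
	fmt.Println("Traceid:    ", traceID)
}
```

Deriving the span ID from the trace ID keeps the sketch close to the approach described above; a fully independent random span ID would work just as well for this purpose.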
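import (
	"net/http"

	"github.com/google/uuid" // assumed UUID package; any 16-byte UUID type works
	"go.elastic.co/apm"
	"go.elastic.co/apm/module/apmhttp"
)

// TraceID is the name of the response header that carries the trace ID.
const TraceID = "Traceid"

// SetTraceID reuses the trace ID from an incoming Traceparent header or
// creates a new one, and exposes it in the response header.
func SetTraceID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var traceID apm.TraceID
		if values := r.Header[apmhttp.W3CTraceparentHeader]; len(values) == 1 && values[0] != "" {
			if c, err := apmhttp.ParseTraceparentHeader(values[0]); err == nil {
				traceID = c.Trace
			}
		}
		// A zero trace ID fails validation: no usable ID came with the request.
		if err := traceID.Validate(); err != nil {
			id := uuid.New() // named id so the uuid package is not shadowed
			var spanID apm.SpanID
			var traceOptions apm.TraceOptions
			copy(traceID[:], id[:])
			copy(spanID[:], traceID[8:]) // span ID = last 8 bytes of the trace ID
			traceContext := apm.TraceContext{
				Trace:   traceID,
				Span:    spanID,
				Options: traceOptions.WithRecorded(true),
			}
			r.Header.Set(apmhttp.W3CTraceparentHeader, apmhttp.FormatTraceparentHeader(traceContext))
		}

		w.Header().Set(TraceID, traceID.String())
		next.ServeHTTP(w, r)
	})
}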
The code and examples are available on GitHub.
With this method, requests from clients like Postman can also be traced, as shown below.

Postman:
APM Dashboard:
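The same check can be scripted instead of done by hand in Postman. The sketch below is a simplified stand-in, not the middleware above: echoTraceID, FetchTraceID and demo are hypothetical names, the handler uses no APM agent, and it only shows a client reading the Traceid response header from a local test server.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const traceIDHeader = "Traceid"

// echoTraceID is a stdlib-only stand-in for the SetTraceID middleware:
// it reuses the trace ID from Traceparent or generates a random one.
func echoTraceID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var traceID string
		parts := strings.Split(r.Header.Get("Traceparent"), "-")
		if len(parts) == 4 && len(parts[1]) == 32 {
			traceID = parts[1]
		} else {
			var b [16]byte
			if _, err := rand.Read(b[:]); err != nil {
				http.Error(w, "cannot generate trace ID", http.StatusInternalServerError)
				return
			}
			traceID = hex.EncodeToString(b[:])
		}
		w.Header().Set(traceIDHeader, traceID)
		next.ServeHTTP(w, r)
	})
}

// FetchTraceID calls url, optionally sending a Traceparent header, and
// returns the value of the Traceid response header.
func FetchTraceID(url, traceparent string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	if traceparent != "" {
		req.Header.Set("Traceparent", traceparent)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Header.Get(traceIDHeader), nil
}

// demo starts a throwaway local server and fetches the trace ID from it.
func demo(traceparent string) (string, error) {
	srv := httptest.NewServer(echoTraceID(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { fmt.Fprintln(w, "ok") })))
	defer srv.Close()
	return FetchTraceID(srv.URL, traceparent)
}

func main() {
	id, err := demo("")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("trace ID to search for in the APM dashboard:", id)
}
```

Against a real service, FetchTraceID would be pointed at the service URL, and the returned value is the ID to paste into the dashboard search.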

Published by HackerNoon on 2020/08/01