Derek Haynes

@dlite

Beyond `tail -f`: powering up your sleuthing skills with Logstash+Kibana

In every episode of Sherlock, there’s a scene where our namesake character rapidly breaks down a complex web of tangled, seemingly unrelated observations into a single clear analysis. Just like Sherlock, you’ve likely stared into the abyss of a seemingly untraceable bug, exception, or performance problem.

Often, they look like this:

  1. The app breaks for only a subset of customers
  2. The app is slow only during specific times of the day
  3. Only one database shard is exceptionally slow
  4. Only one app host is experiencing a problem

Digging through logs via tail, awk, grep, etc. for clues is a tedious and frequently futile effort. Fortunately, there is a way to quickly breakdown a complex problem with Sherlock-like authority: Kibana and the rest of the ELK stack.

Let’s setup an investigation super platform for a Rails app.

An overview of a Rails / Kibana stack

We’ll utilize a couple open-source external services to process and visualize our data: Logstash, Elasticsearch, and Kibana. Within our Rails app, we’ll need to log data in a Logstash-compatible format. Finally, we’ll need to ship the data to Logstash.

Let’s break these parts down.

The ELK stack

We’ll send our log data to Logstash. Logstash will extract data from our Rails logs (controller actions, response times, hostnames, IP addresses, and more) and store it in Elasticsearch.

We’ll use Kibana to query and visualize the log data stored in Elasticsearch.

Together, Elasticsearch, Logstash, and Kibana are referred to as the ELK stack. These services should be installed on a dedicated host: processing and querying event data is resource-intensive.

At Scout, we’ve used an AWS m4.xlarge to host the ELK stack. This gives us 7 days of logs for $215/mo.

For a small organization, $215 isn’t a trivial expense. However, we’ll shortly see how Kibana can track down problems in ways tail -f, awk, and grep can’t.

Shameless Promotion: Want an investigative platform without the moving parts? I’m helping develop an approachable, hosted Kibana alternative. Signup for early access.

Rails App Dependencies

We’ll install three gems to log events in a Logstash-compatible format.

Gem #1: Lograge. Lograge lets us shrink the large number of lines generated by the default Rails logger to log a single request into a single line:

method=GET path=/jobs/833552.json format=json controller=JobsController  action=show status=200 duration=58.33 view=40.43 db=15.26

Lograge supports multiple output formats, one of which is Logstash. Here’s how a single request looks in the Logstash format:

To use the Logstash format with Lograge, install gem #2: logstash-event.

Finally, gem #3:logstash-logger. This lets us log every message in a Logstash-compatible format. By themselves, lograge+logstash-event only log request events in Logstash format.

Rails Hosts

Finally, we need to take the logged data and ship it to Logstash. We’ll use node-logstash to do this. node-logstash is a NodeJS service that needs to be installed on each host our Rails app runs on.

Here’s an example of a node-logstash config file.

Configuration

The ELK Stack

We’ll focus on the Rails configuration vs. setting up the ELK parts. There are many tutorials available for this.

Lograge and Logstash gem configuration

Add the following to your Gemfile:

Configure Lograge via aconfig/initializers/lograge.rb file:

There’s one last step: for production, log everything in the Logstash format. This isn’t necessary for development. Modify config/environments/production.rb:

Let’s visualize

Our plain vanilla log files have been upgraded to a flexible, query-able, structured format. Access Kibana and you’ll see a stream of events and possible filters in the sidebar:

Here’s some handy search queries:

Adding custom fields

The default config is useful by itself, but the true power comes via custom fields. We can log the users, companies, database shards, etc associated with each request. Basically, anything that can impact performance that can’t be parsed from the default data sent by Lograge.

Here’s how we do this @ Scout via 2 parts:

1. Accumulate fields in a Thread local variable

We accumulate important fields to store in a Thread local variable via a before_filter. Here’s how we store the current user:

2. Subscribe to ‘process_action.action_controller’ events

These events are sent when Rails completes serving a request. We subscribe to these, adding additional fields collected while processing the request.

See the subscriber source.

Kibana: grouping by fields

We were a step forward just running queries and viewing counts. However, viewing a list of matching events isn’t that helpful when you’re trying to understand the composition of your app’s behavior. For example, which users are seeing the most 1 second+ response times?

We can do that via Kibana’s Visualize feature and aggregating by user_email:

Summary: More than a developer tool

At this point, you should be excited about the ways Kibana can improve your detective skills: keeping an eye on the experience of key customers, response times by host, ensuring trials users aren’t hitting exceptions, and more.

In short, Kibana breaks gives you the power to break a complex problem into into a collection of simple observations. Just like Sherlock.

Shameless Promotion #2: Want an investigative platform without the work? I’m helping develop an approachable Kibana alternative via a simple gem install. Signup for early access.
Hacker Noon is how hackers start their afternoons. We’re a part of the @AMI family. We are now accepting submissions and happy to discuss advertising & sponsorship opportunities.
If you enjoyed this story, we recommend reading our latest tech stories and trending tech stories. Until next time, don’t take the realities of the world for granted!

More by Derek Haynes

Topics of interest

More Related Stories