Elastic Stack — A Brief Introduction

by Urfeena Elahi, July 13th, 2018

You are reading the right blog post if you have heard of Elastic Stack and want to explore it, or if you are an absolute beginner. I am sure you won’t be one after reading this! Let’s understand what Elastic Stack is and why you need it.

What is Elastic Stack?

ELK Stack or Elastic Stack? They are the same thing: the ELK Stack has been rebranded as the Elastic Stack. The ELK Stack is a powerful collection of three open source projects: **E**lasticsearch, **L**ogstash, and **K**ibana. Although each of the three is a separate project, they have been built to work exceptionally well together.

Elastic Stack is a complete end-to-end log analysis solution that helps you search, analyze, and visualize the logs generated by different machines.

Log Analysis-Search-Visualize

Yes, you read it right! Elastic Stack reliably and securely takes data from any source, in any format, and searches, analyzes, and visualizes it in real time. It provides a strong mechanism for centralized logging, which plays an important role in identifying web server and application problems. It lets you search through all the logs in a single place and identify issues that span multiple servers by correlating their logs within a specific time frame. Use cases found in IT environments include web analytics, business intelligence, compliance, and security.

What is Elastic Stack used for?

In today’s data-dominated world, irrespective of the size of the organization, a huge amount of data flows into your systems every day. As your data set grows, your analytics slow down, resulting in sluggish insights. A considerable share of this data consists of the company’s web server logs. Logs are one of the most important and most often neglected sources of information. Each log file contains invaluable pieces of information, but they are mostly unstructured and make little or no sense on their own. Without a careful and detailed analysis of this log data, an organization can remain oblivious to both the opportunities and the threats surrounding it. Sigh!

So, the BIG question for your big data is: how can you keep getting valuable business insights? Don’t worry, this is where you need a log analysis tool.

Elastic Stack Users

Now, I have a question for you. How do Microsoft, LinkedIn, Netflix, Facebook, and Cisco monitor their logs?

The answer is obvious. Yes, it is none other than ELK!

The power of Elastic Stack lies in its components:

  • Elasticsearch
  • Logstash
  • Kibana
  • Beats
  • X-Pack

Beyond the original three projects, the stack includes a paid component known as X-Pack and a family of log shippers called Beats, which led Elastic to rename ELK as the Elastic Stack. To understand Elastic Stack better, you need to understand its components.

In simple terms, the Elastic Stack workflow looks like this:

Logstash, along with the Beats family, collects and parses logs (say, NGINX logs for SEO and web traffic analysis), and this information is then indexed and stored by Elasticsearch. Finally, Kibana presents the data in visualizations that give us insights for decision making. Isn’t it amazing?
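As a rough illustration of the "collect and parse" step, the sketch below turns one NGINX access-log line into a structured document of the kind that would then be indexed. It is plain Python, not Logstash or Beats configuration. It assumes the default NGINX "combined" log format, and the sample line and the field names (client_ip, path, and so on) are made up for this example.

```python
import json
import re
from datetime import datetime

# Pattern for the NGINX "combined" access-log format (an assumption about
# how the server is configured; custom log formats need a different pattern).
COMBINED = re.compile(
    r'(?P<client_ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Turn one raw access-log line into a structured dict, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    doc = match.groupdict()
    # Convert numeric fields so they can be summed or averaged later.
    doc["status"] = int(doc["status"])
    doc["bytes"] = int(doc["bytes"])
    # Normalise the timestamp to ISO 8601.
    parsed = datetime.strptime(doc["time"], "%d/%b/%Y:%H:%M:%S %z")
    doc["time"] = parsed.isoformat()
    return doc


# Hypothetical sample line, invented for illustration.
SAMPLE = (
    '203.0.113.7 - - [13/Jul/2018:10:15:32 +0000] '
    '"GET /blog/elastic-stack HTTP/1.1" 200 5120 '
    '"https://example.com/" "Mozilla/5.0"'
)

print(json.dumps(parse_line(SAMPLE), indent=2))
```

The point is only that an unstructured line becomes named, typed fields. Once logs look like this, searching and aggregating them becomes straightforward.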

Elastic Stack Use Cases

Enough of introducing ELK by definitions; now let’s see where and how it actually helps in solving real-life problems. From tailing a simple log file to complex, business-critical analytics, the ELK stack has a role to play. A few of the scenarios in which ELK relieves you of the associated headache are listed below:

Logging and Log Analysis

ELK Stack has become one of the most popular open source platforms for logging. Assume that you have to find an error: you need to log in to several machines and look at several log files. Now assume that you are maintaining larger applications distributed across several nodes. In that case, searching through log files becomes even more tedious and messy. It is time to move beyond Linux tools like grep.

https://digicm.wordpress.com/2014/12/31/mwd0701-log-management-with-elk/

Solution: From Beats to Logstash to ingest nodes, Elasticsearch gives you plenty of options for grabbing data wherever it lives and getting it indexed. Tail a few files, or billions.
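To see why one central place beats logging in to each machine, here is a small sketch in plain Python. It assumes the log entries from all servers have already been collected into one list of structured records. The host names, messages, and the errors_around helper are all hypothetical, and a real setup would run this kind of query inside Elasticsearch rather than in application code.

```python
from datetime import datetime, timedelta

# Hypothetical entries, as they might look once collected from three servers.
LOGS = [
    {"host": "web-1", "time": "2018-07-13T10:15:30", "level": "INFO",
     "message": "GET /checkout 200"},
    {"host": "web-2", "time": "2018-07-13T10:15:31", "level": "ERROR",
     "message": "upstream timed out"},
    {"host": "db-1", "time": "2018-07-13T10:15:29", "level": "ERROR",
     "message": "too many connections"},
    {"host": "web-1", "time": "2018-07-13T11:02:00", "level": "ERROR",
     "message": "disk almost full"},
]


def errors_around(logs, moment, window_seconds=60):
    """Return ERROR entries from any host within +/- window_seconds of moment, oldest first."""
    window = timedelta(seconds=window_seconds)
    hits = [
        entry for entry in logs
        if entry["level"] == "ERROR"
        and abs(datetime.fromisoformat(entry["time"]) - moment) <= window
    ]
    # ISO 8601 strings in the same format sort chronologically.
    return sorted(hits, key=lambda entry: entry["time"])


incident = datetime(2018, 7, 13, 10, 15, 30)
for entry in errors_around(LOGS, incident):
    print(entry["time"], entry["host"], entry["message"])
```

With everything in one place, the database error on db-1 and the timeout on web-2 show up next to each other, which is exactly the correlation that is painful to do by hand with grep on separate machines.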

Some successful ELK log analytics use cases include:

  • Risk management
  • Market intelligence
  • E-commerce personalization
  • Compliance
  • Security analysis
  • Fraud detection

Metrics

Talking about metrics or analytics, what comes to your mind instantly? Hint: it is a small but powerful four-letter word!

Yes, “DATA”

Let’s take the example of a university with multiple departments and their faculty members. The requirement is to find out the number of faculty members per department.

Solution: Elasticsearch’s aggregations can help you find new ways to look at the data. If you have departments and faculty members indexed in Elasticsearch, you can use the terms aggregation to count the faculty members working in each department. The request would look like this:

curl -XGET "http://localhost:9200/university/faculty/_search" -d'{   

Aggregations are requested using the aggregations or aggs keyword; department is the name that identifies the result, and the terms aggregation counts the different terms for the given field. I will talk about the syntax in later blog posts. You would get a response something like this:

{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "failed": 0},
  "hits": {"total": 86, "max_score": 0, "hits": []},
  "aggregations": {
    "department": {
      "buckets": [
        {"key": "Mathematics", "doc_count": 16},
        {"key": "Information Technology", "doc_count": 20},
        {"key": "Geo Informatics", "doc_count": 25},
        {"key": "Zoology", "doc_count": 15},
        {"key": "Bio Technology", "doc_count": 10}
      ]
    }
  }
}

We can see that there are 16 faculty members working in the “Mathematics” department, 20 in the “Information Technology” department, and so on. These are Elasticsearch’s search superpowers applied to metrics.
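To make the terms aggregation a little more concrete, here is a sketch in Python. REQUEST_BODY is my guess at a body of the kind described above: the aggregation name department comes from the response, while the field name "department" and "size": 0 (suggested by the empty hits list) are assumptions. The terms_buckets helper is hypothetical and only imitates, on a local list, the per-value counting that Elasticsearch does for you.

```python
import json
from collections import Counter

# Assumed request body. Only the aggregation name "department" is taken from
# the response in the article; the field name and "size": 0 are guesses.
REQUEST_BODY = {
    "size": 0,
    "aggs": {
        "department": {
            "terms": {"field": "department"}
        }
    },
}


def terms_buckets(docs, field):
    """Count documents per distinct value of `field`, largest group first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": count} for key, count in counts.most_common()]


# Hypothetical documents: one per faculty member, for two of the departments.
FACULTY = (
    [{"department": "Mathematics"}] * 16
    + [{"department": "Information Technology"}] * 20
)

print(json.dumps(REQUEST_BODY))
print(terms_buckets(FACULTY, "department"))
```

Each bucket pairs one distinct value with the number of documents that carry it, which is exactly the key and doc_count you see in the response. If you send such a body with curl to Elasticsearch 6.0 or later, you will also need a Content-Type: application/json header.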

Full Text Search

At the heart of the ELK Stack is Elasticsearch, a JSON-based, RESTful search engine designed to scale to millions of events per second while remaining reliable.
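If you are wondering how full-text search can be fast at all, the usual trick is an inverted index: a map from each word to the documents that contain it. Elasticsearch is built on Apache Lucene, which relies on this idea. The toy version below is only a sketch in plain Python: the sample messages are invented, the tokenizer is deliberately naive, and real engines add analysis, relevance scoring, and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical log messages, invented for illustration.
DOCS = {
    1: "Connection timed out while reading upstream",
    2: "User login failed: invalid password",
    3: "Upstream connection refused",
}

INDEX = build_index(DOCS)
print(search(INDEX, "upstream connection"))
```

A lookup touches only the entries for the query words instead of scanning every document, which is why this approach keeps working as the data grows.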

Some real-world users are:

  • Wikipedia: Wikipedia, the online encyclopedia, uses Elasticsearch for full-text search.
  • Stack Overflow: Stack Overflow, the giant knowledge-sharing site, relies on Elasticsearch for its full-text search capabilities, which it uses to surface related questions and answers.
  • Netflix: Netflix monitors and analyzes customer service operations and security logs, and relies heavily on ELK for all of this. It values ELK for its automatic sharding and replication, flexible schema, and nice extension model.
  • LinkedIn: LinkedIn, the business-focused social network, uses ELK with Kafka to support its load in real time and to monitor performance and security.
  • Medium: Medium is my favorite and one of the most popular modern blog-publishing platforms; its stack supports around 25 million unique readers every month as well as tens of thousands of published posts each week. It uses Elastic Stack to debug production issues.
  • GitHub: The project host is capable of querying billions of lines of code with the search engine.
  • Lyft: Lyft was Amazon’s biggest hosted Elasticsearch customer. It later moved from Amazon ES to a self-managed, self-hosted Elasticsearch deployment for better performance.

These are just a few. If you are interested in exploring more, I recommend the Elastic Stack documentation at https://www.elastic.co for a clearer understanding, and the day-to-day use cases at https://www.elastic.co/use-cases

Feel free to give any suggestions and corrections in the comments below! :D