Elasticsearch is a distributed, open-source, highly scalable search and analytics engine built on Apache Lucene and developed in Java. It allows you to store, search, and analyze huge volumes of data in near real time and returns answers in milliseconds. With its fast search responses, flexible structure, and extensive API, it’s no wonder it’s being used for a growing number of use cases, from simple text search to log analysis.
Knowi is an analytics platform that natively integrates with Elasticsearch so you can leverage the speed of Elasticsearch to visualize large amounts of data from multiple indexes rapidly. Knowi allows you to query Elasticsearch directly using its native DSL, or use a drag-and-drop interface to build queries quickly without prior knowledge of the query syntax. Knowi also natively integrates with over 30 data sources, allowing you to join your Elasticsearch data across indexes, or blend it with other SQL/NoSQL/REST-API sources on the fly to create new datasets that can be used for downstream analytics. From there, you can choose from a host of visualization options to create custom interactive dashboards, run ad-hoc analysis, use search-based analytics to ask questions of your data, and more.
This post is an end-to-end tutorial on using Knowi for Elasticsearch analytics. We’ll start by natively connecting to data in your Elasticsearch cluster. From there, we’ll show you how to create visualizations from it in just a few minutes, perform joins across multiple indexes, and use Knowi’s search-based analytics feature to ask questions of your data in plain English and get instant insights.
Sign up for a free Knowi account here to get started.
In this section, we’ll step through using the Knowi UI to connect to your Elasticsearch cluster in the cloud to visualize and analyze data from it.
Connecting to Elasticsearch
Knowi integrates natively with a broad range of NoSQL, SQL, REST-API, and JSON/CSV data sources. To get started, select your data source and configure the connection. Your data stays in the source, so there are no ETL processes to build or ODBC drivers to install.
After logging in to Knowi, we’ll start by establishing a connection to your Elasticsearch data source.
Steps:
Connecting to Elasticsearch cluster (Source - knowi.com)
Once connected to your Elasticsearch cluster, Knowi automatically pulls a list of your indexes along with field samples. To start building queries, you can auto-generate them in the UI with the drag-and-drop Query Builder. This is especially useful for users who are less familiar with the native DSL. More advanced users can write queries directly in the smart Query Editor, a versatile text editor specialized for editing code.
In this example, we’ll select the sending_activity index (which contains email sending activity data) and select the fields we want to analyze from the auto-generated fields from the Query Builder.
Steps:
Use Query Builder to generate queries or write queries directly with the Query Editor (Source - knowi.com)
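For readers who prefer to write the query by hand, here is a minimal sketch of what a native Elasticsearch request body selecting those fields might look like. It is built as a Python dictionary and printed as JSON. The field names are the ones used in this example dataset; the `build_search_body` helper and the `size` default are illustrative assumptions, and the exact form the Knowi Query Editor expects may differ.

```python
import json

def build_search_body(fields, size=100):
    """Build a search request body that returns only the chosen fields.

    Hypothetical helper for illustration; not part of Knowi or Elasticsearch.
    """
    return {
        "size": size,                # maximum number of hits to return
        "_source": list(fields),     # source filtering: keep only these fields
        "query": {"match_all": {}},  # no filtering; match every document
    }

# Field names from the sending_activity example in this post.
FIELDS = ["customer", "message_type", "sent", "opened", "conversions", "date"]

body = build_search_body(FIELDS)
print(json.dumps(body, indent=2))
```

Against a plain Elasticsearch cluster, a body like this would typically be sent to the `_search` endpoint of the sending_activity index.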
Once the query is saved, Knowi creates a “Virtual Dataset” from the query results and stores it in Knowi’s “Elastic Store,” a data warehouse that can store and track the results. Unlike traditional warehouses that require complex ETL processes and a pre-defined schema, the Elastic Store is a flexible, scalable, schema-less warehouse. The stored virtual dataset is reusable, and will be the foundation for most of what you’ll do in Knowi, like creating visualizations, adding them to dashboards, and much more.
In this example, we’ll create a stacked column chart that shows the total sent emails for each customer by message type. First, we’ll create a new dashboard, then the chart itself.
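To make the shape of the chart data concrete, the sketch below computes the same grouping in plain Python: total sent emails per customer, broken down by message type. Each inner dictionary corresponds to one stacked column. The sample rows and the helper name are made up for illustration; Knowi performs this grouping for you through the UI.

```python
from collections import defaultdict

def total_sent_by_customer_and_type(rows):
    """Sum the 'sent' field for every (customer, message_type) pair."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row["customer"]][row["message_type"]] += row["sent"]
    # Convert nested defaultdicts to plain dicts for readable output.
    return {customer: dict(by_type) for customer, by_type in totals.items()}

# Made-up sample rows; real data would come from the sending_activity index.
sample_rows = [
    {"customer": "Acme", "message_type": "Newsletter", "sent": 120},
    {"customer": "Acme", "message_type": "Transactional", "sent": 40},
    {"customer": "Acme", "message_type": "Newsletter", "sent": 30},
    {"customer": "Globex", "message_type": "Newsletter", "sent": 75},
]

stacked = total_sent_by_customer_and_type(sample_rows)
print(stacked)
```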
Creating A New Dashboard
Steps:
Creating a New Dashboard in Knowi (Source - knowi.com)
The Analyze Screen
Steps:
The Widget Analyze screen (Source - knowi.com)
Visualization Settings
The Visualization Settings screen (Source - knowi.com)
Drilldowns allow you to visually navigate and analyze data in powerful ways. They can be set into another widget, another dashboard, or the same dashboard. Drilldowns can be many levels deep with support for combining different drilldown modes. Data from the parent widget can be used as keys into the drilldown widget or dashboard to filter the data specifically for the point selected. Drilldowns can be configured using the ‘Drilldowns’ menu option on each widget in the dashboard.
In this example, we’ll set up a “Widget” drilldown from the stacked column chart (Parent) widget into the original data grid chart that filters the results based on a specific customer.
Steps:
Adding a Drilldown (Source - knowi.com)
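Conceptually, a drilldown like this passes the value of the clicked point (here, the customer) as a filter key to the target widget. The following is a rough sketch of that idea in Python, not Knowi's actual implementation; the `drill_down` function and the sample rows are hypothetical.

```python
def drill_down(rows, key_field, selected_value):
    """Return only the rows whose key_field matches the selected point."""
    return [row for row in rows if row.get(key_field) == selected_value]

# Made-up rows standing in for the data grid behind the parent chart.
grid_rows = [
    {"customer": "Acme", "message_type": "Newsletter", "sent": 120},
    {"customer": "Globex", "message_type": "Newsletter", "sent": 75},
    {"customer": "Acme", "message_type": "Transactional", "sent": 40},
]

# Simulate clicking the "Acme" column in the parent widget.
acme_rows = drill_down(grid_rows, "customer", "Acme")
print(acme_rows)
```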
As part of the “ELK Stack,” Kibana is widely considered the default visualization tool for Elasticsearch. Its drawback, however, is that each visualization can only work against a single index. So if your data is spread across different indices, you’ll have to create separate visualizations for each.
Knowi provides a solution for this: it allows you to join your Elasticsearch data across multiple indexes and blend it with other SQL/NoSQL/REST-API data sources, then create visualizations from it on the fly with a user-friendly UI.
In the following steps, we’ll join our initial sending_activity index with another index in our cluster with customer-specific information to create a new combined dataset that can be used for downstream analytics and visualizations.
Joining Your Indexes
Since we’ve already created a query for the sending_activity index, let’s go back and edit it to add a join to our second index sending_activity_customer.
Steps:
Use the Query Builder to select metrics from your second index (Source - knowi.com)
So far, we have the query from our first index that gives us the customer, the type of email sent, and how many were sent, opened, and converted. In our second index query, we get address information from the same customers. Now, it’s time to join these two indexes together.
Use the Join Builder to combine your indexes (Source - knowi.com)
In the new combined dataset, we have the customer, message_type, sent, opened, conversions, and date fields from our first index and the street and state fields from our second index, joined on the key field customer. As you can see, we were able to run the queries from each side of the join and then combine them with just a few clicks. We can now use this combined dataset to create new reports and visualizations.
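As a rough illustration of what the join produces, the sketch below combines two small result sets on the customer key in plain Python. It assumes an inner join (rows without a match on the other side are dropped); the join type you pick in the Join Builder may differ, and the sample rows and the `inner_join` helper are made up for this example.

```python
def inner_join(left_rows, right_rows, key):
    """Join two lists of dicts on a shared key, keeping only matching rows."""
    # Index the right side by key so each left row is matched in one lookup.
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in right_by_key.get(left[key], []):
            joined.append({**left, **right})  # merge fields from both sides
    return joined

# Made-up rows standing in for the two index queries.
activity = [
    {"customer": "Acme", "message_type": "Newsletter", "sent": 120},
    {"customer": "Initech", "message_type": "Newsletter", "sent": 10},
]
customers = [
    {"customer": "Acme", "street": "1 Main St", "state": "CA"},
    {"customer": "Globex", "street": "9 Elm Ave", "state": "NY"},
]

combined = inner_join(activity, customers, "customer")
print(combined)
```

Only Acme appears in both inputs, so the combined result holds a single row carrying fields from both sides.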
Knowi’s search-based analytics is a Google-search-like feature that lets you type questions about your data in plain English and get answers instantly. This is especially useful for non-technical end users, allowing them to gain quick insights from the data even without prior knowledge of the underlying data structure or query syntax of the data source.
In the following steps, we’ll walk through a brief example of using Knowi’s search-based analytics to ask questions of the blended email sending activity dataset we created in the previous section.
Steps:
Type question in the NLP Text Bar to find the maximum emails sent (Source - knowi.com)
Steps:
Type question in the NLP Text Bar to find out Customer address (Source - knowi.com)
Steps:
Type question in the NLP Text Bar to find conversion rate by customer weekly (Source - knowi.com)
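To show what a question like the last one asks for, here is a minimal Python sketch of a weekly conversion rate per customer. It rests on assumptions not stated in this post: conversion rate is taken to be conversions divided by sent, dates are ISO-formatted strings, and weeks follow the ISO calendar. The sample rows and the helper name are made up; how Knowi interprets the question internally is not shown here.

```python
from collections import defaultdict
from datetime import date

def weekly_conversion_rate(rows):
    """Return {(customer, iso_year, iso_week): conversions / sent}."""
    sums = defaultdict(lambda: [0, 0])  # running [conversions, sent] per group
    for row in rows:
        iso_year, iso_week, _ = date.fromisoformat(row["date"]).isocalendar()
        key = (row["customer"], iso_year, iso_week)
        sums[key][0] += row["conversions"]
        sums[key][1] += row["sent"]
    # Skip groups with nothing sent to avoid dividing by zero.
    return {key: conv / sent for key, (conv, sent) in sums.items() if sent > 0}

# Made-up sample rows with ISO date strings (an assumption about the format).
rows = [
    {"customer": "Acme", "date": "2020-01-06", "sent": 100, "conversions": 5},
    {"customer": "Acme", "date": "2020-01-08", "sent": 100, "conversions": 15},
    {"customer": "Acme", "date": "2020-01-13", "sent": 50, "conversions": 10},
]

rates = weekly_conversion_rate(rows)
for key, rate in sorted(rates.items()):
    print(key, round(rate, 3))
```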
As we’ve seen, by simply typing in questions in plain English, we were able to get answers back instantly from our combined Email Sending Activity dataset. You also have the option to take these results and create new widgets that can be added to your dashboard.
Summary
In summary, we used Knowi to connect to an Elasticsearch cluster and write native queries against its data, created visualizations from the results in minutes, performed joins across multiple indexes on the fly, and used the search-based analytics feature to ask questions of the data without prior knowledge of the underlying query language. Visit Knowi to learn more about how its analytics capabilities can leverage the strengths of your Elasticsearch implementation.