Elasticsearch: Using Completion Suggester to build AutoComplete

Written by taranjeet | Published 2017/12/26
Tech Story Tags: elasticsearch | search | search-engines | autocomplete | completion-suggester


In an earlier post, we discussed various approaches to implementing autocomplete functionality. We concluded that the Completion Suggester covers most of the cases required to build a fast, fully functional autocomplete. This post explains in detail what the Completion Suggester is and how to use it in practice.

Completion Suggester

The Completion Suggester is a type of suggester in Elasticsearch that is used to implement autocomplete functionality. It uses an in-memory data structure called a Finite State Transducer (FST). Elasticsearch stores FSTs on a per-segment basis, which means suggestions scale horizontally as more nodes are added.

Mapping

To use the Completion Suggester, a field has to be mapped with a special type called completion. Let’s take Marvel movie data as an example and define an index named movies with a type named marvels. The complete movie list can be accessed from here
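As a minimal sketch, such a mapping could be written as the following request body, built here as a Python dictionary. It assumes an Elasticsearch version that still has mapping types (6.x or earlier), and it assumes that name is a text field with a completion sub-field and that year is a keyword field; the exact field layout is an assumption, not the author's original mapping.

```python
import json

# Sketch of a create-index body for the "movies" index with type "marvels".
# Assumption: "name" is a text field carrying a "completion" sub-field,
# so the completion field is addressed as name.completion.
# Assumption: "year" is stored as a keyword.
movies_mapping = {
    "mappings": {
        "marvels": {
            "properties": {
                "name": {
                    "type": "text",
                    "fields": {
                        "completion": {"type": "completion"},
                    },
                },
                "year": {"type": "keyword"},
            }
        }
    }
}

# This JSON would be sent as the body of: PUT /movies
print(json.dumps(movies_mapping, indent=2))
```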

Here name.completion is a field of type completion. On this field, we can set other mapping parameters such as analyzer, search_analyzer, etc.

Indexing Data

To index data, a slightly different syntax is used. A suggestion is made of an input and an optional weight parameter. Let’s index a movie into our movies index.

We can also define a weight for each suggestion. This weight helps us control the ranking of documents when querying.
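The shape of a suggestion value, with and without a weight, can be sketched as below. The helper make_suggestion is hypothetical and exists only for illustration. How the value is attached to a document depends on the mapping; in the Elasticsearch reference it is the value of the completion field itself.

```python
import json


def make_suggestion(inputs, weight=None):
    """Build a completion suggestion value (hypothetical helper)."""
    suggestion = {"input": inputs}
    if weight is not None:
        # The weight is optional; suggestions with higher weights rank first.
        suggestion["weight"] = weight
    return suggestion


# Titles and weights follow the article: Iron Man has no weight,
# Iron Man 2 has a weight of 2.
iron_man = make_suggestion(["Iron Man"])
iron_man_2 = make_suggestion(["Iron Man 2"], weight=2)

print(json.dumps(iron_man))
print(json.dumps(iron_man_2))
```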

We can also index multiple suggestions for a document at the same time.
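One way to express multiple suggestions is a list of suggestion objects, each with its own input and weight. The sketch below uses a title from the movie list; the shortened alternative input and both weights are illustrative assumptions.

```python
import json

# Each entry is an independent suggestion for the same document.
# The weights and the shortened second input are illustrative only.
suggestions = [
    {"input": "Captain America: The Winter Soldier", "weight": 2},
    {"input": "The Winter Soldier", "weight": 1},
]

print(json.dumps(suggestions, indent=2))
```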

Querying

To query documents, we need to specify the suggest type as completion. Let’s query for thor in our movies index. The movies index contains all 22 movies from the Marvel Cinematic section of this page.
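A suggest request for the prefix thor can be sketched as follows. The suggestion name movie-suggest is an arbitrary label chosen for this example, and the field name name.completion assumes the multi-field mapping described earlier.

```python
import json

# Body for: POST /movies/_search
# "movie-suggest" is an arbitrary name for this suggestion (an assumption);
# the results come back under the same key in the response.
thor_query = {
    "suggest": {
        "movie-suggest": {
            "prefix": "thor",
            "completion": {"field": "name.completion"},
        }
    }
}

print(json.dumps(thor_query, indent=2))
```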

We get the following movies as the result

  • Thor
  • Thor: Ragnarok
  • Thor: The Dark World

We see that all documents have a _score of 1. This means that, by default, all the documents in the completion suggester are ranked equally. To boost a particular document, or to alter the ranking, we can use the optional weight parameter. We have already indexed Iron Man (with no weight) and Iron Man 2 (with a weight of 2). Let’s search for Iron Man in our movies index.

We get the following movies as the result

  • Iron Man 2 (score of 2)
  • Iron Man (score of 1)

We can clearly see here how weight controls the ranking of documents. This is why Iron Man 2 is ranked higher than Iron Man when we search for Iron Man.

We can also specify the size to control the number of suggestions returned.
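In the request body, size sits inside the completion object. A minimal sketch, where the value 1 and the suggestion name movie-suggest are illustrative choices:

```python
import json

# Body for: POST /movies/_search
# "size" limits how many options are returned for this suggestion.
sized_query = {
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {
                "field": "name.completion",
                "size": 1,  # illustrative value
            },
        }
    }
}

print(json.dumps(sized_query, indent=2))
```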

We can also add fuzziness to the completion suggester. This helps us provide suggestions even when there is a typo. Let’s try searching for captain amrica the with a fuzzy query.
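A fuzzy suggest request can be sketched by adding a fuzzy object next to the field. The fuzziness value of 1 is an assumption that matches the single missing letter in amrica; the value used in the original query is not known.

```python
import json

# Body for: POST /movies/_search
# "fuzziness": 1 tolerates one edit (assumed value), enough for the
# missing "e" in "amrica".
fuzzy_query = {
    "suggest": {
        "movie-suggest": {
            "prefix": "captain amrica the",
            "completion": {
                "field": "name.completion",
                "fuzzy": {"fuzziness": 1},
            },
        }
    }
}

print(json.dumps(fuzzy_query, indent=2))
```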

We get the following movies as the result

  • Captain America: The First Avenger
  • Captain America: The Winter Soldier

Let’s try finding suggestions for movie names that contain america.

We get no results. This is because the completion suggester supports only prefix matching. It starts matching from the start of the string, and no movie name has america at its start. To deal with this situation, we can tokenize the input text on spaces and index the resulting phrases as additional inputs. This way, Captain America: The First Avenger is indexed with several inputs, each starting at a different word, so a prefix such as america can match.
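One possible reading of this tokenization is to generate every suffix phrase of the title that starts at a word boundary. The helper prefix_inputs below is hypothetical and is a sketch of that idea, not necessarily the exact list the author indexed.

```python
def prefix_inputs(title):
    """Return the title plus every trailing phrase that starts at a word.

    Hypothetical helper: splits on whitespace and joins the words back
    from each position to the end.
    """
    words = title.split()
    return [" ".join(words[i:]) for i in range(len(words))]


inputs = prefix_inputs("Captain America: The First Avenger")
for phrase in inputs:
    print(phrase)
# Captain America: The First Avenger
# America: The First Avenger
# The First Avenger
# First Avenger
# Avenger
```

The resulting list can be used as the input of the suggestion, so that a prefix such as america matches the second phrase.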

Filtering Documents

In regular queries, we can filter documents by using filter, but filter does not work with the Completion Suggester. To understand this better, let’s run a query that finds all movies named iron man released in the year 2008.
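Such a request might look like the sketch below, which places a filter on year next to a suggest section. The term filter and the string value "2008" assume that year is mapped as a keyword; the original query may have been written differently.

```python
import json

# Body for: POST /movies/_search
# "query" and "suggest" are sibling keys: the filter restricts the hits,
# while the suggester runs independently of it.
filtered_query = {
    "query": {
        "bool": {
            "filter": {"term": {"year": "2008"}},  # assumes keyword mapping
        }
    },
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {"field": "name.completion"},
        }
    },
}

print(json.dumps(filtered_query, indent=2))
```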

The response received looks like

In the response, we see that the hits key is present along with suggest. This happens because query and suggest work at the same level, in parallel, so we get both keys in the response. The filter applies only to the hits and does not restrict the suggestions, so we cannot apply a filter in a suggestion query.

To deal with this, the Completion Suggester supports contexts through the Context Suggester, which are basically filters for a completion field. Let’s define another mapping for the movies index, this time with year as a context for the name field.
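A context-enabled mapping can be sketched as follows. It assumes that name is mapped directly as a completion field with a category context called year, and that the context values are read from the year field through path; these layout choices are assumptions for illustration.

```python
import json

# Body for: PUT /movies (a fresh index, replacing the earlier mapping)
# Assumption: "name" is itself a completion field here.
# The category context "year" takes its values from the "year" field.
context_mapping = {
    "mappings": {
        "marvels": {
            "properties": {
                "name": {
                    "type": "completion",
                    "contexts": [
                        {"name": "year", "type": "category", "path": "year"},
                    ],
                },
                "year": {"type": "keyword"},
            }
        }
    }
}

print(json.dumps(context_mapping, indent=2))
```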

We can index our complete movie data into this index. Let’s find all movies named iron man released in the year 2008.
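With a context in place, the suggest request names the context values it accepts. A minimal sketch, assuming the completion field is name and the context is called year:

```python
import json

# Body for: POST /movies/_search
# Only suggestions whose "year" context is 2008 are returned.
context_query = {
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {
                "field": "name",  # assumed completion field with contexts
                "contexts": {"year": ["2008"]},
            },
        }
    }
}

print(json.dumps(context_query, indent=2))
```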

We get the following movies as the result

  • Iron Man

We can also boost contexts. Let’s search for movies named iron man released in 2008 and 2010, giving a boost of 4 to the year 2008.
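Boosts are given per context value by replacing the plain value with an object that has context and boost keys. The sketch below assumes the same name field and year context as before.

```python
import json

# Body for: POST /movies/_search
# 2008 gets a boost of 4; 2010 keeps the default boost.
boosted_query = {
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {
                "field": "name",  # assumed completion field with contexts
                "contexts": {
                    "year": [
                        {"context": "2008", "boost": 4},
                        {"context": "2010"},
                    ]
                },
            },
        }
    }
}

print(json.dumps(boosted_query, indent=2))
```

The score of a suggestion is its weight multiplied by the boost of the matching context, so a suggestion with the default weight of 1 and a context boost of 4 scores 4.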

We get the following movies as the result

  • Iron Man (score of 4)
  • Iron Man 2 (score of 1)

Published by HackerNoon on 2017/12/26