How To Create a Simple Autocomplete Field And Connect it With Elasticsearch

Autocomplete is a feature that predicts the rest of a word or phrase as the user types. It is worth implementing because it can noticeably improve the user’s experience of your product.

Creating an autocomplete might sound daunting at first if you’ve never created one. But with the help of the features in Elasticsearch, it’s actually a simple thing to do.

Things You Should Know

If you have little knowledge of Elasticsearch, I suggest that you read my other articles first. They are not required, but knowing how an analyzer and a text field work will definitely help you understand this article.

The article “Basics of Elasticsearch for Developer” will introduce you to Elasticsearch. The article “Elasticsearch: Text vs. Keyword” will teach you the difference between text and keyword in Elasticsearch and also will explain how Elasticsearch’s analyzer works.

Setup

Creating the index

First, let’s create an index called autocomplete-example. We will use this index for the examples in this article.
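Creating the index is a single PUT request to the index name. Here is a minimal Python sketch using only the standard library. The ES_URL value (an unsecured node on localhost:9200) and the helper names create_index_request and send are assumptions for illustration, not part of Elasticsearch.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured development node
INDEX = "autocomplete-example"


def create_index_request(index: str = INDEX) -> urllib.request.Request:
    """Build the request equivalent to `PUT /autocomplete-example`."""
    return urllib.request.Request(f"{ES_URL}/{index}", method="PUT")


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request to Elasticsearch and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a node running at that address, send(create_index_request()) should come back with an acknowledgement of the new index.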

Defining a mapping

Before indexing a document, let’s first define a mapping. We only need one field, simple_autocomplete, with the text field data type and the standard analyzer.

Since Elasticsearch uses the standard analyzer as default, we need not define it in the mapping.
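The mapping described above could look like the following sketch. It assumes Elasticsearch 7 or later (typeless mappings) and a local node at ES_URL. The helper name put_mapping_request is hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured development node

# One text field. There is no "analyzer" key, so the default (standard) applies.
MAPPING = {"properties": {"simple_autocomplete": {"type": "text"}}}


def put_mapping_request(index: str = "autocomplete-example") -> urllib.request.Request:
    """Build the request equivalent to `PUT /autocomplete-example/_mapping`."""
    return urllib.request.Request(
        f"{ES_URL}/{index}/_mapping",
        data=json.dumps(MAPPING).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```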

Indexing a document

Let’s index a document. For the examples in this article, we will only need one document, containing the text “Hong Kong.”
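Indexing the single example document might be sketched like this. As before, ES_URL and the helper name index_document_request are assumptions. Posting to the _doc endpoint lets Elasticsearch generate the document ID.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured development node

# The only document the examples need.
DOCUMENT = {"simple_autocomplete": "Hong Kong"}


def index_document_request(index: str = "autocomplete-example") -> urllib.request.Request:
    """Build the request equivalent to `POST /autocomplete-example/_doc`."""
    return urllib.request.Request(
        f"{ES_URL}/{index}/_doc",
        data=json.dumps(DOCUMENT).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```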

Querying the Index With match Query

Let’s start with the query that we normally use, the match query.

The standard analyzer lowercases your indexed text and splits it into tokens on word boundaries (whitespace and most punctuation) before storing it in an inverted index. It does not remove stop words by default.

By default, the match query analyzes the query text with the same analyzer that was used for the field at index time, which here is the standard analyzer.

We can check how our “Hong Kong” text looks in the inverted index with the _analyze API provided by Elasticsearch. The standard analyzer turns it into two tokens, “hong” and “kong.”
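To inspect the tokens yourself, you can send the body below to the _analyze endpoint of the index. The function next to it is only a rough local stand-in for the standard analyzer (lowercase, then split on non-word characters). It is an illustration, not Elasticsearch’s actual tokenizer, and the name approx_standard_analyze is made up for this sketch.

```python
import re

# Body for `POST /autocomplete-example/_analyze`: analyze the text the same
# way the simple_autocomplete field would analyze it.
ANALYZE_BODY = {"field": "simple_autocomplete", "text": "Hong Kong"}


def approx_standard_analyze(text: str) -> list:
    """Approximate the standard analyzer: lowercase, split on non-word runs."""
    return re.findall(r"\w+", text.lower())


print(approx_standard_analyze("Hong Kong"))  # ['hong', 'kong']
```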

When we search the index with a match query, we only get a result when the text we type contains the whole word “Hong” or “Kong.” This is because Elasticsearch only returns a result when a token of the analyzed query exactly matches a token in the inverted index.

If the user types “Ho,” “Kon,” or “Hon Kon,” Elasticsearch returns no hits.
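The behavior described above can be sketched as follows. match_query_body builds the request body for the _search endpoint, and toy_match is a simplified local model of how a match query with the default OR operator decides on a hit. Both names are hypothetical, and the model ignores scoring and everything else a real cluster does.

```python
import re


def match_query_body(text: str) -> dict:
    """Body for `POST /autocomplete-example/_search` using a match query."""
    return {"query": {"match": {"simple_autocomplete": text}}}


def tokens(text: str) -> list:
    """Rough stand-in for the standard analyzer."""
    return re.findall(r"\w+", text.lower())


def toy_match(query: str, document: str) -> bool:
    """Hit if at least one query token equals an indexed token exactly."""
    indexed = set(tokens(document))
    return any(token in indexed for token in tokens(query))


for typed in ["Hong", "kong", "Ho", "Kon", "Hon Kon"]:
    print(f"{typed!r}: {toy_match(typed, 'Hong Kong')}")
# 'Hong': True, 'kong': True, 'Ho': False, 'Kon': False, 'Hon Kon': False
```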

For an autocomplete, this one isn’t very useful to help the user, right? At the least, autocomplete needs to show something, even if we do not type the full words.

To fix it, we can use the match_phrase_prefix query provided by Elasticsearch.

Using match_phrase_prefix Query

The match_phrase_prefix query allows the user to get a result without typing all the words. With the usual match query, we won’t get any result from Elasticsearch if we type “Hon” or “Kon,” but with match_phrase_prefix, we can get a result.

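There is still a shortcoming of this autocomplete: if the user types “Hon Kon,” it still won’t return any result. This is because match_phrase_prefix treats only the last term as a prefix. Every earlier term must match an indexed token exactly, and “hon” is not a token of “Hong Kong.”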
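Here is a hedged sketch of the query body, together with a toy model of the matching rule: all terms except the last must appear as consecutive exact tokens, and the last term only needs to be a prefix of the token that follows. The function names are hypothetical, and the model ignores real-world details such as the max_expansions limit.

```python
import re


def match_phrase_prefix_body(text: str) -> dict:
    """Body for `POST /autocomplete-example/_search`."""
    return {"query": {"match_phrase_prefix": {"simple_autocomplete": {"query": text}}}}


def tokens(text: str) -> list:
    """Rough stand-in for the standard analyzer."""
    return re.findall(r"\w+", text.lower())


def toy_phrase_prefix(query: str, document: str) -> bool:
    """Exact consecutive match on all but the last term, prefix match on the last."""
    query_tokens, doc_tokens = tokens(query), tokens(document)
    if not query_tokens:
        return False
    *exact, last = query_tokens
    n = len(exact)
    # Try every position where the exact terms plus one more token still fit.
    for start in range(len(doc_tokens) - n):
        if doc_tokens[start:start + n] == exact and doc_tokens[start + n].startswith(last):
            return True
    return False


for typed in ["Hon", "Kon", "Hong Kon", "Hon Kon"]:
    print(f"{typed!r}: {toy_phrase_prefix(typed, 'Hong Kong')}")
# 'Hon': True, 'Kon': True, 'Hong Kon': True, 'Hon Kon': False
```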

The Pros and Cons

An autocomplete with a text field data type and the standard analyzer is very simple, but it has pros and cons that you can consider before using this type of autocomplete.

Pros

Little to no setup: You don’t even have to define a mapping because by default, if you index a string into Elasticsearch, dynamic mapping maps it as a text field with a keyword sub-field.
Fast index time: Because this type of autocomplete is using the standard analyzer, it doesn’t process your text much when saving it to the inverted index, which translates to fast index time.
Enough most of the time: Most of the time, you don’t need a complex autocomplete. This autocomplete type will be enough.

Cons

Can’t handle typos: This type of autocomplete can’t handle typos, so if the user misspells a word, it won’t return any result.
The query can’t start from the middle of a word: In the previous example of “Hong Kong,” if we query with the text “ong kong,” Elasticsearch won’t return anything.
Can’t handle a missing space: If we had mistakenly typed “HongKong” in the previous example, Elasticsearch wouldn’t have returned anything with this type of autocomplete.

When to Use

I recommend an autocomplete with only the standard analyzer when you only need a simple autocomplete. You can also use it when the index you want to add autocomplete to is already in production and already contains documents. Since this autocomplete uses the default analyzer and default mapping for text, it will work for most text documents.

Conclusion

An autocomplete built on the text field data type and the standard analyzer is the simplest one we can create with Elasticsearch. It requires almost no setup and can usually be added to an existing index.

Even if it’s enough for most use cases, it still has many weaknesses because it can only handle simple queries. To overcome that, we can use a custom-defined analyzer or the Suggesters feature in Elasticsearch, which I plan to write about. Stay tuned!

Finally, thank you for reading this article until the end. I hope it helps you with your project.

References

Previously published at https://codecurated.com/blog/create-a-simple-autocomplete-with-elasticsearch/
