A Trie (usually pronounced “try”) is a tree data structure optimized for a specific type of searching. You use a Trie when you want to take a partial value and return a set of possible complete values. The classic example for this is an Autocomplete.
The image above shows how a Trie is structured, which hints at how it works. You can think of a Trie as a set of related values. What the values have in common is their prefix.
As you search for a more specific prefix, you get more specific return values. Using the Trie shown in the picture above, searching for matches to the prefix “b” will return 6 values: be, bear, bell, bid, bull, buy.
Searching for matches to the prefix “be” will return 2 values: bear, bell
When to use Tries
A Trie can be used whenever you want to match prefixes with possible complete values. This is how the Trie got its funny name. The word “Trie” is an infix of the word “retrieval”.
Tries are commonly used to implement things like:
- Autocompletes / typeaheads
- Spell checkers
You’re not limited to prefix matching words. Tries can store:
- IP Addresses,
- Phone Numbers,
- Objects (You can search properties on an object),
- and more…
Should you use Tries in a Frontend App?
- Does this structure give me performance gains? Are the performance gains worth it?
- Is this structure easier to use — or, at least, no more difficult?
- Does this structure provide my data with more semantics? Does it make my code easier to understand?
- How much of an impact will this structure have on my build size? Is this increase in build size worth it?
Contrasting Tries and Arrays
Here’s the criteria we’ll use to contrast Tries and Arrays:
- Performance (run time and load time)
- Ease of use, and readability
- Build size (Arrays add no extra code. We’ll analyze the Trie.)
- I wrote a quick autocomplete in React, using
create-react-app. Here’s what it looks like:
- I used Josh Jung’s
trie-searchas my Trie implementation.
- I used
fakerto generate a set of 10,000 names. I didn’t care whether the names were unique.
Here’s the base code for the autocomplete. Note, it doesn’t use any specific Collection data structure. Implementation details for the Trie and the Array, follow.
Here’s the Array-based code:
Here’s the TrieSearch-based code:
I tested two places in the code for performance:
- Loading the data items into the data structure.
- Searching the data structure for items.
All tests were conducted using Chrome 65.x.
Loading data items
Using a Trie, you’ll experience longer initialization times than an Array. Tries initialize in O(n*m) time. To give you a practical sense of this, it took an average of 90 milliseconds to add 10,000 items to the Trie in the test code. This is a one-time cost. Trie initialization can (i.e., should) be deferred until a page has loaded.
Most commonly, you’ll initialize the Trie from an Array. If you need to add more data, you can add items to the Trie manually.
Tries add good value when searching is one of the main focuses of an application. Because of their initialization cost, using a Trie may not be a great fit for your application if your users don’t search a lot.
In each test, I typed “Cath” into the autocomplete input, and measured how long the search took.
Array.filter. It was 50% faster.
Ease of Use
As the code examples above show, using TrieSearch was no more complicated than using an Array. However, there are differences to keep in mind:
For more details on using TrieSearch, consult is documentation.
TrieSearch is 7.4kb unminified. It has 13 dependencies. The overall production build size is ~10kb. Determining whether or not this will have a significant impact on your application is up to you.
The Trie is an excellent data structure when you have an application that does a lot of searching. Tries are relatively easy to use — and they’re fast.
Using a Trie over an Array has some performance costs, such as:
- Trie Initialization. (Because of this, it is recommended that you defer the initialization of a Trie until after the page has loaded).
- Larger bundle size.