# Tries Interview Questions & Tips

By The Mighty Anomaly and Kenny Polyak | Published: July 3, 2023

## What Is a Trie?

A trie, also known as prefix tree, is a particular application of an n-ary tree - recall that an n-ary tree is just a generic tree with no restrictions - where each node represents a character in a string.

Most commonly, tries are used to store string data in an efficient manner, where each node is a character in the string starting at the root. The root node itself is merely an access node, and its children are all the first letters of the strings stored in the trie. Here's a simple trie storing the strings "ape", "ban", and "dad".

### Why a Trie?

The trie's strength really becomes apparent when it comes to storing strings with overlapping prefixes. Trees are directional and acyclic, meaning all the edges starting from the root will never move backward, and all the paths will end at a leaf. As such, when we store strings in a trie, the sequence of characters are represented by all the possible unique paths that can be traversed through the tree. This feature allows a tree to merge the prefixes that are common across multiple strings.

Let's imagine we want to store the following strings in a trie: "apple", "ape", "ban". In the corresponding trie, we can see that although there are three strings being stored, the root only has two children, since both "apple" and "ape" can share the "A" node. Further, both these strings can also share the "P" node, since they share the prefix "ap". At massive scale, a trie can greatly reduce the space needed to store strings with overlapping prefixes, which is particularly useful when storing something like a large dictionary.

Furthermore, we get efficiency gains when searching for a prefix in the trie. When searching for a string, we traverse down the tree from the root, checking if the next letter in our target string exists as a child on our current node. With every move down the tree, we reduce our search space to the remaining subtree. We'll take a closer look at the search implementation below.

## Trie Implementation

Since a trie is merely a particular application of an n-ary tree, there are many variations on how it can be implemented. Let's take a look at the common implementation using a hash map, and discuss some of the tradeoffs we can make with node storage.

### Trie Node

Tries can be implemented differently depending on the requirements of the problem, but the most common is with the use of hashmap. In loosely-typed languages, like JavaScript, we can even get away with not defining a Node class at all and instead simply using object literals throughout.

Here's a simple trie node class definition. Note that the node's children are stored in a hashmap, where the value of the child node is the key, which points to the actual TrieNode being referenced.

``````class TrieNode:
def __init__(self):
self._children = {}
self._data = None``````

Like with many data structure implementations, there is an opportunity to pre-process certain data to be stored directly on the node, at the expense of additional space. Let's take a look at a common problem involving tries - auto-complete - to help illustrate some of the variations on trie node implementation.

The auto-complete problem usually gives us a list of strings as a dictionary, an input prefix from the user, and the task of finding how many full strings can be composed with the given prefix.

### Dictionary

Since we're dealing with a dictionary, we already know that a trie is a preferred way to store and model the string data at scale, as well as search for a given prefix. One common implementation for storing words is to use a boolean to indicate if a node is at the end of a word. This way, when we are traversing the trie in search for a string, if the last letter of the target string falls on a node that is the end of a word, then the target exists.

Here's an example TrieNode:

``````class TrieNode:
def __init__(self):
# Assuming lowercase English letters
self._children = [None]*26
self._is_end_of_word = False
``````

### Auto-Complete

Now that we have a trie that allows us to find words, let's implement our auto-complete feature.

After building the trie, we could then implement a function on the TrieNode that traverses the trie from this node and tracks the paths to each leaf as well as nodes with the `isEndOfWord` boolean set to `true`. This list of paths would represent the words that can be formed using the input prefix.

Alternatively, we could pre-process this information whenever a node is added and cache this data on each node. When a string is inserted into the Trie, we perform the above traversal to get a list of all the possible words that can be formed using the input prefix - we then store this list directly on the node. As a result, when we find a prefix, we can get the auto-complete suggestions in constant time.

``````class TrieNode:
def __init__(self):
# Assuming lowercase English letters
self._children = [None]*26
self._is_end_of_word = False
self._words = []
``````

As you can imagine, these design decisions have time and space tradeoffs. Performing the necessary pre-processing to get the list of possible words is time intensive on insert, but can improve efficiency for read-heavy use-cases.

### Frequency Count, Prefixes, and Other Metadata

There are other pieces of metadata that are commonly pre-processed and stored on trie nodes:

• Frequency count: how many times a given string or prefix occurs in the trie. This can be useful for auto-pruning the trie.
• Prefixes: if you want to know the path that was taken to get to a given node in the trie, you can store the path prefix as the node value.

Like with implementing the dictionary and auto-complete, these trie nodes are updated upon insertion of a string: we store the path traversed so far on each step of the traversal and we increment the frequency count on the node we end on.

Let's take a closer look at some common trie operations in the next section!

## Search

Searching through a trie involves traversal following a specific path laid out by a target string. We can treat the input string as a desired path through the trie, starting at the children of the root node. Imagining this as a mere depth-first traversal through the tree, we recurse on each subsequent character in the target word, so long as it exists. If we run out of characters, we check if the current node is labeled as being the end of a word.

``````class Trie:
def __init__(self):
self.root = TrieNode()

def search(self, word):
curr = self.root

for c in word:
index = ord(c) - ord('a')

if curr._children[index] is None:
return False

curr = curr._children[index]

return curr is not None and curr._is_end_of_word
``````

### Time and Space Complexity

Time complexity: `O(n)`, where `n` is the length of the word. Space complexity: `O(m*k)`, where `m` is the number of words with an average length of `k`. This is because each character in each word would need a separate node in the trie. However, if there are common prefixes among the words, the space complexity can be significantly reduced.

## Insert

Insert is also a traversal where we're searching for the right place to put a string. Once there is no longer a "correct" place, we just create new nodes and set them as children.

``````class Trie:
def __init__(self):
self.root = TrieNode()

def search(self, word):
curr = self.root

for c in word:
index = ord(c) - ord('a')

if curr._children[index] is None:
return False

curr = curr._children[index]

return curr is not None and curr._is_end_of_word
``````

Let's imagine we're adding the word "application" to this tree from before:

In this case, we would not add any nodes until we reached the "L" that ends the prefix "appl". At this point, since there's no "I" neighbor on the current node, we create a new trie node with this new value. This process continues for the remaining characters in the input string.

### Time and Space Complexity

Time complexity: `O(n)`, where `n` is the length of the word. Space complexity: `O(n)`, where `n` is the length of the word.

### When to Use a Trie in Technical Interviews

Tries are popular topics in interviews because they do a great job of testing that you understand trees beyond the most basic level. Knowing when to use a trie requires a strong understanding of the real life use-cases for tries.

Whenever a problem involves overlapping prefixes of data that need to be efficiently accessed, think trie. Tries are ideal for problems involving dictionaries or string prefixes, as we can use the structure to efficiently search if a given word is in the dictionary as well as count its frequency. These types of problems include:

• Autocomplete: Tries are particularly well-suited for implementing autocomplete functionality or generating word suggestions based on partial input. Each node is a letter in a possible word, so when we traverse the trie by following the characters of our search string, we are eliminating significant portions of the search space with each decision. By storing a dictionary of words in a trie, you can efficiently find all words that match a given prefix or partial input.
• Spell Checking: In a similar fashion, tries can be utilized for spell-checking algorithms. If you store the dictionary words in a trie, you can easily check if a given word is spelled correctly or suggest alternative words based on the prefix by traversing.
• Word Frequency Counting: Tries can also be used for counting the frequency of words in a text or document. By incrementing a counter on each node that a prefix ends on, we can then efficiently keep track of and query the frequency of the words.

Another common case where prefixes are used to structure data - networking.

• IP Routing: Tries are commonly used in network routing algorithms, where they can efficiently store and search for network prefixes such as IP addresses. Tries are especially useful for tiered data sets where many strings contain common prefixes, since we don't need to store duplicate prefixes.
• Longest Prefix Matching: Tries can be used in scenarios where you need to find the longest matching prefix. This is commonly employed in network protocols, such as IP routing or IP address subnet matching. Once again, we're using the efficient searching capabilities in tries, thanks to the way that strings are implemented as traversal paths in the trie. Tries are popular topics in interviews because they do a great job of testing that you understand trees beyond the most basic level. Knowing when to use a trie requires a strong understanding of the real life use-cases for tries.

Whenever a problem involves overlapping prefixes of data that need to be efficiently accessed, think trie. Tries are ideal for problems involving dictionaries or string prefixes, as we can use the structure to efficiently search if a given word is in the dictionary as well as count its frequency. These types of problems include:

## Common Mistakes in Interviews Featuring Tries

• Neglecting to discuss the performance tradeoffs with node structure: Don't assume the interviewer is prioritizing time complexity. Although a problem might be efficiently solved for time complexity with a trie, be sure to mention the tradeoff in space complexity.
• Misunderstanding time and space complexity: Candidates often fail to properly analyze trie complexity, given the many variations in implementations. Preprocessing, while it does add some time expense on insertion, does not impact the worst cast time complexity. For space complexity, it can have an impact if we are storing strings with variable lengths, in which case we might imagine `O(m*n)` space complexity, where m is the average length of the words in the trie.
• Not knowing how to construct a trie: The most common way is to make a single class with a dictionary of dictionaries representing nodes and children, but it can also be implemented with an actual Node class.
• Neglecting case-sensitivity and character normalization: When handling strings, we need to always consider if we need to implement some character normalization, as this will address invalid characters as well as case sensitivity issues.
• Lack of frequency counter: When constructing a trie to store a list of strings, a common issue that arises is how to handle duplicates. In some cases, an interviewer (or the problem itself) will allow you to de-duplicate the original list, such that you don't need to worry about representing duplicates. But a likely follow-up will be to implement a solution that can track duplicates - this is a great opportunity to use frequency counters on the trie nodes, which specifically tracks the number of strings that end at a given node.

## About the Authors

The Mighty Anomaly

The Mighty Anomaly (Mike) is one of the top Google coaches for interviewing.io having personally helped over 100+ engineers get into Google with his unique coaching style. After receiving offers from the full FAANG gauntlet early in his career, he enjoys teaching others proven techniques to pass technical interviews with his decade of coaching and industry experience.

Kenny Polyak

Kenny is a software engineer and technical leader with four years of professional experience spanning Amazon, Wayfair, and U.S. Digital Response. He has taught courses on Data Structures and Algorithms at Galvanize, helping over 30 students land new software engineering roles across the industry, and has personally received offers from Google, Square, and TikTok.