Tries: The Data Structure Behind Autocomplete
How tries (prefix trees) work internally, insert/search/prefix operations, space trade-offs, and when you should just use a hash set instead.
A trie (pronounced "tree" by some, "try" by others — the debate will never end) is a tree-shaped data structure where each node represents a character, and paths from root to nodes spell out strings. The name comes from "retrieval," which tells you exactly what it's good at.
If you've ever wondered how your phone keyboard suggests words after you type two letters, a trie is one of the core data structures that makes that possible.
How It Works Internally
A trie node typically looks like this:
class TrieNode:
def __init__(self):
self.children = {} # char -> TrieNode
self.is_end = False # marks end of a complete word
The root node is empty. Each edge represents a character. To store the word "cat", you'd have a path: root -> 'c' -> 'a' -> 't', with is_end = True on the 't' node.
Store "car" too, and the 'c' and 'a' nodes are shared. Only 'r' branches off as a new child of 'a', alongside 't'. That's the whole insight — common prefixes share structure.
Core Operations
Insert
Walk down the trie character by character. If a child for the current character exists, follow it. If not, create a new node. When you've processed all characters, mark the final node as is_end = True.
def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end = True
Time: O(m) where m is the word length. Can't beat that.
Search
Same traversal as insert, but you're just following existing edges. If you hit a missing child, the word isn't there. If you reach the end, check is_end — because "car" being in the trie doesn't mean "ca" is a stored word.
def search(self, word):
node = self.root
for char in word:
if char not in node.children:
return False
node = node.children[char]
return node.is_end
Prefix Search (startsWith)
Identical to search, but you don't check is_end. If you can walk the entire prefix without hitting a dead end, at least one word with that prefix exists.
def starts_with(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return False
node = node.children[char]
return True
This is what makes tries special. A hash set can tell you if "catalog" exists, but it can't efficiently tell you all words starting with "cat". A trie answers that in O(prefix length) time, then you can DFS from that node to collect all completions.
The Space Problem
Here's the thing nobody mentions in the intro lecture: tries can be enormous.
If you're using a dictionary (hash map) for children, you're paying per-node overhead for each node object plus the hash map. For a large English dictionary, this can be 5-10x more memory than just storing the strings in an array.
If you use a fixed-size array of 26 children (for lowercase English), you're allocating 26 pointers per node even when most are null. That's better for speed (direct indexing) but potentially worse for space if the trie is sparse.
Compressed tries (also called radix trees or Patricia tries) help — they collapse chains of single-child nodes into one node storing a substring instead of a single character. "category" doesn't need 8 nodes if no other word shares that path beyond "cat".Where Tries Actually Get Used
Autocomplete systems — type a prefix, get suggestions. This is the poster child use case. Spell checkers — combine with edit distance to suggest corrections for misspelled words. IP routing — longest prefix matching on binary representations of IP addresses. Routers use specialized tries (binary tries) for this. Word games — Boggle solvers, Scrabble helpers. You need to quickly check "is this a valid prefix?" to prune the search space. A hash set can't do that efficiently. XOR problems — a niche but common interview pattern. Store numbers bit-by-bit in a trie, then find maximum XOR pairs by greedily choosing the opposite bit at each level.When a Hash Set Is Just Better
If all you need is "does this exact word exist?" — use a hash set. O(1) average lookup, way less memory, simpler code.
Tries win specifically when you need prefix operations: find all words with a prefix, count words with a prefix, longest common prefix, etc. If the problem doesn't involve prefixes, a trie is probably overkill.
The Interview Angle
The Trie implementation question ("Implement Trie with insert, search, and startsWith") is one of the most common medium-difficulty problems. It's straightforward once you understand the structure, but it's a gatekeeper — if you haven't seen a trie before, you'll struggle to invent one on the spot.
Beyond the basic implementation, tries show up in:
- Word Search II — search a board for multiple words simultaneously using a trie to prune paths
- Longest Word in Dictionary — build incrementally using trie traversal
- Map Sum Pairs — trie with values at nodes for prefix sum queries
These problems test whether you can adapt the basic trie structure to serve a specific purpose, not just implement it from a textbook.
Practice building tries from scratch on CodeUp — the interactive environment makes it easy to visualize what your trie looks like at each step and verify that prefix operations behave as expected.