Before attempting this problem, you should be comfortable with: hash maps, tries (prefix trees), sorting with custom keys, and heaps.
Autocomplete works by finding sentences that share a common prefix with what the user has typed so far. A trie is ideal for prefix-based lookups. We build a trie where each node stores all sentences that pass through it along with their frequencies. As the user types, we walk down the trie character by character. At any node, we have direct access to all matching sentences and can sort them by frequency (highest first) and then alphabetically to return the top 3 results.
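Before the full class, the core idea can be sketched as a small standalone example. The helper names `Node`, `insert`, and `top3` here are illustrative, not part of the solution below:

```python
from collections import defaultdict

class Node:
    def __init__(self):
        self.children = {}
        self.counts = defaultdict(int)  # sentence -> frequency, for every sentence through this node

def insert(root, sentence, count):
    node = root
    for ch in sentence:
        node = node.children.setdefault(ch, Node())
        node.counts[sentence] += count  # record the sentence at every node on its path

def top3(root, prefix):
    node = root
    for ch in prefix:
        if ch not in node.children:
            return []
        node = node.children[ch]
    # Highest frequency first; ties broken alphabetically.
    ranked = sorted(node.counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [s for s, _ in ranked[:3]]

root = Node()
for s, t in [("i love you", 5), ("island", 3), ("ironman", 2), ("i love leetcode", 2)]:
    insert(root, s, t)
print(top3(root, "i"))  # ['i love you', 'island', 'i love leetcode']
```

Because every node along a sentence's path stores that sentence, a prefix query never has to traverse the subtree: the answer set is already sitting at the node where the walk ends.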
Each call to `input` handles three cases:

- If `c` is `#`, add the completed sentence to the trie, reset the current sentence and node, and return an empty list.
- Otherwise, append `c` to the current sentence. If `c` is not a child of the current node, move to the dead node and return an empty list.
- Otherwise, move to the child node for `c`. Retrieve all sentences stored at this node, sort them by frequency descending and then alphabetically, and return the top 3.

```python
from collections import defaultdict
from typing import List

class TrieNode:
    def __init__(self):
        self.children = {}
        self.sentences = defaultdict(int)

class AutocompleteSystem:
    def __init__(self, sentences: List[str], times: List[int]):
        self.root = TrieNode()
        for sentence, count in zip(sentences, times):
            self.add_to_trie(sentence, count)

        self.curr_sentence = []
        self.curr_node = self.root
        self.dead = TrieNode()

    def input(self, c: str) -> List[str]:
        if c == "#":
            curr_sentence = "".join(self.curr_sentence)
            self.add_to_trie(curr_sentence, 1)
            self.curr_sentence = []
            self.curr_node = self.root
            return []

        self.curr_sentence.append(c)
        if c not in self.curr_node.children:
            self.curr_node = self.dead
            return []

        self.curr_node = self.curr_node.children[c]
        sentences = self.curr_node.sentences
        sorted_sentences = sorted(sentences.items(), key=lambda x: (-x[1], x[0]))

        ans = []
        for i in range(min(3, len(sorted_sentences))):
            ans.append(sorted_sentences[i][0])

        return ans

    def add_to_trie(self, sentence, count):
        node = self.root
        for c in sentence:
            if c not in node.children:
                node.children[c] = TrieNode()
            node = node.children[c]
            node.sentences[sentence] += count
```

Time complexity:
constructor: O(n · k²). We add n sentences to the trie; each sentence passes through k nodes, and recording a length-k sentence key at each node costs O(k).

input: Updating currSentence and the trie both cost O(k). Next, we fetch and sort the sentences in the current node. Initially, a node could hold n sentences. After we call input m times, we could add m new sentences. Overall, there could be up to n + m sentences, so a sort would cost O((n + m) · log(n + m)). input is called m times, which gives us a total of O(m · (n + m) · log(n + m)).

Space complexity: O(n · k² + m · k), since each sentence is stored at every node along its path.

Where n is the length of sentences, k is the average length of all sentences, and m is the number of times input is called.
Sorting all matching sentences on every call is expensive when there are many matches. Since we only need the top 3 results, we can use a heap to find them more efficiently. By building a min-heap over the candidates and extracting just 3 items, we avoid a full sort. In languages with linear-time heapify (which Python's heapq.nsmallest can take advantage of), this gives a performance boost over sorting everything.
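The selection step can be seen in isolation. If counts are stored negated, `heapq.nsmallest` returns the highest-frequency sentences first, with ties falling through to the second tuple element for alphabetical order (the `counts` data here is illustrative):

```python
import heapq

# Frequencies are negated so the *smallest* tuples correspond to the
# highest-frequency sentences; ties compare on the sentence itself.
counts = {"i love you": 5, "island": 3, "ironman": 2, "i love leetcode": 2}
items = [(-freq, sentence) for sentence, freq in counts.items()]
best = heapq.nsmallest(3, items)
print([sentence for _, sentence in best])  # ['i love you', 'island', 'i love leetcode']
```

This is exactly the tuple trick the solution below uses, except that the solution stores the counts already negated in the trie, so no per-query negation is needed.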
Each call to `input` follows the same outline, but selects the top 3 with a heap:

- If `c` is `#`, add the completed sentence to the trie, reset state, and return empty.
- Otherwise, append `c` and navigate the trie. If the character does not exist, go to the dead node and return empty.
- Otherwise, use a min-heap to select the top 3 sentences. With negative counts, the smallest values represent the highest frequencies. Extract up to 3 items and return them.

```python
import heapq
from collections import defaultdict
from typing import List

class TrieNode:
    def __init__(self):
        self.children = {}
        self.sentences = defaultdict(int)

class AutocompleteSystem:
    def __init__(self, sentences: List[str], times: List[int]):
        self.root = TrieNode()
        for sentence, count in zip(sentences, times):
            self.add_to_trie(sentence, count)

        self.curr_sentence = []
        self.curr_node = self.root
        self.dead = TrieNode()

    def input(self, c: str) -> List[str]:
        if c == "#":
            curr_sentence = "".join(self.curr_sentence)
            self.add_to_trie(curr_sentence, 1)
            self.curr_sentence = []
            self.curr_node = self.root
            return []

        self.curr_sentence.append(c)
        if c not in self.curr_node.children:
            self.curr_node = self.dead
            return []

        self.curr_node = self.curr_node.children[c]
        items = [(val, key) for key, val in self.curr_node.sentences.items()]
        ans = heapq.nsmallest(3, items)
        return [item[1] for item in ans]

    def add_to_trie(self, sentence, count):
        node = self.root
        for c in sentence:
            if c not in node.children:
                node.children[c] = TrieNode()
            node = node.children[c]
            # Counts are stored negated so that nsmallest yields the
            # highest-frequency sentences first, ties broken alphabetically.
            node.sentences[sentence] -= count
```

This analysis will assume that you have access to a linear time heapify method, like in the Python implementation.
Time complexity:
constructor: O(n · k²), same as the previous approach.

input: Updating currSentence and the trie both cost O(k). Next, we fetch the sentences in the current node. Initially, a node could hold n sentences. After we call input m times, we could add m new sentences. Overall, there could be up to n + m sentences. We heapify these sentences and find the best 3 in linear time, which costs O(n + m). input is called m times, which gives us a total of O(m · (n + m)).

Space complexity: O(n · k² + m · k), since each sentence is stored at every node along its path.

Where n is the length of sentences, k is the average length of all sentences, and m is the number of times input is called.
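As a quick sanity check that the heap selection is a drop-in replacement for the full sort, the two strategies can be compared on random data. The helper names here are illustrative, and the check assumes both use the same (-frequency, sentence) ordering:

```python
import heapq
import random
import string

random.seed(0)

def top3_sort(counts):
    # Full sort: frequency descending, then alphabetical.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [s for s, _ in ranked[:3]]

def top3_heap(counts):
    # Heap selection over negated counts: same ordering, no full sort.
    items = [(-freq, s) for s, freq in counts.items()]
    return [s for _, s in heapq.nsmallest(3, items)]

counts = {"".join(random.choices(string.ascii_lowercase, k=5)): random.randint(1, 10)
          for _ in range(200)}
assert top3_sort(counts) == top3_heap(counts)  # identical top-3 on the same data
```

The results agree because both rank by the same composite key; only the amount of work differs once a node holds many candidate sentences.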
When the user presses #, you must add the completed sentence to the trie, reset the current sentence, and return the trie pointer to root. Missing any of these steps causes incorrect behavior on subsequent inputs.
```python
# Wrong: forgets to reset the current node
if c == "#":
    self.add_to_trie("".join(self.curr_sentence), 1)
    self.curr_sentence = []
    return []  # Missing: self.curr_node = self.root

# Correct
if c == "#":
    self.add_to_trie("".join(self.curr_sentence), 1)
    self.curr_sentence = []
    self.curr_node = self.root
    return []
```

When a character doesn't exist in the current node's children, the system should return empty results for all subsequent characters until # is pressed. Using a "dead" node pattern handles this cleanly.
```python
# Wrong: crashes on a missing child
self.curr_node = self.curr_node.children[c]  # KeyError

# Correct: use a dead node for invalid paths
if c not in self.curr_node.children:
    self.curr_node = self.dead  # dead is an empty TrieNode
    return []
self.curr_node = self.curr_node.children[c]
```

Results must be sorted by frequency (highest first), then alphabetically for ties. Reversing this order or sorting in the wrong direction gives incorrect top-3 results.
```python
# Wrong: sorts alphabetically first, then by frequency
sorted_sentences = sorted(sentences.items(), key=lambda x: (x[0], -x[1]))

# Correct: frequency descending, then alphabetically
sorted_sentences = sorted(sentences.items(), key=lambda x: (-x[1], x[0]))
```

Each trie node must store all sentences that pass through it, not just the final node. Otherwise, prefix queries won't find matching sentences.
```python
# Wrong: only stores the sentence at the final node
def add_to_trie(self, sentence, count):
    node = self.root
    for c in sentence:
        if c not in node.children:
            node.children[c] = TrieNode()
        node = node.children[c]
    node.sentences[sentence] += count  # Only at the end

# Correct: store the sentence at every node along the path
def add_to_trie(self, sentence, count):
    node = self.root
    for c in sentence:
        if c not in node.children:
            node.children[c] = TrieNode()
        node = node.children[c]
        node.sentences[sentence] += count  # At every node
```

The problem requires returning at most 3 results. Forgetting to limit the output or using the wrong slice gives incorrect answers.
```python
# Wrong: returns all matching sentences
return [s[0] for s in sorted_sentences]

# Correct: return at most 3
return [s[0] for s in sorted_sentences[:3]]
```