1. Trie

class TrieNode:
    def __init__(self):
        self.children = {}
        self.sentences = defaultdict(int)

class AutocompleteSystem:
    def __init__(self, sentences: List[str], times: List[int]):
        self.root = TrieNode()
        for sentence, count in zip(sentences, times):
            self.add_to_trie(sentence, count)
            
        self.curr_sentence = []
        self.curr_node = self.root
        self.dead = TrieNode()
        
    def input(self, c: str) -> List[str]:
        if c == "#":
            curr_sentence = "".join(self.curr_sentence)
            self.add_to_trie(curr_sentence, 1)
            self.curr_sentence = []
            self.curr_node = self.root
            return []
        
        self.curr_sentence.append(c)
        if c not in self.curr_node.children:
            self.curr_node = self.dead
            return []
        
        self.curr_node = self.curr_node.children[c]
        sentences = self.curr_node.sentences
        sorted_sentences = sorted(sentences.items(), key = lambda x: (-x[1], x[0]))
        
        ans = []
        for i in range(min(3, len(sorted_sentences))):
            ans.append(sorted_sentences[i][0])
        
        return ans

    def add_to_trie(self, sentence, count):
        node = self.root
        for c in sentence:
            if c not in node.children:
                node.children[c] = TrieNode()
            node = node.children[c]
            node.sentences[sentence] += count

Time & Space Complexity

  • Time complexity: O(nk+m(n+mk)log(n+mk))O(n \cdot k + m \cdot (n + \frac{m}{k}) \cdot \log(n + \frac{m}{k}))

    constructor:

    • We initialize the trie, which costs O(nk)O(n \cdot k) as we iterate over each character in each sentence.

    input:

    • We add a character to currSentence and the trie, both cost O(1)O(1). Next, we fetch and sort the sentences in the current node. Initially, a node could hold O(n)O(n) sentences. After we call input mm times, we could add mk\frac{m}{k} new sentences. Overall, there could be up to O(n+mk)O(n + \frac{m}{k}) sentences, so a sort would cost O((n+mk)log(n+mk))O((n + \frac{m}{k}) \cdot \log(n + \frac{m}{k})).
    • The work done in the other cases (like adding a new sentence to the trie) will be dominated by this sort.
    • input is called mm times, which gives us a total of O(m(n+mk)log(n+mk))O(m \cdot (n + \frac{m}{k}) \cdot \log(n + \frac{m}{k}))
  • Space complexity: O(k(nk+m))O(k \cdot (n \cdot k + m))

Where nn is the length of sentences, kk is the average length of all sentences, and mm is the number of times input is called.


2. Optimize with Heap

class TrieNode:
    def __init__(self):
        self.children = {}
        self.sentences = defaultdict(int)

class AutocompleteSystem:
    def __init__(self, sentences: List[str], times: List[int]):
        self.root = TrieNode()
        for sentence, count in zip(sentences, times):
            self.add_to_trie(sentence, count)
            
        self.curr_sentence = []
        self.curr_node = self.root
        self.dead = TrieNode()
        
    def input(self, c: str) -> List[str]:
        if c == "#":
            curr_sentence = "".join(self.curr_sentence)
            self.add_to_trie(curr_sentence, 1)
            self.curr_sentence = []
            self.curr_node = self.root
            return []
        
        self.curr_sentence.append(c)
        if c not in self.curr_node.children:
            self.curr_node = self.dead
            return []
        
        self.curr_node = self.curr_node.children[c]
        items = [(val, key) for key, val in self.curr_node.sentences.items()]
        ans = heapq.nsmallest(3, items)
        return [item[1] for item in ans]

    def add_to_trie(self, sentence, count):
        node = self.root
        for c in sentence:
            if c not in node.children:
                node.children[c] = TrieNode()
            node = node.children[c]
            node.sentences[sentence] -= count

Time & Space Complexity

This analysis will assume that you have access to a linear time heapify method, like in the Python implementation.

  • Time complexity: O(nk+m(n+mk))O(n \cdot k + m \cdot (n + \frac{m}{k}))

    constructor:

    • We initialize the trie, which costs O(nk)O(n \cdot k) as we iterate over each character in each sentence.

    input:

    • We add a character to currSentence and the trie, both cost O(1)O(1). Next, we fetch the sentences in the current node. Initially, a node could hold O(n)O(n) sentences. After we call input mm times, we could add mk\frac{m}{k} new sentences. Overall, there could be up to O(n+mk)O(n + \frac{m}{k}) sentences. We heapify these sentences and find the best 3 in linear time, which costs O(n+mk)O(n + \frac{m}{k}).
    • The work done in the other cases (like adding a new sentence to the trie) will be dominated by this.
    • input is called mm times, which gives us a total of O(m(n+mk))O(m \cdot (n + \frac{m}{k})).
  • Space complexity: O(k(nk+m))O(k \cdot (n \cdot k + m))

Where nn is the length of sentences, kk is the average length of all sentences, and mm is the number of times input is called.