You are given a string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings.
Return the minimum number of extra characters left over if you break up s optimally.
Note that the same word in the dictionary may be reused multiple times.
Example 1:
Input: s = "neetcodes", dictionary = ["neet","code","neetcode"]
Output: 1Explanation: The optimal way is to break s into two substrings: "neet" from index 0 to 3 and "code" from index 4 to 7. There is one character which is at index 8.
Example 2:
Input: s = "neetcodde", dictionary = ["neet","code","neetcode"]
Output: 5Explanation: The optimal way is to break s into one substring: "neet" from index 0 to 3. The characters at indices from 4 to 8 are extra.
Constraints:
1 <= s.length <= 501 <= dictionary.length <= 501 <= dictionary[i].length <= 50s and dictionary[i] consist of only lowercase English lettersdictionary contains distinct wordsAt each position in the string, we have two choices: either skip the current character (counting it as extra), or try to match a dictionary word starting at this position. If a word matches, we can jump past it without counting those characters as extra.
This recursive approach explores all possibilities. For each index, we take the minimum of skipping one character versus matching any dictionary word that starts at that index.
dfs(i) that returns the minimum extra characters from index i to the end:i == len(s), return 0 (no characters left).res = 1 + dfs(i + 1) (skip current character).j from i to len(s) - 1:s[i:j+1] is in the dictionary, update res = min(res, dfs(j + 1)).res.dfs(0).class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
words = set(dictionary)
def dfs(i):
if i == len(s):
return 0
res = 1 + dfs(i + 1)
for j in range(i, len(s)):
if s[i:j + 1] in words:
res = min(res, dfs(j + 1))
return res
return dfs(0)Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
The recursive solution has overlapping subproblems since we may compute the answer for the same index multiple times. Memoization stores results so each subproblem is solved only once.
Using a hash set for the dictionary allows efficient substring lookups. The memoization table dp[i] stores the minimum extra characters from index i onward.
dp with dp[len(s)] = 0 as the base case.dfs(i):i is already in dp, return dp[i].res = 1 + dfs(i + 1).j from i to len(s) - 1, if s[i:j+1] is in the set, update res = min(res, dfs(j + 1)).dp[i] = res.dfs(0).class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
words = set(dictionary)
dp = {len(s): 0}
def dfs(i):
if i in dp:
return dp[i]
res = 1 + dfs(i + 1)
for j in range(i, len(s)):
if s[i:j + 1] in words:
res = min(res, dfs(j + 1))
dp[i] = res
return res
return dfs(0)Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
Instead of recursion with memoization, we can fill the DP table iteratively from right to left. dp[i] represents the minimum extra characters when starting from index i. Since dp[i] depends on dp[j+1] for j >= i, processing from right to left ensures all dependencies are resolved.
dp of size n + 1, initialized to 0.i from n - 1 down to 0:dp[i] = 1 + dp[i + 1] (skip current character).j from i to n - 1, if s[i:j+1] is in the set, update dp[i] = min(dp[i], dp[j + 1]).dp[0].class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
words = set(dictionary)
n = len(s)
dp = [0] * (n + 1)
for i in range(n - 1, -1, -1):
dp[i] = 1 + dp[i + 1]
for j in range(i, n):
if s[i:j + 1] in words:
dp[i] = min(dp[i], dp[j + 1])
return dp[0]Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
Instead of checking all substrings against a hash set, we can iterate through dictionary words directly. For each position, we check if any dictionary word matches starting at that position. This can be faster when the dictionary is small relative to the string length.
dp with dp[len(s)] = 0.dfs(i):i is in dp, return dp[i].res = 1 + dfs(i + 1).word in the dictionary:i + len(word) <= len(s) and s[i:i+len(word)] == word, update res = min(res, dfs(i + len(word))).dp[i] = res.dfs(0).class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
dp = {len(s) : 0}
def dfs(i):
if i in dp:
return dp[i]
res = 1 + dfs(i + 1)
for word in dictionary:
if i + len(word) > len(s):
continue
flag = True
for j in range(len(word)):
if s[i + j] != word[j]:
flag = False
break
if flag:
res = min(res, dfs(i + len(word)))
dp[i] = res
return res
return dfs(0)Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
This is the iterative version of the previous approach, iterating through dictionary words at each position rather than checking all possible substrings. It avoids recursion overhead while maintaining the same logic.
dp of size n + 1, initialized to 0.i from n - 1 down to 0:dp[i] = 1 + dp[i + 1].word in the dictionary:i + len(word) <= n and s[i:i+len(word)] == word, update dp[i] = min(dp[i], dp[i + len(word)]).dp[0].class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
n = len(s)
dp = [0] * (n + 1)
for i in range(n - 1, -1, -1):
dp[i] = 1 + dp[i + 1]
for word in dictionary:
if i + len(word) <= n and s[i:i + len(word)] == word:
dp[i] = min(dp[i], dp[i + len(word)])
return dp[0]Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
A Trie (prefix tree) allows efficient prefix matching. Instead of checking each dictionary word independently, we traverse the Trie character by character. If we hit a dead end (no child for the current character), we stop early. This is faster than checking each word when there are many dictionary words sharing common prefixes.
dp with dp[len(s)] = 0.dfs(i):i is in dp, return dp[i].res = 1 + dfs(i + 1).root:j from i to len(s) - 1:s[j] is not a child of the current node, break.res = min(res, dfs(j + 1)).dp[i] = res.dfs(0).class TrieNode:
def __init__(self):
self.children = {}
self.isWord = False
class Trie:
def __init__(self):
self.root = TrieNode()
def addWord(self, word):
curr = self.root
for c in word:
if c not in curr.children:
curr.children[c] = TrieNode()
curr = curr.children[c]
curr.isWord = True
class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
trie = Trie()
for w in dictionary:
trie.addWord(w)
dp = {len(s): 0}
def dfs(i):
if i in dp:
return dp[i]
res = 1 + dfs(i + 1)
curr = trie.root
for j in range(i, len(s)):
if s[j] not in curr.children:
break
curr = curr.children[s[j]]
if curr.isWord:
res = min(res, dfs(j + 1))
dp[i] = res
return res
return dfs(0)Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
This combines the bottom-up DP approach with Trie-based matching. We iterate from right to left, and at each position, we traverse the Trie to find all dictionary words that start at that position. The Trie allows early termination when no dictionary word can match the current prefix.
dp of size n + 1, initialized to 0.i from n - 1 down to 0:dp[i] = 1 + dp[i + 1].root and traverse:j from i to n - 1:s[j] is not a child, break.dp[i] = min(dp[i], dp[j + 1]).dp[0].class TrieNode:
def __init__(self):
self.children = {}
self.isWord = False
class Trie:
def __init__(self):
self.root = TrieNode()
def addWord(self, word):
curr = self.root
for c in word:
if c not in curr.children:
curr.children[c] = TrieNode()
curr = curr.children[c]
curr.isWord = True
class Solution:
def minExtraChar(self, s: str, dictionary: List[str]) -> int:
trie = Trie()
for w in dictionary:
trie.addWord(w)
n = len(s)
dp = [0] * (n + 1)
for i in range(n - 1, -1, -1):
dp[i] = 1 + dp[i + 1]
curr = trie.root
for j in range(i, n):
if s[j] not in curr.children:
break
curr = curr.children[s[j]]
if curr.isWord:
dp[i] = min(dp[i], dp[j + 1])
return dp[0]Where is the length of the string , is the number of words in the dictionary, and is the average length of a word in the dictionary.
When checking if s[i:j+1] matches a dictionary word, getting the indices wrong is a common mistake. In Python, s[i:j] excludes index j, so you need s[i:j+1] to include the character at position j. Similarly, when jumping to the next position after a match, you should move to j+1, not j.
At each position, you must always consider the option of skipping the current character (counting it as extra) with cost 1 + dp[i+1]. If you only check dictionary matches without this fallback, positions where no dictionary word starts will cause incorrect results or infinite loops.
The naive approach of checking all possible substrings against the dictionary for every position leads to exponential time complexity. Without memoization, the same subproblems are solved repeatedly. For large inputs, either use dynamic programming with memoization or build a Trie to efficiently find all dictionary words starting at each position.