Given two strings text1 and text2, return the length of the longest common subsequence between the two strings if one exists, otherwise return 0.
A subsequence is a sequence that can be derived from the given sequence by deleting some or no elements without changing the relative order of the remaining characters.
"cat" is a subsequence of "crabt".A common subsequence of two strings is a subsequence that exists in both strings.
Example 1:
Input: text1 = "cat", text2 = "crabt"
Output: 3 Explanation: The longest common subsequence is "cat" which has a length of 3.
Example 2:
Input: text1 = "abcd", text2 = "abcd"
Output: 4Example 3:
Input: text1 = "abcd", text2 = "efgh"
Output: 0Constraints:
1 <= text1.length, text2.length <= 1000text1 and text2 consist of only lowercase English characters.
You should aim for a solution as good or better than O(m * n) time and O(m * n) space, where m is the length of the string text1 and n is the length of the string text2.
Try to think in terms of recursion and visualize it as a decision tree. Can you determine the possible decisions at each step? Maybe you should consider iterating through both strings recursively at the same time.
At each recursion step, we have two choices: if the characters at the current indices of both strings match, we move both indices forward and extend the subsequence. Otherwise, we explore two paths by advancing one index at a time and recursively finding the longest subsequence. We return the maximum length between these two choices. This approach is exponential. Can you think of a way to optimize it?
We return 0 if either index goes out of bounds. To optimize, we can use memoization to cache recursive call results and avoid redundant calculations. A hash map or a 2D array can be used to store these results.
Before attempting this problem, you should be comfortable with:
A subsequence is a sequence derived by deleting some or no characters without changing the order of the remaining elements. To find the longest common subsequence (LCS) of two strings, we compare characters one by one. If the current characters match, they contribute to the LCS, and we move both pointers forward. If they don't match, we try skipping a character from either string and take the best result. This naturally leads to a recursive approach that explores all possibilities.
dfs(i, j) where i and j are the current indices in text1 and text2.0.text1[i] == text2[j], include this character in the LCS and recurse with i + 1 and j + 1.text1[i] or skip text2[j], and return the maximum.dfs(0, 0).Where is the length of the string and is the length of the string .
The recursive solution recalculates the same subproblems many times. For example, dfs(2, 3) might be called from multiple branches. By storing results in a memo table, we avoid redundant work. This transforms the exponential solution into a polynomial one.
(i, j).dfs(i, j), check if the result is already cached.class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
memo = {}
def dfs(i, j):
if i == len(text1) or j == len(text2):
return 0
if (i, j) in memo:
return memo[(i, j)]
if text1[i] == text2[j]:
memo[(i, j)] = 1 + dfs(i + 1, j + 1)
else:
memo[(i, j)] = max(dfs(i + 1, j), dfs(i, j + 1))
return memo[(i, j)]
return dfs(0, 0)Where is the length of the string and is the length of the string .
Instead of starting from the beginning and recursing forward, we can fill a 2D table iteratively from the end. The value dp[i][j] represents the LCS length for substrings text1[i:] and text2[j:]. By processing indices in reverse order, we ensure that when we compute dp[i][j], the values we depend on (dp[i+1][j+1], dp[i+1][j], dp[i][j+1]) are already computed.
dp of size (m+1) x (n+1) initialized to 0.i from m-1 down to 0, and j from n-1 down to 0.text1[i] == text2[j], set dp[i][j] = 1 + dp[i+1][j+1].dp[i][j] = max(dp[i+1][j], dp[i][j+1]).dp[0][0].class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
dp = [[0 for j in range(len(text2) + 1)]
for i in range(len(text1) + 1)]
for i in range(len(text1) - 1, -1, -1):
for j in range(len(text2) - 1, -1, -1):
if text1[i] == text2[j]:
dp[i][j] = 1 + dp[i + 1][j + 1]
else:
dp[i][j] = max(dp[i][j + 1], dp[i + 1][j])
return dp[0][0]Where is the length of the string and is the length of the string .
Looking at the bottom-up recurrence, each cell dp[i][j] only depends on the current row and the next row. We don't need the entire 2D table; two 1D arrays suffice. We keep a prev array for the next row and a curr array for the current row, swapping them after each row is processed.
text1 is shorter, swap the strings to minimize space usage.prev and curr of size n+1.i from m-1 down to 0:j from n-1 down to 0, compute curr[j] using prev[j+1], prev[j], and curr[j+1].prev and curr.prev[0].class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
if len(text1) < len(text2):
text1, text2 = text2, text1
prev = [0] * (len(text2) + 1)
curr = [0] * (len(text2) + 1)
for i in range(len(text1) - 1, -1, -1):
for j in range(len(text2) - 1, -1, -1):
if text1[i] == text2[j]:
curr[j] = 1 + prev[j + 1]
else:
curr[j] = max(curr[j + 1], prev[j])
prev, curr = curr, prev
return prev[0]Where is the length of the string and is the length of the string .
We can reduce space further by using a single array and a temporary variable. When iterating right to left within a row, we need the old value at position j (which becomes dp[i+1][j] in 2D terms) before overwriting it. We store this in a variable prev before the update, then use it for the diagonal reference in the next iteration.
text1 is shorter, swap strings for minimal space.dp of size n+1.i from m-1 down to 0:prev = 0 (represents dp[i+1][n]).j from n-1 down to 0:temp = dp[j] (the old value before update).dp[j] = 1 + prev.dp[j] = max(dp[j], dp[j+1]).prev = temp.dp[0].class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
if len(text1) < len(text2):
text1, text2 = text2, text1
dp = [0] * (len(text2) + 1)
for i in range(len(text1) - 1, -1, -1):
prev = 0
for j in range(len(text2) - 1, -1, -1):
temp = dp[j]
if text1[i] == text2[j]:
dp[j] = 1 + prev
else:
dp[j] = max(dp[j], dp[j + 1])
prev = temp
return dp[0]Where is the length of the string and is the length of the string .
A subsequence does not require consecutive characters, whereas a substring does. A common mistake is to reset the count when characters do not match, which would find the longest common substring instead of subsequence. In LCS, when characters do not match, you take the maximum of skipping either character.
When using a 2D DP table, the dimensions should be (m+1) x (n+1) to account for the base case of empty strings. Accessing dp[i+1][j+1] when characters match requires careful attention to avoid index out of bounds errors.
In the bottom-up approach, iterating in the wrong direction can cause the algorithm to use uncomputed values. When processing from the end of the strings backward, ensure that dp[i][j] is computed after dp[i+1][j+1], dp[i+1][j], and dp[i][j+1] are already available.