10. Regular Expression Matching - Explanation

Problem Link

Description

You are given an input string s consisting of lowercase english letters, and a pattern p consisting of lowercase english letters, as well as '.', and '*' characters.

Return true if the pattern matches the entire input string, otherwise return false.

  • '.' Matches any single character
  • '*' Matches zero or more of the preceding element.

Example 1:

Input: s = "aa", p = ".b"

Output: false

Explanation: Regardless of which character we choose for the '.' in the pattern, we cannot match the second character in the input string.

Example 2:

Input: s = "nnn", p = "n*"

Output: true

Explanation: '*' means zero or more of the preceding element, 'n'. We choose 'n' to repeat three times.

Example 3:

Input: s = "xyz", p = ".*z"

Output: true

Explanation: The pattern ".*" means zero or more of any character, so we choose ".." to match "xy" and "z" to match "z".

Constraints:

  • 1 <= s.length <= 20
  • 1 <= p.length <= 20
  • Each appearance of '*', will be preceded by a valid character or '.'.


Topics

Recommended Time & Space Complexity

You should aim for a solution as good or better than O(m * n) time and O(m * n) space, where m is the length of the string s and n is the length of the string p.


Hint 1

Try to think in terms of recursion and visualize it as a decision tree, where we explore different combinations to match the strings when encountering *. Multiple decisions are made at each step to find a valid matching path. Can you determine the possible decisions at each recursion step?


Hint 2

We recursively iterate through the strings using indices i and j for s and p, respectively. If the characters match or p[j] is '.', we increment both i and j to process the remaining strings. When the next character of string p is '*', we have two choices: skip it (treating it as zero occurrences) or match one or more characters (if s[i] matches p[j]), incrementing i accordingly.


Hint 3

If both indices go out of bounds, we return true; otherwise, we return false. If any recursive path returns true, we immediately return true. This approach is exponential. Can you think of a way to optimize it?


Hint 4

We can use memoization to cache the results of recursive calls and avoid redundant calculations. A hash map or a 2D array can be used to store these results.


Company Tags

Please upgrade to NeetCode Pro to view company tags.



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Recursion - Breaking down the matching problem into smaller subproblems with base cases
  • Dynamic Programming (Memoization) - Caching results to avoid redundant computations in overlapping subproblems
  • Dynamic Programming (Tabulation) - Building solutions bottom-up using a 2D table

1. Recursion

Intuition

We need to check if the string s matches the pattern p, where:

  • . matches any single character
  • * means "zero or more of the previous character"

A recursive approach works because at each step we decide how to match the current part of the pattern with the current part of the string.

The function dfs(i, j) represents:
"Can s[i:] be matched by p[j:]?"

There are two main cases:

  1. The next pattern character is NOT *
    • then the current characters must match (directly or via .), and we move both pointers forward.
  2. The next pattern character IS *
    • then we have two choices:
      • skip the x* part entirely (use * as zero occurrences)
      • use the x* part to match one character from the string (if it matches), and stay on the same pattern index to potentially match more

Algorithm

  1. Let m = len(s) and n = len(p).
  2. Define a recursive function dfs(i, j):
    • i is the current index in s
    • j is the current index in p
  3. If j reaches the end of the pattern:
    • Return true only if i also reached the end of the string
  4. Compute whether the current characters match:
    • match is true if i is in bounds and (s[i] == p[j] or p[j] == '.')
  5. If the next character in the pattern is * (i.e., p[j + 1] == '*'):
    • Option 1: Skip p[j] and *dfs(i, j + 2)
    • Option 2: If match is true, consume one character from s and keep pattern at jdfs(i + 1, j)
    • Return true if either option works
  6. Otherwise (no * case):
    • If match is true, move both pointers forward → dfs(i + 1, j + 1)
    • If not, return false
  7. Start with dfs(0, 0) and return the result
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        m, n = len(s), len(p)

        def dfs(i, j):
            if j == n:
                return i == m

            match = i < m and (s[i] == p[j] or p[j] == ".")
            if (j + 1) < n and p[j + 1] == "*":
                return (dfs(i, j + 2) or          # don't use *
                       (match and dfs(i + 1, j))) # use *
            if match:
                return dfs(i + 1, j + 1)
            return False

        return dfs(0, 0)

Time & Space Complexity

  • Time complexity: O(2m+n)O(2 ^ {m + n})
  • Space complexity: O(m+n)O(m + n)

Where mm is the length of the string ss and nn is the length of the string pp.


2. Dynamic Programming (Top-Down)

Intuition

We need to check if s matches pattern p, where:

  • . matches any single character
  • * means "zero or more of the previous character"

A recursive solution explores all matching possibilities, especially because * can represent many choices.
However, the same (i, j) states get recomputed multiple times, making plain recursion slow.

So we use top-down dynamic programming (memoization).

We define dfs(i, j) as:
"Can the substring s[i:] match the pattern p[j:]?"

Whenever we compute the answer for a state (i, j), we store it in a cache so future calls can reuse it instantly.

Algorithm

  1. Let m = len(s) and n = len(p).
  2. Create a cache (map) to store results for states (i, j).
  3. Define a recursive function dfs(i, j):
    • i is the current index in s
    • j is the current index in p
  4. If j reaches the end of the pattern:
    • Return true only if i also reaches the end of the string
  5. If (i, j) is already in the cache:
    • Return the cached result
  6. Check if the current characters match:
    • match is true if i < m and (s[i] == p[j] or p[j] == '.')
  7. If the next character in the pattern is * (p[j + 1] == '*'):
    • Option 1: Skip the x* part completely → dfs(i, j + 2)
    • Option 2: If match is true, consume one char from s and stay on jdfs(i + 1, j)
    • Store and return whether either option works
  8. Otherwise (no * case):
    • If match is true, move both pointers forward → dfs(i + 1, j + 1)
    • Otherwise, return false
  9. Store the computed result in the cache before returning it
  10. Start with dfs(0, 0) and return the final answer
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        m, n = len(s), len(p)
        cache = {}

        def dfs(i, j):
            if j == n:
                return i == m
            if (i, j) in cache:
                return cache[(i, j)]

            match = i < m and (s[i] == p[j] or p[j] == ".")
            if (j + 1) < n and p[j + 1] == "*":
                cache[(i, j)] = (dfs(i, j + 2) or
                                (match and dfs(i + 1, j)))
                return cache[(i, j)]

            if match:
                cache[(i, j)] = dfs(i + 1, j + 1)
                return cache[(i, j)]

            cache[(i, j)] = False
            return False

        return dfs(0, 0)

Time & Space Complexity

  • Time complexity: O(mn)O(m * n)
  • Space complexity: O(mn)O(m * n)

Where mm is the length of the string ss and nn is the length of the string pp.


3. Dynamic Programming (Bottom-Up)

Intuition

We want to check whether the string s matches the pattern p, where:

  • . matches any single character
  • * means "zero or more of the previous character"

Instead of recursion, we can solve this using bottom-up dynamic programming by building answers for smaller suffixes of the strings.

We define a DP state that answers:
"Can the substring s[i:] be matched by the pattern p[j:]?"

By filling a table from the end of both strings toward the beginning, we ensure that when we compute a state, all the states it depends on are already known.

Algorithm

  1. Create a 2D boolean DP table dp of size
    (len(s) + 1) × (len(p) + 1).
  2. Let dp[i][j] represent whether s[i:] matches p[j:].
  3. Initialize the base case:
    • dp[len(s)][len(p)] = true because two empty strings match
  4. Fill the table from bottom to top and right to left:
  5. For each position (i, j):
    • First check if the current characters match:
      • match is true if i < len(s) and (s[i] == p[j] or p[j] == '.')
  6. If the next character in the pattern is *:
    • Option 1: Treat x* as zero occurrences → dp[i][j] = dp[i][j + 2]
    • Option 2: If match is true, consume one character from s and stay on the same pattern → dp[i + 1][j]
    • Set dp[i][j] to true if either option works
  7. If the next character is not *:
    • If match is true, move both pointers forward:
      • dp[i][j] = dp[i + 1][j + 1]
  8. After filling the table, the final answer is stored in dp[0][0]
  9. Return dp[0][0]
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        dp = [[False] * (len(p) + 1) for i in range(len(s) + 1)]
        dp[len(s)][len(p)] = True

        for i in range(len(s), -1, -1):
            for j in range(len(p) - 1, -1, -1):
                match = i < len(s) and (s[i] == p[j] or p[j] == ".")

                if (j + 1) < len(p) and p[j + 1] == "*":
                    dp[i][j] = dp[i][j + 2]
                    if match:
                        dp[i][j] = dp[i + 1][j] or dp[i][j]
                elif match:
                    dp[i][j] = dp[i + 1][j + 1]

        return dp[0][0]

Time & Space Complexity

  • Time complexity: O(mn)O(m * n)
  • Space complexity: O(mn)O(m * n)

Where mm is the length of the string ss and nn is the length of the string pp.


4. Dynamic Programming (Space Optimized)

Intuition

We need to determine if string s matches pattern p, where:

  • . matches any single character
  • * means "zero or more of the previous character"

In the bottom-up DP solution, we used a 2D table dp[i][j] meaning:
"Does s[i:] match p[j:]?"

But each row i only depends on:

  • the next row (i + 1)
  • values within the current row while moving across j

So we don't need the full 2D table. We can compress it into:

  • dp → represents the row for i + 1
  • nextDp → represents the row for i

This keeps the same logic while using much less memory.

Algorithm

  1. Create a 1D boolean array dp of size len(p) + 1:
    • it represents results for matching s[i+1:] with p[j:]
  2. Initialize the base case:
    • dp[len(p)] = true (empty pattern matches empty string at the very end)
  3. Iterate i from len(s) down to 0:
    • Create a new array nextDp for the current row (i)
    • Set nextDp[len(p)] = (i == len(s))
      • empty pattern only matches if we are also at the end of the string
  4. For each i, iterate j from len(p) - 1 down to 0:
    • Compute match:
      • true if i < len(s) and (s[i] == p[j] or p[j] == '.')
    • If the next pattern character is *:
      • Option 1: skip x* (zero occurrences) → nextDp[j + 2]
      • Option 2: if match, consume one char from s and stay on jdp[j]
      • Combine both options to set nextDp[j]
    • Otherwise (no *):
      • if match, move both forward → nextDp[j] = dp[j + 1]
  5. After finishing the row, set dp = nextDp
  6. The final answer is dp[0] (matching s[0:] with p[0:])
  7. Return dp[0]
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        dp = [False] * (len(p) + 1)
        dp[len(p)] = True

        for i in range(len(s), -1, -1):
            nextDp = [False] * (len(p) + 1)
            nextDp[len(p)] = (i == len(s))

            for j in range(len(p) - 1, -1, -1):
                match = i < len(s) and (s[i] == p[j] or p[j] == ".")

                if (j + 1) < len(p) and p[j + 1] == "*":
                    nextDp[j] = nextDp[j + 2]
                    if match:
                        nextDp[j] |= dp[j]
                elif match:
                    nextDp[j] = dp[j + 1]

            dp = nextDp

        return dp[0]

Time & Space Complexity

  • Time complexity: O(mn)O(m * n)
  • Space complexity: O(n)O(n)

Where mm is the length of the string ss and nn is the length of the string pp.


5. Dynamic Programming (Optimal)

Intuition

We want to check if s matches p, where:

  • . matches any single character
  • * means "zero or more of the previous character"

The usual bottom-up DP state is:
dp[i][j] = does s[i:] match p[j:]?

A 2D table works, and a 1D array also works if we update carefully.
This "optimal" version goes one step further: it updates the 1D array in-place without creating a second array.

The main challenge with in-place updates is that some transitions need the diagonal value:

  • dp[i + 1][j + 1] (move both forward)

When we compress to 1D, that diagonal value would get overwritten while we move left through j.
To handle this, we keep one extra variable (dp1) that stores the previous diagonal value as we update the row.

So at each (i, j) we can still access:

  • dp[j] → old value for dp[i + 1][j]
  • dp[j + 2] → current row value for skipping x*
  • dp1 → old diagonal dp[i + 1][j + 1]

Algorithm

  1. Create a 1D boolean array dp of size len(p) + 1:
    • dp[j] represents whether s[i + 1:] matches p[j:] for the current outer loop
  2. Initialize the base case:
    • dp[len(p)] = true (empty pattern matches empty string at the end)
  3. Iterate i from len(s) down to 0:
    • Save the old end value in dp1 (this will act as the diagonal when j moves left)
    • Update dp[len(p)] = (i == len(s)) since empty pattern only matches if string is also empty
  4. Iterate j from len(p) - 1 down to 0:
    • Compute match:
      • true if i < len(s) and (s[i] == p[j] or p[j] == '.')
    • If the next pattern character is *:
      • Option 1: skip x* → use dp[j + 2]
      • Option 2: if match, consume one char from s and stay on j → use dp[j]
      • Combine both to get the result for dp[j]
    • Otherwise (no *):
      • If match, move both forward → result is the diagonal dp1
    • Update dp[j] with the computed result
    • Shift the diagonal tracker:
      • set dp1 to the old dp[j] value before it was overwritten
  5. After all updates, dp[0] represents whether s[0:] matches p[0:]
  6. Return dp[0]
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        dp = [False] * (len(p) + 1)
        dp[len(p)] = True

        for i in range(len(s), -1, -1):
            dp1 = dp[len(p)]
            dp[len(p)] = (i == len(s))

            for j in range(len(p) - 1, -1, -1):
                match = i < len(s) and (s[i] == p[j] or p[j] == ".")
                res = False
                if (j + 1) < len(p) and p[j + 1] == "*":
                    res = dp[j + 2]
                    if match:
                        res |= dp[j]
                elif match:
                    res = dp1
                dp[j], dp1 = res, dp[j]

        return dp[0]

Time & Space Complexity

  • Time complexity: O(mn)O(m * n)
  • Space complexity: O(n)O(n)

Where mm is the length of the string ss and nn is the length of the string pp.


Common Pitfalls

Misunderstanding How Star Works

The * character means "zero or more of the preceding element," not "zero or more of any character." A common mistake is treating a* as matching any sequence of characters. In reality, a* only matches zero or more 'a' characters. Similarly, .* matches zero or more of any single character (because . matches any character).

Forgetting That Star Can Match Zero Occurrences

When encountering x* in the pattern, many solutions only consider the case where x matches at least one character. However, x* can also match zero characters, meaning the entire x* portion can be skipped. Both branches must be explored: skip x* entirely or consume a matching character while staying on the same pattern position.

Off-by-One Errors When Checking for Star

The pattern must be checked carefully to see if the next character is *. A common bug is checking p[j] instead of p[j+1] for the star, or failing to verify that j+1 is within bounds before accessing it. This leads to index out of bounds errors or incorrect pattern matching logic.

Not Handling Empty String or Pattern Edge Cases

The base cases require careful handling. When the pattern is exhausted (j == n), the match is valid only if the string is also exhausted (i == m). However, when the string is exhausted but the pattern is not, matching can still succeed if the remaining pattern consists entirely of x* pairs. Failing to account for patterns like a*b*c* matching an empty string is a common oversight.

Incorrect DP State Transitions

In the bottom-up DP approach, the iteration order matters. Processing from the end of both strings toward the beginning ensures that required subproblem solutions are available. A common mistake is iterating in the wrong direction or incorrectly referencing dp[i+1][j+1] when it has not been computed yet.