1898. Maximum Number of Removable Characters - Explanation

Problem Link



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Binary Search - Efficiently searching a sorted or monotonic search space by halving the range
  • Subsequence Checking (Two Pointers) - Verifying if one string is a subsequence of another using two pointers
  • Hash Set - Tracking removed indices for O(1) lookup during subsequence validation

1. Brute Force

Intuition

We want to find the maximum number of characters we can remove from s (in the order given by removable) while keeping p as a subsequence of s.

The simplest approach is to remove characters one at a time. After each removal, check if p is still a subsequence. Once p is no longer a subsequence, stop and return the count of successful removals.

Algorithm

  1. Maintain a set marked of indices that have been removed.
  2. Iterate through removable:
    • Add the current index to marked.
    • Check if p is a subsequence of s (skipping marked indices).
    • If p is still a subsequence, increment the result counter.
    • If not, break out of the loop.
  3. Return the count of successful removals.
class Solution:
    def maximumRemovals(self, s: str, p: str, removable: List[int]) -> int:
        n, m = len(s), len(p)
        marked = set()
        res = 0

        for removeIdx in removable:
            marked.add(removeIdx)

            sIdx = pIdx = 0
            while pIdx < m and sIdx < n:
                if sIdx not in marked and s[sIdx] == p[pIdx]:
                    pIdx += 1
                sIdx += 1
            if pIdx != m:
                break
            res += 1

        return res

Time & Space Complexity

  • Time complexity: O(k(n+m))O(k * (n + m))
  • Space complexity: O(k)O(k)

Where nn and mm are the lengths of the given strings ss and pp respectively. kk is the size of the array removableremovable.


2. Binary Search + Hash Set

Intuition

If removing the first k characters still allows p to be a subsequence, then removing fewer than k characters will also work. Conversely, if removing k characters breaks the subsequence property, removing more will certainly break it too.

This monotonic property makes binary search applicable. We search for the largest k such that after removing removable[0..k], p remains a subsequence.

Algorithm

  1. Binary search over the number of removals k (range: 0 to length of removable - 1).
  2. For each midpoint m:
    • Create a set of removed indices from removable[0..m].
    • Check if p is a subsequence of s (skipping removed indices).
    • If yes, search the right half (try more removals).
    • If no, search the left half (try fewer removals).
  3. Track the maximum valid k found.
  4. Return the result.
class Solution:
    def maximumRemovals(self, s: str, p: str, removable: List[int]) -> int:
        def isSubseq(s, subseq, removed):
            i1 = i2 = 0
            while i1 < len(s) and i2 < len(subseq):
                if i1 in removed or s[i1] != subseq[i2]:
                    i1 += 1
                    continue
                i1 += 1
                i2 += 1
            return i2 == len(subseq)

        res = 0
        l, r = 0, len(removable) - 1
        while l <= r:
            m = (l + r) // 2
            removed = set(removable[:m + 1])
            if isSubseq(s, p, removed):
                res = max(res, m + 1)
                l = m + 1
            else:
                r = m - 1

        return res

Time & Space Complexity

  • Time complexity: O((n+m)logk)O((n + m) * \log k)
  • Space complexity: O(k)O(k)

Where nn and mm are the lengths of the given strings ss and pp respectively. kk is the size of the array removableremovable.


Intuition

Instead of using a hash set to track removed indices (which adds overhead), we can directly modify the string by replacing removed characters with a placeholder character (like '#'). This avoids hash lookups during the subsequence check.

The binary search logic remains the same, but the subsequence check becomes simpler since we just compare characters, skipping any '#' naturally by checking for equality.

Algorithm

  1. Binary search over the number of removals with l = 0 and r = length of removable.
  2. For each midpoint mid:
    • Create a copy of s as a character array.
    • Mark positions removable[0..mid] as '#'.
    • Check if p is a subsequence (characters equal to '#' will never match).
    • If yes, move l = mid + 1.
    • If no, move r = mid.
  3. Return l as the maximum number of removals.
class Solution:
    def maximumRemovals(self, s: str, p: str, removable: List[int]) -> int:
        l, r = 0, len(removable)
        n, m = len(s), len(p)

        def isSubseq(tmpS):
            i1 = i2 = 0
            while i1 < n and i2 < m:
                if tmpS[i1] == p[i2]:
                    i2 += 1
                i1 += 1
            return i2 == m

        while l < r:
            mid = l + (r - l) // 2
            tmpS = list(s)

            for i in range(mid + 1):
                tmpS[removable[i]] = '#'

            if isSubseq(tmpS):
                l = mid + 1
            else:
                r = mid

        return l

Time & Space Complexity

  • Time complexity: O((n+m)logk)O((n + m) * \log k)
  • Space complexity: O(n)O(n)

Where nn and mm are the lengths of the given strings ss and pp respectively. kk is the size of the array removableremovable.


Common Pitfalls

Off-by-One Errors in Binary Search Bounds

When setting up binary search, a common mistake is using incorrect bounds or midpoint calculations. The search space is 0 to len(removable), not 0 to len(removable) - 1, because we need to consider removing zero characters as a valid option. Additionally, confusing l < r vs l <= r loop conditions leads to either infinite loops or missing valid answers.

Modifying the Original String Instead of Using a Copy

When marking characters as removed (using '#' or similar), forgetting to create a fresh copy of the string for each binary search iteration causes state to persist incorrectly. The removed characters from a previous iteration pollute subsequent checks, leading to wrong results.

Incorrect Subsequence Check Logic

The subsequence check must skip removed indices while still correctly advancing through both strings. A common bug is incrementing the pattern pointer even when the current character is marked as removed, or failing to increment the source pointer when characters do not match. Both lead to incorrect subsequence determinations.