474. Ones and Zeroes - Explanation

Problem Link



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Recursion - Understanding how to break problems into smaller subproblems and define base cases
  • Dynamic Programming (0/1 Knapsack) - This problem is a variant of the classic knapsack with two constraints instead of one
  • Memoization - Caching results of subproblems to avoid redundant computations
  • Multidimensional DP - Working with 2D and 3D DP tables for problems with multiple state variables

1. Recursion

Intuition

This problem is a variant of the 0/1 knapsack problem with two constraints instead of one. For each binary string, we must decide whether to include it in our subset or not.

We can try all possible combinations by exploring two branches at each string: include it (if we have enough zeros and ones remaining) or skip it. The goal is to maximize the count of strings we can include while staying within the budget of m zeros and n ones.

Algorithm

  1. Preprocess each string to count its zeros and ones, storing in an array arr.
  2. Define a recursive function dfs(i, m, n) that returns the maximum strings we can select starting from index i with m zeros and n ones remaining.
  3. Base case: If i reaches the end of the array, return 0.
  4. At each index, we have two choices:
    • Skip the current string: dfs(i + 1, m, n).
    • Include the current string (if affordable): 1 + dfs(i + 1, m - zeros, n - ones).
  5. Return the maximum of both choices.
class Solution:
    def findMaxForm(self, strs: List[str], m: int, n: int) -> int:
        arr = [[0] * 2 for _ in range(len(strs))]
        for i, s in enumerate(strs):
            for c in s:
                arr[i][ord(c) - ord('0')] += 1

        def dfs(i, m, n):
            if i == len(strs):
                return 0

            res = dfs(i + 1, m, n)
            if m >= arr[i][0] and n >= arr[i][1]:
                res = max(res, 1 + dfs(i + 1, m - arr[i][0], n - arr[i][1]))
            return res

        return dfs(0, m, n)

Time & Space Complexity

  • Time complexity: O(2N)O(2 ^ N)
  • Space complexity: O(N)O(N) for recursion stack.

Where NN represents the number of binary strings, and mm and nn are the maximum allowable counts of zeros and ones, respectively.


2. Dynamic Programming (Top-Down)

Intuition

The recursive solution has overlapping subproblems. The same state (i, m, n) can be reached through different paths, leading to redundant computations.

We add memoization to cache results for each unique state. The state is defined by three variables: the current string index, remaining zeros budget, and remaining ones budget. Once we compute the answer for a state, we store it and return immediately on future calls.

Algorithm

  1. Preprocess each string to count its zeros and ones.
  2. Create a 3D memoization table indexed by (i, m, n).
  3. Define dfs(i, m, n) as before, but check the cache first and store results before returning.
  4. Early termination: If both m and n are 0, we cannot include any more strings.
  5. Return dfs(0, m, n).
class Solution:
    def findMaxForm(self, strs: List[str], m: int, n: int) -> int:
        arr = [[0] * 2 for _ in range(len(strs))]
        for i, s in enumerate(strs):
            for c in s:
                arr[i][ord(c) - ord('0')] += 1

        dp = {}

        def dfs(i, m, n):
            if i == len(strs):
                return 0
            if m == 0 and n == 0:
                return 0
            if (i, m, n) in dp:
                return dp[(i, m, n)]

            res = dfs(i + 1, m, n)
            if m >= arr[i][0] and n >= arr[i][1]:
                res = max(res, 1 + dfs(i + 1, m - arr[i][0], n - arr[i][1]))
            dp[(i, m, n)] = res
            return res

        return dfs(0, m, n)

Time & Space Complexity

  • Time complexity: O(mnN)O(m * n * N)
  • Space complexity: O(mnN)O(m * n * N)

Where NN represents the number of binary strings, and mm and nn are the maximum allowable counts of zeros and ones, respectively.


3. Dynamic Programming (Bottom-Up)

Intuition

We can convert the top-down approach to bottom-up by building the solution iteratively. We process strings one by one and, for each combination of remaining zeros and ones budget, compute the maximum strings achievable.

The DP table dp[i][j][k] represents the maximum strings from the first i strings using at most j zeros and k ones.

Algorithm

  1. Preprocess each string to count its zeros and ones.
  2. Create a 3D DP table of size (len(strs) + 1) x (m + 1) x (n + 1), initialized to 0.
  3. For each string i from 1 to len(strs):
    • For each zeros budget j from 0 to m:
      • For each ones budget k from 0 to n:
        • Copy the value from the previous string: dp[i][j][k] = dp[i-1][j][k].
        • If we can afford the current string (j >= zeros and k >= ones):
          • Update: dp[i][j][k] = max(dp[i][j][k], 1 + dp[i-1][j-zeros][k-ones]).
  4. Return dp[len(strs)][m][n].
class Solution:
    def findMaxForm(self, strs: List[str], m: int, n: int) -> int:
        arr = [[0] * 2 for _ in range(len(strs))]
        for i, s in enumerate(strs):
            for c in s:
                arr[i][ord(c) - ord('0')] += 1

        dp = [[[0] * (n + 1) for _ in range(m + 1)] for _ in range(len(strs) + 1)]

        for i in range(1, len(strs) + 1):
            for j in range(m + 1):
                for k in range(n + 1):
                    dp[i][j][k] = dp[i - 1][j][k]
                    if j >= arr[i - 1][0] and k >= arr[i - 1][1]:
                        dp[i][j][k] = max(dp[i][j][k], 1 + dp[i - 1][j - arr[i - 1][0]][k - arr[i - 1][1]])

        return dp[len(strs)][m][n]

Time & Space Complexity

  • Time complexity: O(mnN)O(m * n * N)
  • Space complexity: O(mnN)O(m * n * N)

Where NN represents the number of binary strings, and mm and nn are the maximum allowable counts of zeros and ones, respectively.


4. Dynamic Programming (Space Optimized)

Intuition

Notice that when computing dp[i], we only need values from dp[i-1]. This means we can reduce the 3D table to a 2D table.

The key trick is to iterate the budgets in reverse order. When we update dp[j][k], we need the old values of dp[j-zeros][k-ones]. By iterating backward, we ensure these values have not been overwritten yet in the current iteration.

Algorithm

  1. Preprocess each string to count its zeros and ones.
  2. Create a 2D DP table of size (m + 1) x (n + 1), initialized to 0.
  3. For each string with zeros zeros and ones ones:
    • For j from m down to zeros:
      • For k from n down to ones:
        • Update: dp[j][k] = max(dp[j][k], 1 + dp[j-zeros][k-ones]).
  4. Return dp[m][n].
class Solution:
    def findMaxForm(self, strs: List[str], m: int, n: int) -> int:
        arr = [[0, 0] for _ in range(len(strs))]
        for i, s in enumerate(strs):
            for c in s:
                arr[i][ord(c) - ord('0')] += 1

        dp = [[0] * (n + 1) for _ in range(m + 1)]

        for zeros, ones in arr:
            for j in range(m, zeros - 1, -1):
                for k in range(n, ones - 1, -1):
                    dp[j][k] = max(dp[j][k], 1 + dp[j - zeros][k - ones])

        return dp[m][n]

Time & Space Complexity

  • Time complexity: O(mnN)O(m * n * N)
  • Space complexity: O(mn+N)O(m * n + N)

Where NN represents the number of binary strings, and mm and nn are the maximum allowable counts of zeros and ones, respectively.


Common Pitfalls

Iterating Forward Instead of Backward in Space-Optimized DP

When using the 2D space-optimized solution, iterating from small values to large values causes the same string to be counted multiple times in one iteration. You must iterate j from m down to zeros and k from n down to ones to ensure each string is used at most once per subset.

Confusing Zeros and Ones Counts

Mixing up which index stores the count of zeros versus ones leads to incorrect budget checks. Consistently use index 0 for zeros and index 1 for ones (or vice versa) throughout the solution, and ensure the comparison matches this convention.

Treating This as Unbounded Knapsack

This is a 0/1 knapsack problem where each string can be selected at most once. Solutions that allow selecting the same string multiple times will overcount and return incorrect results. Each string must be processed exactly once, either included or excluded.