Prerequisites

Before attempting this problem, you should be comfortable with:

  • Backtracking - Used to explore all possible ways to partition elements into two groups
  • Memoization / Dynamic Programming - Required to optimize the recursive solution by caching subproblem results
  • Subset Sum Problem - This problem is a variant where we find subsets with specific sum constraints
  • Mathematical reasoning with averages - Understanding that equal averages means sum(A) * len(B) == sum(B) * len(A) to avoid floating-point issues

1. Backtracking

Intuition

We need to check if we can partition the array into two non-empty groups with equal averages. The naive approach is to try all possible ways to assign each element to group A or group B. For each complete assignment, we verify if both groups are non-empty and have the same average. Two groups have equal averages when sum(A) * len(B) == sum(B) * len(A).

Algorithm

  1. Define backtrack(i, A, B) to assign element at index i to either group A or B.
  2. Base case: when i == n, check if both groups are non-empty and have equal averages.
  3. For each element, first try adding it to A and recurse. If that fails, remove it from A, add it to B, and recurse.
  4. Return true if any assignment works, false otherwise.
class Solution:
    def splitArraySameAverage(self, nums: List[int]) -> bool:
        def backtrack(i, A, B):
            if i == len(nums):
                if not A or not B:
                    return False
                return sum(A) * len(B) == sum(B) * len(A)

            A.append(nums[i])
            if backtrack(i + 1, A, B):
                return True
            B.append(nums[i])
            A.pop()
            res = backtrack(i + 1, A, B)
            B.pop()
            return res

        return backtrack(0, [], [])

Time & Space Complexity

  • Time complexity: O(n2n)O(n * 2 ^ n)
  • Space complexity: O(n)O(n)

2. Memoization (Brute Force)

Intuition

The backtracking solution recomputes many states. We can track the state as (index, size of A, sum of A) since B is determined by what remains. If A has size s and sum curSum, then B has size n - s and sum total - curSum. We check the equal average condition at each step and memoize to avoid redundant work.

Algorithm

  1. Calculate the total sum of the array.
  2. Define dfs(i, size, currSum) where size and currSum refer to the current subset.
  3. If size > 0 and size < n, check if currSum * (n - size) == (total - currSum) * size.
  4. At each index, try including or excluding the element in the current subset.
  5. Memoize results keyed by (i, size, currSum).
class Solution:
    def splitArraySameAverage(self, nums: List[int]) -> bool:
        total = sum(nums)
        n = len(nums)
        memo = {}

        def dfs(i, size, curr_sum):
            if (i, size, curr_sum) in memo:
                return memo[(i, size, curr_sum)]

            if size > 0 and size < n and curr_sum * (n - size) == (total - curr_sum) * size:
                return True

            if i == n:
                return False

            # include nums[i] in A
            if dfs(i + 1, size + 1, curr_sum + nums[i]):
                memo[(i, size, curr_sum)] = True
                return True

            # include nums[i] in B
            if dfs(i + 1, size, curr_sum):
                memo[(i, size, curr_sum)] = True
                return True

            memo[(i, size, curr_sum)] = False
            return False

        return dfs(0, 0, 0)

Time & Space Complexity

  • Time complexity: O(n2s)O(n ^ 2 * s)
  • Space complexity: O(n2s)O(n ^ 2 * s)

Where nn is the size of the input array numsnums, and ss is the sum of the elements of the array.


3. Memoization (Optimal)

Intuition

For two groups to have the same average as the whole array, we need: sum(A) / len(A) = total / n. This means sum(A) = len(A) * total / n. We only need to find a subset A of size a (where 1 <= a <= n/2) with sum exactly a * total / n. This sum must be an integer, so we only check sizes where a * total is divisible by n.

Algorithm

  1. For each valid subset size a from 1 to n/2 where a * total % n == 0:
    • Calculate target sum = a * total / n.
    • Use memoized DFS to check if any subset of size a has this exact sum.
  2. The DFS tries including or excluding each element, tracking remaining size and sum needed.
  3. Return true if any valid subset is found.
class Solution:
    def splitArraySameAverage(self, nums: List[int]) -> bool:
        n, total = len(nums), sum(nums)
        # len(A) = a, len(B) = b, let a <= b
        # avg(A) = avg(B)
        # sum(A) / a = sum(B) / b = sum(nums) / n
        # sum(A) / a = avg => sum(A) = a * avg
        # sum(A) = a * sum(nums) / n
        # Find if any subset exists with a * sum(nums) / n
        # a is in the range [1, (n//2)]

        memo = {}
        def dfs(i, a, s):
            if (i, a, s) in memo:
                return memo[(i, a, s)]
            if a == 0:
                return s == 0
            if i == n or a < 0:
                return False
            memo[(i, a, s)] = dfs(i + 1, a, s) or dfs(i + 1, a - 1, s - nums[i])
            return memo[(i, a, s)]

        for a in range(1, n // 2 + 1):
            if total * a % n == 0:
                if dfs(0, a, total * a // n):
                    return True

        return False

Time & Space Complexity

  • Time complexity: O(n2s)O(n ^ 2 * s)
  • Space complexity: O(n2s)O(n ^ 2 * s)

Where nn is the size of the input array numsnums, and ss is the sum of the elements of the array.


4. Dynamic Programming (Bottom-Up)

Intuition

We can build all achievable sums for each subset size using dynamic programming. For each element, we update what sums are achievable for each size. We iterate backwards through sizes to avoid using the same element twice. Finally, we check if any valid target sum exists for sizes 1 through n/2.

Algorithm

  1. Create dp[a] as a set of achievable sums for subsets of size a. Initialize dp[0] = {0}.
  2. For each number in the array:
    • For sizes from n/2 down to 1:
      • For each previously achievable sum in dp[a-1], add sum + num to dp[a].
  3. After processing all numbers, check each size a from 1 to n/2:
    • If a * total % n == 0 and the target sum a * total / n exists in dp[a], return true.
  4. Return false if no valid partition exists.
class Solution:
    def splitArraySameAverage(self, nums: List[int]) -> bool:
        n = len(nums)
        # len(A) = a, len(B) = b, let a <= b
        # avg(A) = avg(B)
        # sum(A) / a = sum(B) / b = sum(nums) / n
        # sum(A) / a = avg => sum(A) = a * avg
        # sum(A) = a * sum(nums) / n
        # Find if any subset exists with a * sum(nums) / n
        # a is in the range [1, (n//2)]

        total = sum(nums)
        dp = [set() for _ in range(n // 2 + 1)]

        dp[0].add(0)
        for num in nums:
            for a in range(n // 2, 0, -1):
                for prev in dp[a - 1]:
                    dp[a].add(prev + num)

        for a in range(1, n // 2 + 1):
            if (a * total % n == 0) and (a * total // n) in dp[a]:
                return True

        return False

Time & Space Complexity

  • Time complexity: O(n2s)O(n ^ 2 * s)
  • Space complexity: O(n2s)O(n ^ 2 * s)

Where nn is the size of the input array numsnums, and ss is the sum of the elements of the array.


Common Pitfalls

Forgetting Both Groups Must Be Non-Empty

A common mistake is checking only whether the averages match without ensuring both groups contain at least one element. If all elements go into group A and group B is empty, that is not a valid split. Always verify that both partitions have size greater than zero before comparing averages.

Using Floating-Point Comparison for Averages

Comparing averages directly using floating-point division can lead to precision errors. Instead of checking sum(A) / len(A) == sum(B) / len(B), use the mathematically equivalent integer comparison sum(A) * len(B) == sum(B) * len(A) to avoid floating-point inaccuracies.

Missing the Early Termination Condition

The condition a * total % n == 0 is crucial for pruning. For a valid subset of size a, the required sum a * total / n must be an integer. Skipping this check leads to unnecessary computation for impossible subset sizes and can cause the solution to time out.

Integer Overflow in Sum Calculations

When computing sums or products like a * total, the result can exceed 32-bit integer limits for large arrays with large values. Use 64-bit integers (long/long long) for intermediate calculations to prevent overflow and incorrect results.

Iterating Over All Subset Sizes Instead of Half

Since finding a valid subset A automatically determines a valid subset B (the remaining elements), you only need to check subset sizes from 1 to n/2. Checking sizes beyond n/2 is redundant because if a subset of size a works, so does the complement of size n - a.