Prerequisites

Before attempting this problem, you should be comfortable with:

  • Prefix Sums - Used to build cumulative weight ranges for efficient probability mapping
  • Binary Search - Required for the optimized O(log n) solution to find the target weight segment
  • Probability Concepts - Understanding weighted random selection where larger weights have proportionally higher selection chances

Intuition

To pick an index with probability proportional to its weight, imagine laying all weights on a number line. A weight of 3 takes up 3 units, a weight of 1 takes up 1 unit, and so on. We pick a random point on this line, then find which weight's segment contains that point. The larger the weight, the more likely its segment gets hit.

Algorithm

  1. In the constructor, store the weights and compute their total sum.
  2. For pickIndex():
    • Generate a random number between 0 and the total sum.
    • Iterate through weights, accumulating a running sum.
    • Return the first index where the running sum exceeds the random target.
  3. This linear scan takes O(n) per pick.
class Solution:

    def __init__(self, w: List[int]):
        self.w = w
        self.total = sum(w)

    def pickIndex(self) -> int:
        target = self.total * random.random()
        curSum = 0
        for i in range(len(self.w)):
            curSum += self.w[i]
            if curSum > target:
                return i

Time & Space Complexity

  • Time complexity: O(n)O(n) for initializing and O(n)O(n) for each pickIndex()pickIndex() function call.
  • Space complexity: O(n)O(n)

Intuition

The linear search can be optimized using binary search. By precomputing a prefix sum array, each index i represents the cumulative weight up to that point. When we generate a random target, we binary search for the smallest index whose prefix sum exceeds the target. This maps directly to the weight segment containing our random point.

Algorithm

  1. In the constructor, build a prefix sum array where prefix[i] is the sum of weights from index 0 to i - 1.
  2. For pickIndex():
    • Generate a random number between 0 and the total sum (last element of prefix array).
    • Binary search for the leftmost index where the prefix sum exceeds the target.
    • Return that index minus 1 (adjusting for the prefix array offset).
  3. This achieves O(log n) per pick.
class Solution:

    def __init__(self, w: List[int]):
        self.prefix = [0]
        for wgt in w:
            self.prefix.append(self.prefix[-1] + wgt)

    def pickIndex(self) -> int:
        target = self.prefix[-1] * random.random()
        l, r = 1, len(self.prefix)
        while l < r:
            mid = (l + r) >> 1
            if self.prefix[mid] <= target:
                l = mid + 1
            else:
                r = mid
        return l - 1

Time & Space Complexity

  • Time complexity: O(n)O(n) for initializing and O(logn)O(\log n) for each pickIndex()pickIndex() function call.
  • Space complexity: O(n)O(n)

Common Pitfalls

The prefix sum array has one extra element compared to the weights array when using a leading zero. When binary searching, forgetting to adjust indices correctly leads to returning the wrong index or accessing out-of-bounds memory. Always verify that your search range and return value account for the offset between prefix sum indices and original weight indices.

Using Integer Random Instead of Floating Point

Generating a random integer in [0, total) instead of a floating-point value in [0, total) can cause subtle probability errors. For example, with weights [1, 1], using integer random would give exactly 50-50 distribution, but the edge case handling differs. Using random() * total with proper floating-point comparison ensures correct probability distribution matching the weight ratios.

Using < instead of <= (or vice versa) when comparing the prefix sum to the target changes which index gets selected. The correct approach finds the first index where the prefix sum strictly exceeds the random target. Using >= instead of > shifts all probabilities by one index, causing the first element to be selected too often and the last element too rarely.