Before attempting this problem, you should be comfortable with:
To pick an index with probability proportional to its weight, imagine laying all weights on a number line. A weight of 3 takes up 3 units, a weight of 1 takes up 1 unit, and so on. We pick a random point on this line, then find which weight's segment contains that point. The larger the weight, the more likely its segment gets hit.
pickIndex():0 and the total sum.O(n) per pick.The linear search can be optimized using binary search. By precomputing a prefix sum array, each index i represents the cumulative weight up to that point. When we generate a random target, we binary search for the smallest index whose prefix sum exceeds the target. This maps directly to the weight segment containing our random point.
prefix[i] is the sum of weights from index 0 to i - 1.pickIndex():0 and the total sum (last element of prefix array).1 (adjusting for the prefix array offset).O(log n) per pick.class Solution:
def __init__(self, w: List[int]):
self.prefix = [0]
for wgt in w:
self.prefix.append(self.prefix[-1] + wgt)
def pickIndex(self) -> int:
target = self.prefix[-1] * random.random()
l, r = 1, len(self.prefix)
while l < r:
mid = (l + r) >> 1
if self.prefix[mid] <= target:
l = mid + 1
else:
r = mid
return l - 1The prefix sum array has one extra element compared to the weights array when using a leading zero. When binary searching, forgetting to adjust indices correctly leads to returning the wrong index or accessing out-of-bounds memory. Always verify that your search range and return value account for the offset between prefix sum indices and original weight indices.
Generating a random integer in [0, total) instead of a floating-point value in [0, total) can cause subtle probability errors. For example, with weights [1, 1], using integer random would give exactly 50-50 distribution, but the edge case handling differs. Using random() * total with proper floating-point comparison ensures correct probability distribution matching the weight ratios.
Using < instead of <= (or vice versa) when comparing the prefix sum to the target changes which index gets selected. The correct approach finds the first index where the prefix sum strictly exceeds the random target. Using >= instead of > shifts all probabilities by one index, causing the first element to be selected too often and the last element too rarely.