287. Find The Duplicate Number - Explanation

Description

You are given an array of integers nums containing n + 1 integers. Each integer in nums is in the range [1, n] inclusive.

Every integer appears exactly once, except for one integer which appears two or more times. Return the integer that appears more than once.

Example 1:

Input: nums = [1,2,3,2,2]

Output: 2

Example 2:

Input: nums = [1,2,3,4,4]

Output: 4

Follow-up: Can you solve the problem without modifying the array nums and using $O(1)$ extra space?

Constraints:

1 <= n <= 10000
nums.length == n + 1
1 <= nums[i] <= n

Recommended Time & Space Complexity

You should aim for a solution with O(n) time and O(1) space, where n is the size of the input array.

Hint 1

A naive approach would be to use a hash set, which takes O(1) time to detect duplicates. Although this solution is acceptable, it requires O(n) extra space. Can you think of a better solution that avoids using extra space? Consider that the elements in the given array nums are within the range 1 to len(nums).

Hint 2

We can use the given input array itself as a hash set without creating a new one. This can be achieved by marking the positions (0-indexed) corresponding to the elements that have already been encountered. Can you implement this?

Hint 3

We iterate through the array using index i. For each element, we use its absolute value to find the corresponding index and mark that position as negative: nums[abs(nums[i]) - 1] *= -1. Taking absolute value ensures we work with the original value even if it’s already negated. How can you detect duplicates?

Hint 4

For example, in the array [2, 1, 2, 3], where 2 is repeated, we mark the index corresponding to each element as negative. If we encounter a number whose corresponding position is already negative, it means the number is a duplicate, and we return it.

Company Tags

Please upgrade to NeetCode Pro to view company tags.

1. Sorting

Intuition

If we sort the array, any duplicate numbers will appear next to each other.
So after sorting, we just scan once and check if any two consecutive elements are equal.
The first equal pair we find is the duplicate.

Algorithm

Sort the array.
Loop through the array from index 0 to n - 2.
For each index i, check if nums[i] == nums[i + 1].
- If yes, return that value (this is the duplicate).
If no duplicate is found (theoretically impossible for this problem), return -1.

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        nums.sort()
        for i in range(len(nums) - 1):
            if nums[i] == nums[i + 1]:
                return nums[i]
        return -1

Time & Space Complexity

Time complexity: $O(n \log n)$
Space complexity: $O(1)$ or $O(n)$ depending on the sorting algorithm.

2. Hash Set

Intuition

We can detect duplicates by remembering which numbers we have already seen.
As we scan the array, each new number is checked:

If it's not in the set, we add it.
If it is already in the set, that number must be the duplicate.

A set gives constant-time lookup, so this approach is simple and efficient.

Algorithm

Create an empty hash set seen.
Loop through each number in the array:
- If the number is already in seen, return it (this is the duplicate).
- Otherwise, insert the number into seen.
If no duplicate is found (should not happen in this problem), return -1.

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        seen = set()
        for num in nums:
            if num in seen:
                return num
            seen.add(num)
        return -1

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(n)$

3. Array

Intuition

Since the values in the array are from 1 to n, we can use an array to track whether we’ve seen a number before.
Each number directly maps to an index (num - 1).

If that index is already marked, we’ve seen the number before → it's the duplicate.
Otherwise, we mark it as seen.

This avoids using a hash set while still providing fast lookups.

Algorithm

Create an array seen of size n filled with zeros.
For each number num in the input array:
- Check if seen[num - 1] is already set to 1.
  - If yes → return num (duplicate found).
- Otherwise, set seen[num - 1] = 1.
Return -1 if no duplicate is found (though the problem guarantees one).

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        seen = [0] * len(nums)
        for num in nums:
            if seen[num - 1]:
                return num
            seen[num - 1] = 1
        return -1

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(n)$

4. Negative Marking

Intuition

Since every value is between 1 and n, each number corresponds to an index in the array (num - 1).
We can use the array itself as a marking tool:

When we see a number, we go to its corresponding index and flip the sign of the value there.
If we ever visit an index that is already negative, it means we've visited this number before → it's the duplicate.

This method avoids extra memory and uses the input array as a tracking structure.

Algorithm

Loop through every number num in the array.
Compute its corresponding index: idx = abs(num) - 1.
If nums[idx] is already negative:
- A duplicate has been found → return abs(num).
Otherwise:
- Mark the index by multiplying nums[idx] by -1.
If no duplicate is found (though guaranteed), return -1.

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        for num in nums :
            idx = abs(num) - 1
            if nums[idx] < 0 :
                return abs(num)
            nums[idx] *= -1
        return -1

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(1)$

5. Binary Search

Intuition

This method uses binary search on the value range, not on the array itself.

If all numbers from 1 to mid appeared at most once, then the count of numbers ≤ mid should be ≤ mid.
But if the count is greater than mid, it means the duplicate must be in the range [1, mid], because too many numbers fall into that range.

So we repeatedly:

Count how many values are ≤ mid.
Shrink the search space based on whether this count is “too large.”

Eventually, low == high, and that value is the duplicate.

Algorithm

Let low = 1 and high = n - 1 (value range).
While low < high:
- Compute mid = (low + high) // 2.
- Count how many numbers in the array are ≤ mid.
- If the count is greater than mid, the duplicate lies in [low, mid], so set high = mid.
- Otherwise, it lies in [mid + 1, high], so set low = mid + 1.
When the loop ends, low is the duplicate.
Return low.

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        n = len(nums)
        low, high = 1, n - 1
        while low < high:
            mid = low + (high - low) // 2
            lessOrEqual = sum(1 for num in nums if num <= mid)

            if lessOrEqual <= mid:
                low = mid + 1
            else:
                high = mid

        return low

Time & Space Complexity

Time complexity: $O(n \log n)$
Space complexity: $O(1)$

6. Bit Manipulation

Intuition

Every number from 1 to n−1 should appear exactly once, but in the array, one number appears twice.
So for each bit position, we compare:

How many times this bit is set among all numbers in the array.
How many times this bit should be set among the numbers 1 to n-1.

If a bit appears more times in the array than expected, that bit must belong to the duplicate number.

By combining all such bits, we reconstruct the duplicate.

Algorithm

Let res = 0 to build the duplicate number bit by bit.
For each bit position b from 0 to 31:
- Compute mask = 1 << b.
- Count how many numbers in nums have this bit set → call it x.
- Count how many numbers from 1 to n−1 should have this bit set → call it y.
If x > y, it means the duplicate has this bit set, so add it to the answer:
- res |= mask.
After processing all bits, return res.

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        n = len(nums)
        res = 0
        for b in range(32):
            x = y = 0
            mask = 1 << b
            for num in nums:
                if num & mask:
                    x += 1

            for num in range(1, n):
                if num & mask:
                    y += 1

            if x > y:
                res |= mask
        return res

Time & Space Complexity

Time complexity: $O(32 * n)$
Space complexity: $O(1)$

7. Fast And Slow Pointers

Intuition

Treat the array like a linked list, where each index points to the next index given by its value.
Because one number is duplicated, two indices will point into the same chain, creating a cycle — exactly like a linked list with a loop.

Using Floyd’s Fast & Slow Pointer technique:

The slow pointer moves one step at a time.
The fast pointer moves two steps at a time.
If there’s a cycle, they will eventually meet.

Once they meet, we start a new pointer from the beginning:

Move both pointers one step at a time.
The point where they meet again is the duplicate number (the entry point of the cycle).

Algorithm

Initialize two pointers slow = 0 and fast = 0.
Move:
- slow = nums[slow]
- fast = nums[nums[fast]]
  until slow == fast.
Start a new pointer slow2 = 0.
Move both:
- slow = nums[slow]
- slow2 = nums[slow2]
  until they meet.
The meeting point is the duplicate number.
Return that number.

class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        slow, fast = 0, 0
        while True:
            slow = nums[slow]
            fast = nums[nums[fast]]
            if slow == fast:
                break

        slow2 = 0
        while True:
            slow = nums[slow]
            slow2 = nums[slow2]
            if slow == slow2:
                return slow

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(1)$