703. Kth Largest Element In a Stream - Explanation

Description

Design a class to find the kth largest integer in a stream of values, including duplicates. E.g. the 2nd largest from [1, 2, 3, 3] is 3. The stream is not necessarily sorted.

Implement the following methods:

  • constructor(int k, int[] nums) Initializes the object given an integer k and the stream of integers nums.
  • int add(int val) Adds the integer val to the stream and returns the kth largest integer in the stream.

Example 1:

Input:
["KthLargest", [3, [1, 2, 3, 3]], "add", [3], "add", [5], "add", [6], "add", [7], "add", [8]]

Output:
[null, 3, 3, 3, 5, 6]

Explanation:
KthLargest kthLargest = new KthLargest(3, [1, 2, 3, 3]);
kthLargest.add(3);   // return 3
kthLargest.add(5);   // return 3
kthLargest.add(6);   // return 3
kthLargest.add(7);   // return 5
kthLargest.add(8);   // return 6

Constraints:

  • 1 <= k <= 1000
  • 0 <= nums.length <= 1000
  • -1000 <= nums[i] <= 1000
  • -1000 <= val <= 1000
  • There will always be at least k integers in the stream when you search for the kth integer.


Recommended Time & Space Complexity

You should aim for a solution with O(m log k) time and O(k) space, where m is the number of calls to add() and k is the rank being tracked (i.e., the k-th largest element).


Hint 1

A brute force solution would involve sorting the array every time a number is added using add(), and then returning the k-th largest element. This would take O(m * n log n) time, where m is the number of calls to add() and n is the total number of elements added. However, do we really need to track all the elements added, given that we only need the k-th largest element? Maybe you should think of a data structure which can maintain only the top k largest elements.


Hint 2

We can use a Min-Heap, which stores elements and keeps the smallest element at its top. When we add an element to the Min-Heap, it takes O(logk) time since we are storing k elements in it. Retrieving the top element (the smallest in the heap) takes O(1) time. How can this be useful for finding the k-th largest element?


Hint 3

The k-th largest element is the smallest element among the top k largest elements. This means we only need to maintain k elements in our Min-Heap to efficiently determine the k-th largest element. Whenever the size of the Min-Heap exceeds k, we remove the smallest element by popping from the heap. How do you implement this?
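The identity in this hint can be checked on a small fixed list before turning it into a streaming class. The following is only a sketch: the variable names are illustrative, and heapq.nlargest is used here purely to pick out the top k values, not as the streaming solution.

```python
import heapq

nums = [1, 2, 3, 3]
k = 3

# The k largest values, returned in descending order.
top_k = heapq.nlargest(k, nums)

# The smallest of the top k values is the k-th largest overall.
kth_largest = min(top_k)

# Cross-check against a full sort: the k-th largest sits at index len(nums) - k.
assert kth_largest == sorted(nums)[len(nums) - k]
print(top_k, kth_largest)
```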


Hint 4

We initialize a Min-Heap with the elements of the input array. When the add() function is called, we insert the new element into the heap. If the heap size exceeds k, we remove the smallest element (the root of the heap). Finally, the top element of the heap represents the k-th largest element and is returned.


Prerequisites

Before attempting this problem, you should be comfortable with:

  • Heap Data Structure - Understanding min-heaps and how they maintain the smallest element at the root
  • Priority Queues - Using built-in heap implementations for efficient insertion and extraction
  • Sorting Algorithms - Understanding basic sorting as a baseline approach

1. Sorting

Intuition

We want the k-th largest number in a stream of values.
The simplest approach:
Every time a new value comes in, insert it, sort the list, and then pick the element at position len(arr) - k.

Sorting keeps the numbers in increasing order, so the k-th largest element will always sit at the same index.
This method is easy to understand but slow because sorting happens every time add() is called.

Algorithm

Initialization

  • Store k.
  • Store the initial numbers in an array.

add(val)

  1. Append val to the array.
  2. Sort the array.
  3. Return the element at index len(arr) - k (the k-th largest).
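from typing import List


class KthLargest:

    def __init__(self, k: int, nums: List[int]):
        self.k = k
        self.arr = nums  # note: keeps a reference to the caller's list

    def add(self, val: int) -> int:
        self.arr.append(val)
        self.arr.sort()  # ascending order
        # The k-th largest sits k positions from the end.
        return self.arr[len(self.arr) - self.k]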
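To see why the index len(arr) - k always lands on the k-th largest value, it can help to trace Example 1 step by step. The helper below is a hypothetical tracing function (trace_sorting is not part of the problem's interface); it repeats the same append, sort, and index steps and prints the sorted array after each call.

```python
from typing import List


def trace_sorting(k: int, nums: List[int], vals: List[int]) -> List[int]:
    """Replay the sorting approach and record what each add() would return."""
    arr = list(nums)  # work on a copy so the input list is left untouched
    results = []
    for val in vals:
        arr.append(val)
        arr.sort()
        results.append(arr[len(arr) - k])
        print(arr, "->", results[-1])
    return results


# Example 1 from the description; returns [3, 3, 3, 5, 6].
trace_sorting(3, [1, 2, 3, 3], [3, 5, 6, 7, 8])
```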

Time & Space Complexity

  • Time complexity: O(m * n log n)
  • Space complexity:
    • O(m) extra space.
    • O(1) or O(n) space depending on the sorting algorithm.

Where m is the number of calls made to add() and n is the current size of the array.


2. Min-Heap

Intuition

To maintain the k-th largest element in a stream of numbers, we do not need to store all values.
Instead, we only need to keep track of the k largest elements seen so far.

A min-heap of size k is perfect for this:

  • A min-heap always keeps the smallest value at the top.
  • If the heap contains the k largest elements,
    then the smallest among them is exactly the k-th largest overall.
  • Whenever a new number arrives:
    • If we add it and the heap grows beyond k,
      we remove the smallest element - because it cannot be in the top k anymore.

This way, once at least k values have been seen, the heap holds exactly the top k elements, and retrieving the k-th largest is O(1).

Algorithm

Initialization

  1. Insert all initial numbers into a min-heap.
  2. While the heap size is greater than k, remove the smallest element.
    • After this, the heap contains at most k elements (fewer if nums has fewer than k values).

add(value)

  1. Insert the new value into the min-heap.
  2. If heap size > k:
    • Remove the smallest element (the heap root).
  3. Return the heap's smallest element (the root), which is now the k-th largest.
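import heapq
from typing import List


class KthLargest:

    def __init__(self, k: int, nums: List[int]):
        self.minHeap, self.k = nums, k
        heapq.heapify(self.minHeap)  # builds the heap in place in O(n)
        # Discard the smallest values until at most k remain.
        while len(self.minHeap) > k:
            heapq.heappop(self.minHeap)

    def add(self, val: int) -> int:
        heapq.heappush(self.minHeap, val)
        if len(self.minHeap) > self.k:
            heapq.heappop(self.minHeap)
        # The root is the smallest of the k largest values seen so far.
        return self.minHeap[0]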
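Tracing Example 1 shows how the heap never grows past k values. The function below is a hypothetical tracing helper (trace_min_heap is not part of the problem's interface) that follows the same push, trim, and peek steps and prints the heap contents in sorted order after each call.

```python
import heapq
from typing import List


def trace_min_heap(k: int, nums: List[int], vals: List[int]) -> List[int]:
    """Replay the min-heap approach and record what each add() would return."""
    heap = list(nums)  # work on a copy so the input list is left untouched
    heapq.heapify(heap)
    while len(heap) > k:
        heapq.heappop(heap)

    results = []
    for val in vals:
        heapq.heappush(heap, val)
        if len(heap) > k:
            heapq.heappop(heap)  # drop the value that fell out of the top k
        results.append(heap[0])
        print(sorted(heap), "->", heap[0])
    return results


# Example 1 from the description; returns [3, 3, 3, 5, 6].
trace_min_heap(3, [1, 2, 3, 3], [3, 5, 6, 7, 8])
```

The heap size stays at 3 throughout the run, no matter how many values are added.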

Time & Space Complexity

  • Time complexity: O(m * log k)
  • Space complexity: O(k)

Where m is the number of calls made to add().


Common Pitfalls

Using a Max-Heap Instead of Min-Heap

A common mistake is using a max-heap to find the k-th largest element. While it seems intuitive to keep the largest elements at the top, a max-heap would require storing all elements and repeatedly extracting the maximum k times for each query. A min-heap of size k is the correct choice because the root always holds the k-th largest element directly, allowing O(1) retrieval after each insertion.
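For contrast, here is a rough sketch of what a max-heap query could look like. It assumes values are negated to simulate a max-heap with heapq, and the function name kth_largest_with_max_heap is made up for illustration. Every query has to keep all n values and pop k - 1 of them before the answer appears at the root.

```python
import heapq
from typing import List


def kth_largest_with_max_heap(values: List[int], k: int) -> int:
    """Find the k-th largest by popping from a max-heap of all values."""
    # heapq only provides a min-heap, so negate values to simulate a max-heap.
    max_heap = [-v for v in values]
    heapq.heapify(max_heap)  # O(n), and all n values must be kept
    # Remove the k - 1 largest values; the next root is the k-th largest.
    for _ in range(k - 1):
        heapq.heappop(max_heap)  # O(log n) per pop
    return -max_heap[0]


print(kth_largest_with_max_heap([1, 2, 3, 3, 3], 3))  # prints 3
```

This costs O(n + k log n) time and O(n) space per query in this sketch, compared with O(log k) time per add() and O(k) space for the size-k min-heap.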

Forgetting to Maintain Heap Size

When using a min-heap, it is essential to remove the smallest element whenever the heap size exceeds k. Failing to do so means the heap grows unbounded, and the root no longer represents the k-th largest element. Always check and pop after each insertion to keep exactly k elements in the heap.
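A small side-by-side run makes the failure visible. This is an illustrative sketch with made-up variable names: both heaps receive the same values, but only one is trimmed back to k after each push.

```python
import heapq

k = 2
stream = [4, 5, 8, 2]

unbounded = []  # buggy: never trimmed
bounded = []    # correct: trimmed to k values after each push

for val in stream:
    heapq.heappush(unbounded, val)

    heapq.heappush(bounded, val)
    if len(bounded) > k:
        heapq.heappop(bounded)

print(unbounded[0])  # prints 2, the overall minimum rather than the 2nd largest
print(bounded[0])    # prints 5, the 2nd largest of [4, 5, 8, 2]
```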