1985. Find The Kth Largest Integer In The Array - Explanation

Problem Link



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Custom Comparators - Needed to compare numeric strings by length first, then lexicographically
  • Heap/Priority Queue - Used in heap-based solutions to efficiently find the kth largest element
  • Quick Select Algorithm - An optimization for finding the kth element in average O(n) time
  • String Comparison - Understanding why default string comparison fails for numeric strings of different lengths

1. Sorting

Intuition

Since numbers are represented as strings and can be very large (up to 100 digits), we cannot convert them to integers directly. Instead, we compare strings by length first, then lexicographically. A longer string represents a larger number, and for equal lengths, lexicographic order matches numeric order. After sorting in descending order, the kth element is our answer.

Algorithm

  1. Sort the array of strings in descending order using a custom comparator:
    • If two strings have different lengths, the longer one is larger.
    • If they have the same length, compare them lexicographically.
  2. Return the element at index k - 1 (since the array is 0-indexed).
class Solution:
    def kthLargestNumber(self, nums: List[str], k: int) -> str:
        return sorted(nums, key=lambda x: (len(x), x), reverse=True)[k - 1]

Time & Space Complexity

  • Time complexity: O(mnlogn)O(m * n \log n)
  • Space complexity: O(1)O(1) or O(n)O(n) depending on the sorting algorithm.

Where nn is the number of strings and mm is the average length of a string.


2. Max-Heap

Intuition

A max-heap keeps the largest element at the top. By inserting all elements into a max-heap and extracting the maximum k times, the kth extraction gives us the kth largest element. The custom comparator ensures proper ordering for string-represented numbers.

Algorithm

  1. Create a max-heap with a custom comparator that compares strings as numbers (length first, then lexicographically).
  2. Insert all elements from the array into the heap.
  3. Extract the maximum element k - 1 times and discard them.
  4. Extract once more and return this element as the kth largest.
class Num:
    def __init__(self, s: str):
        self.s = s

    def __lt__(self, other: "Num") -> bool:
        if len(self.s) != len(other.s):
            return len(self.s) > len(other.s)
        return self.s > other.s

class Solution:
    def kthLargestNumber(self, nums: List[str], k: int) -> str:
        maxHeap = [Num(s) for s in nums]
        heapq.heapify(maxHeap)

        for _ in range(k - 1):
            heapq.heappop(maxHeap)

        return heapq.heappop(maxHeap).s

Time & Space Complexity

  • Time complexity: O(m(n+k)logn)O(m * (n + k) * \log n)
  • Space complexity: O(n)O(n)

Where nn is the number of strings and mm is the average length of a string.


3. Min-Heap

Intuition

Instead of storing all elements and extracting k times, we can maintain a min-heap of size k. The smallest element in this heap is always at the top. As we process elements, if the heap size exceeds k, we remove the smallest. After processing all elements, the heap contains the k largest elements, and the top (minimum of these k) is the kth largest overall.

Algorithm

  1. Create a min-heap with a custom comparator for string-represented numbers.
  2. For each element in the array:
    • Add it to the heap.
    • If the heap size exceeds k, remove the minimum element.
  3. After processing all elements, the top of the heap is the kth largest element.
class Num:
    def __init__(self, s: str):
        self.s = s

    def __lt__(self, other: "Num") -> bool:
        if len(self.s) != len(other.s):
            return len(self.s) < len(other.s)
        return self.s < other.s

class Solution:
    def kthLargestNumber(self, nums: List[str], k: int) -> str:
        minHeap = []
        for num in nums:
            heapq.heappush(minHeap, Num(num))
            if len(minHeap) > k:
                heapq.heappop(minHeap)
        return minHeap[0].s

Time & Space Complexity

  • Time complexity: O(mnlogk)O(m * n * \log k)
  • Space complexity: O(k)O(k)

Where nn is the number of strings and mm is the average length of a string.


4. Quick Select

Intuition

Quick Select is a selection algorithm based on the partitioning step of QuickSort. Instead of fully sorting the array, we only partition around a pivot and recurse into the side that contains our target index. On average, this finds the kth element in linear time. The algorithm uses median-of-three pivot selection to improve performance and avoid worst-case scenarios.

Algorithm

  1. Define comparison functions that compare strings as numbers (by length first, then lexicographically).
  2. Use the partition function:
    • Select a pivot using median-of-three (first, middle, last elements).
    • Partition the array so elements greater than the pivot are on the left, and smaller ones are on the right.
  3. Perform quick select:
    • If the search range is small (2 or fewer elements), sort them directly.
    • Otherwise, partition and recurse into the appropriate half based on where index k - 1 falls relative to the pivot position.
  4. Return the element at index k - 1.
class Solution:
    def greater(self, x: str, y: str) -> bool:
        if len(x) != len(y):
            return len(x) > len(y)
        return x > y

    def less(self, x: str, y: str) -> bool:
        if len(x) != len(y):
            return len(x) < len(y)
        return x < y

    def partition(self, nums: List[str], left: int, right: int) -> int:
        mid = (left + right) >> 1
        nums[mid], nums[left + 1] = nums[left + 1], nums[mid]

        if self.less(nums[left], nums[right]):
            nums[left], nums[right] = nums[right], nums[left]
        if self.less(nums[left + 1], nums[right]):
            nums[left + 1], nums[right] = nums[right], nums[left + 1]
        if self.less(nums[left], nums[left + 1]):
            nums[left], nums[left + 1] = nums[left + 1], nums[left]

        pivot = nums[left + 1]
        i = left + 1
        j = right

        while True:
            while True:
                i += 1
                if not self.greater(nums[i], pivot):
                    break
            while True:
                j -= 1
                if not self.less(nums[j], pivot):
                    break
            if i > j:
                break
            nums[i], nums[j] = nums[j], nums[i]

        nums[left + 1], nums[j] = nums[j], nums[left + 1]
        return j

    def quickSelect(self, nums: List[str], k: int) -> str:
        left = 0
        right = len(nums) - 1

        while True:
            if right <= left + 1:
                if right == left + 1 and self.greater(nums[right], nums[left]):
                    nums[left], nums[right] = nums[right], nums[left]
                return nums[k]

            j = self.partition(nums, left, right)
            if j >= k:
                right = j - 1
            if j <= k:
                left = j + 1

    def kthLargestNumber(self, nums: List[str], k: int) -> str:
        return self.quickSelect(nums, k - 1)

Time & Space Complexity

  • Time complexity: O(mn)O(m * n) in average case, O(mn2)O(m * n ^ 2) in worst case.
  • Space complexity: O(1)O(1)

Common Pitfalls

Converting Strings to Integers Directly

Numbers can have up to 100 digits, far exceeding the range of standard integer types (even 64-bit). Attempting to convert these strings to integers causes overflow and incorrect comparisons. Always compare strings using length-first, then lexicographic comparison.

Using Default String Comparison

Default lexicographic comparison does not work correctly for numeric strings of different lengths. For example, "9" would be considered greater than "123" lexicographically, but numerically 123 > 9. The comparator must first compare by string length, then lexicographically for equal lengths.

Off-by-One Error with k-th Element

The problem asks for the kth largest element using 1-based indexing, but arrays are 0-indexed. After sorting in descending order, the answer is at index k - 1, not k. This is a frequent source of incorrect answers.