1647. Minimum Deletions to Make Character Frequencies Unique - Explanation

Problem Link



Prerequisites

Before attempting this problem, you should be comfortable with:

  • Hash Set - Tracks which frequencies are already used to detect and resolve conflicts
  • Frequency Counting - Counting character occurrences is the foundation for all approaches
  • Greedy Algorithms - Decrementing frequencies until finding an available slot minimizes deletions
  • Heap / Priority Queue - The max-heap approach processes frequencies from largest to smallest

1. Hash Set

Intuition

For frequencies to be unique, no two characters can have the same count. When we encounter a frequency that already exists, we must delete characters until we reach an unused frequency (or zero). A hash set tracks which frequencies are already taken. For each character's frequency, we decrement it until we find an available slot, counting each decrement as a deletion.

Algorithm

  1. Count the frequency of each character.
  2. Create a hash set to track used frequencies.
  3. For each frequency:
    • While the frequency is positive and already in the set, decrement it and count a deletion.
    • Add the resulting frequency to the set.
  4. Return the total deletion count.
class Solution:
    def minDeletions(self, s: str) -> int:
        count = [0] * 26
        for c in s:
            count[ord(c) - ord('a')] += 1

        used_freq = set()
        res = 0
        for freq in count:
            while freq > 0 and freq in used_freq:
                freq -= 1
                res += 1
            used_freq.add(freq)

        return res

Time & Space Complexity

  • Time complexity: O(n+m2)O(n + m ^ 2)
  • Space complexity: O(m)O(m)

Where nn is the length of the string ss and mm is the total number of unique frequncies possible.


2. Max-Heap

Intuition

Using a max-heap, we process frequencies from largest to smallest. When the top two frequencies are equal, we have a conflict. We resolve it by decrementing one of them and pushing the reduced value back (if still positive). This greedy approach ensures we minimize deletions by keeping larger frequencies intact when possible.

Algorithm

  1. Count character frequencies and build a max-heap.
  2. While more than one frequency remains:
    • Pop the largest frequency.
    • If it equals the next largest, decrement it, count a deletion, and push back if positive.
  3. Return the total deletion count.
class Solution:
    def minDeletions(self, s: str) -> int:
        freq = Counter(s)
        maxHeap = [-f for f in freq.values()]
        heapq.heapify(maxHeap)

        res = 0
        while len(maxHeap) > 1:
            top = -heapq.heappop(maxHeap)
            if top == -maxHeap[0]:
                if top - 1 > 0:
                    heapq.heappush(maxHeap, -(top - 1))
                res += 1

        return res

Time & Space Complexity

  • Time complexity: O(n+m2logm)O(n + m ^ 2 \log m)
  • Space complexity: O(m)O(m)

Where nn is the length of the string ss and mm is the total number of unique frequncies possible.


3. Sorting

Intuition

By sorting frequencies in descending order, we process them from highest to lowest. We track the maximum allowed frequency for the next character. If a frequency exceeds this limit, we delete down to the limit. After each character, the next allowed frequency decreases by one (minimum 0). This ensures all final frequencies are distinct.

Algorithm

  1. Count character frequencies and sort in descending order.
  2. Set maxAllowedFreq to the highest frequency.
  3. For each frequency:
    • If it exceeds maxAllowedFreq, add the difference to deletions.
    • Update maxAllowedFreq to max(0, current_frequency - 1).
  4. Return the total deletion count.
class Solution:
    def minDeletions(self, s: str) -> int:
        count = [0] * 26
        for c in s:
            count[ord(c) - ord('a')] += 1

        count.sort(reverse=True)
        res = 0
        maxAllowedFreq = count[0]

        for freq in count:
            if freq > maxAllowedFreq:
                res += freq - maxAllowedFreq
                freq = maxAllowedFreq
            maxAllowedFreq = max(0, freq - 1)

        return res

Time & Space Complexity

  • Time complexity: O(n+mlogm)O(n + m \log m)
  • Space complexity: O(m)O(m)

Where nn is the length of the string ss and mm is the total number of unique frequncies possible.


Common Pitfalls

Adding Zero Frequencies to the Used Set

When a frequency is decremented to zero, adding it to the used frequency set prevents other characters from also being reduced to zero. Multiple characters can validly have frequency zero (meaning they are completely deleted), so zero should either not be added or handled specially.

Not Decrementing Until an Available Slot is Found

Some implementations only check if a frequency is taken and decrement once. The correct approach requires a while loop that continues decrementing until finding an unused frequency or reaching zero. A single decrement may land on another taken frequency.

Counting Characters Instead of Frequencies

The problem asks for minimum deletions to make frequencies unique, not to make characters unique. Confusing the two leads to counting unique characters or deleting entire character types rather than reducing specific frequency counts.