To minimize the number of substrings, we want each substring to be as long as possible while containing only unique characters. A greedy approach works perfectly here: extend the current substring until we encounter a duplicate character, then start a new substring.
We use a hash set to track characters in the current substring. When we see a character already in the set, we've found a duplicate, so we increment our count and clear the set to start fresh.
curSet to track characters in the current partition and res = 1 for the result count.curSet, increment res and clear the set.curSet.res.Instead of using a set and clearing it on each partition, we can track the last index where each character appeared. A character causes a conflict only if its last occurrence is within the current partition (at or after the start index).
This approach avoids the overhead of clearing the set and uses constant extra space since we only need to track 26 lowercase letters.
lastIdx of size 26 initialized to -1, tracking the last seen index of each character.res = 1 and start = 0 to mark the beginning of the current partition.i:lastIdx[char] >= start, this character appeared in the current partition, so start a new partition by setting start = i and incrementing res.lastIdx[char] = i.res.Since we only have 26 lowercase letters, we can represent the set of characters in the current partition using a single integer as a bitmask. Each bit position corresponds to a letter (bit 0 for 'a', bit 1 for 'b', etc.).
This is the most space-efficient approach and uses fast bitwise operations to check membership and add characters.
res = 1 and mask = 0.i = char - 'a'.i is already set in mask (i.e., mask & (1 << i) is non-zero), we have a duplicate:mask = 0 and increment res.mask |= (1 << i).res.A common mistake is initializing res = 0 instead of res = 1. Since we are counting partitions and the string is non-empty, we always have at least one partition. Starting at zero would undercount by one because the final substring after the last split is never explicitly counted when we only increment on encountering duplicates.
When a duplicate is found, some implementations clear the set but forget to add the current character to the fresh set. This leads to incorrect behavior because the duplicate character that triggered the new partition should be the first character of that new partition.
Some solvers confuse this with finding all unique substrings or counting unique characters globally. The goal is to partition the string such that each partition has unique characters within itself, not across the entire string. Each partition is independent and can reuse characters from previous partitions.