Given an array of strings strs, group all anagrams together into sublists. You may return the output in any order.
An anagram is a string that contains the exact same characters as another string, but the order of the characters can be different.
Example 1:
Input: strs = ["act","pots","tops","cat","stop","hat"]
Output: [["hat"],["act", "cat"],["stop", "pots", "tops"]]Example 2:
Input: strs = ["x"]
Output: [["x"]]Example 3:
Input: strs = [""]
Output: [[""]]Constraints:
1 <= strs.length <= 1000.0 <= strs[i].length <= 100strs[i] is made up of lowercase English letters.
You should aim for a solution with O(m * n) time and O(m) space, where m is the number of strings and n is the length of the longest string.
A naive solution would be to sort each string and group them using a hash map. This would be an O(m * nlogn) solution. Though this solution is acceptable, can you think of a better way without sorting the strings?
By the definition of an anagram, we only care about the frequency of each character in a string. How is this helpful in solving the problem?
We can simply use an array of size O(26), since the character set is a through z (26 continuous characters), to count the frequency of each character in a string. Then, we can use this array as the key in the hash map to group the strings.
Anagrams become identical when their characters are sorted.
For example, "eat", "tea", and "ate" all become "aet" after sorting.
By using the sorted version of each string as a key, we can group all anagrams together.
Strings that share the same sorted form must be anagrams, so placing them in the same group is both natural and efficient.
Where is the number of strings and is the length of the longest string.
Instead of sorting each string, we can represent every string by the frequency of its characters.
Since the problem uses lowercase English letters, a fixed-size array of length 26 can capture how many times each character appears.
Two strings are anagrams if and only if their frequency arrays are identical.
By using this frequency array (converted to a tuple so it can be a dictionary key), we can group all strings that share the same character counts.
26-length tuple representing character frequencies, and each value is a list of strings belonging to that anagram group.count array of size 26 with all zeros.c in the string, increment the count at the corresponding index.count array to a tuple and use it as the key.Where is the number of strings and is the length of the longest string.
When using character frequency arrays as keys, you must convert them to an immutable type (like a tuple in Python or a string in other languages). Lists and arrays are mutable and cannot be used as dictionary keys directly.
# Wrong: using list as key
count = [0] * 26
res[count].append(s) # TypeError: unhashable type 'list'
# Correct: convert to tuple
res[tuple(count)].append(s)The frequency array approach with size 26 only works for lowercase English letters. If the input could contain uppercase letters or other characters, the solution would fail or produce incorrect groupings.
# Wrong: will crash on uppercase
count[ord(c) - ord('a')] += 1 # Negative index for uppercase
# Should validate or handle: ord('A') - ord('a') = -32When converting frequency counts to strings, using a naive format like concatenation without separators can cause collisions. For example, counts [1,11] and [11,1] could produce the same string "111".
# Wrong: potential collisions
key = ''.join(str(c) for c in count) # "111" for both [1,11] and [11,1]
# Correct: use separator
key = ','.join(str(c) for c in count) # "1,11" vs "11,1"