An anagram is simply a rearrangement of characters. Two strings are anagrams if they contain the same characters with the same frequencies. The most direct way to check this is to sort both strings and compare them. We can slide through every substring of s that has the same length as p, sort it, and check if it matches the sorted version of p.
p.i from 0 to len(s) - len(p):len(p) starting at i.i to the result list.Where is the length of the string and is the length of the string .
Instead of sorting substrings repeatedly, we can precompute character frequencies using prefix sums. For each position in s, we maintain a cumulative count of each character seen so far. To get the character frequencies for any window, we subtract the prefix count at the start from the prefix count at the end. If the window's character counts match p's counts, we found an anagram.
pCount for the pattern p.prefix[i] stores the cumulative character counts for s[0..i-1].len(p) across s:prefix[j+1] - prefix[i].26 character counts match pCount, add the starting index to the result.class Solution:
def findAnagrams(self, s: str, p: str) -> List[int]:
n, m = len(s), len(p)
if m > n:
return []
pCount = [0] * 26
for c in p:
pCount[ord(c) - ord('a')] += 1
prefix = [[0] * 26 for _ in range(n + 1)]
for i in range(1, n + 1):
for j in range(26):
prefix[i][j] = prefix[i - 1][j]
prefix[i][ord(s[i - 1]) - ord('a')] += 1
i, j = 0, m - 1
res = []
while j < n:
isValid = True
for c in range(26):
if prefix[j + 1][c] - prefix[i][c] != pCount[c]:
isValid = False
break
if isValid:
res.append(i)
i += 1
j += 1
return resWhere is the length of the string and is the length of the string .
We can avoid rebuilding frequency counts from scratch by using a sliding window. Start by counting characters in the first window of size len(p). As we slide the window one position to the right, we add the new character entering the window and remove the character leaving it. After each slide, we compare the window's character counts with the pattern's counts.
pCount and sCount for p and the first len(p) characters of s.0 to the result.len(p) to the end of s:sCount.sCount.sCount equals pCount, add the new left index to the result.class Solution:
def findAnagrams(self, s: str, p: str) -> List[int]:
if len(p) > len(s): return []
pCount, sCount = {}, {}
for i in range(len(p)):
pCount[p[i]] = 1 + pCount.get(p[i], 0)
sCount[s[i]] = 1+ sCount.get(s[i], 0)
res = [0] if sCount == pCount else []
l = 0
for r in range(len(p), len(s)):
sCount[s[r]] = 1+ sCount.get(s[r], 0)
sCount[s[l]] -= 1
if sCount[s[l]] == 0:
sCount.pop(s[l])
l += 1
if sCount == pCount:
res.append(l)
return resWhere is the length of the string and is the length of the string .
Comparing two arrays of 26 elements after every slide takes extra time. We can optimize by tracking how many of the 26 character counts currently match between the window and the pattern. When we add or remove a character, we only update the match count for that specific character. If all 26 counts match, we found an anagram.
pCount and sCount for the initial window.26 characters have matching frequencies (initialize match).match == 26, add index 0 to the result.s:match accordingly.match accordingly.match == 26, add the current left index to the result.class Solution:
def findAnagrams(self, s: str, p: str) -> List[int]:
n, m = len(s), len(p)
if m > n:
return []
pCount = [0] * 26
sCount = [0] * 26
for i in range(m):
pCount[ord(p[i]) - ord('a')] += 1
sCount[ord(s[i]) - ord('a')] += 1
match = sum(1 for i in range(26) if pCount[i] == sCount[i])
res = []
if match == 26:
res.append(0)
l = 0
for r in range(m, n):
c = ord(s[l]) - ord('a')
if sCount[c] == pCount[c]:
match -= 1
sCount[c] -= 1
l += 1
if sCount[c] == pCount[c]:
match += 1
c = ord(s[r]) - ord('a')
if sCount[c] == pCount[c]:
match -= 1
sCount[c] += 1
if sCount[c] == pCount[c]:
match += 1
if match == 26:
res.append(l)
return resWhere is the length of the string and is the length of the string .
If the pattern p is longer than the string s, there cannot be any anagrams. Failing to handle this edge case at the start can lead to index out of bounds errors or incorrect empty results being returned for the wrong reason.
When sliding the window, it is easy to make off-by-one errors with the window boundaries. The window must always be exactly len(p) characters wide. Adding a character before removing one, or vice versa, can temporarily create windows of incorrect size and lead to false matches.
Comparing two frequency arrays or maps after every window slide takes O(26) or O(k) time where k is the alphabet size. While this is technically constant, it adds overhead. The optimal approach tracks how many character counts currently match between the window and pattern, updating this count incrementally as characters enter and leave the window.