You are given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
Example 1:
Input: haystack = "neetcodeneetcode", needle = "neet"
Output: 0Explanation: "neet" occurs at index 0 and 8.
The first occurrence is at index 0, so we return 0.
Example 2:
Input: haystack = "neetcode", needle = "codem"
Output: -1Constraints:
1 <= haystack.length, needle.length <= 10,000haystack and needle consist of only lowercase English characters.The simplest way to find a substring is to try every possible starting position in the haystack. At each position, we compare characters one by one with the needle. If all characters match, we found our answer. If any character doesn't match, we move to the next starting position and try again.
i in the haystack (from 0 to n - m, where n is the haystack length and m is the needle length).i with characters of the needle.i.-1.Where is the length of the string and is the length of the string .
The brute force approach wastes work by restarting from scratch after each mismatch. KMP improves this by preprocessing the needle to build a "longest proper prefix which is also suffix" (LPS) array. When a mismatch occurs, the LPS array tells us how many characters we can skip, leveraging the pattern structure to avoid redundant comparisons.
prevLPS tracks the length of the current longest prefix-suffix, and i scans through the needle.prevLPS > 0, fall back to the previous LPS value.i for the haystack and j for the needle.j > 0, use the LPS array to determine where to continue matching in the needle.j reaches the needle length, we found a match; return i - m.-1 if no match is found.class Solution:
def strStr(self, haystack: str, needle: str) -> int:
if needle == "": return 0
lps = [0] * len(needle)
prevLPS, i = 0, 1
while i < len(needle):
if needle[i] == needle[prevLPS]:
lps[i] = prevLPS + 1
prevLPS += 1
i += 1
elif prevLPS == 0:
lps[i] = 0
i += 1
else:
prevLPS = lps[prevLPS - 1]
i = 0 # ptr for haystack
j = 0 # ptr for needle
while i < len(haystack):
if haystack[i] == needle[j]:
i, j = i + 1, j + 1
else:
if j == 0:
i += 1
else:
j = lps[j - 1]
if j == len(needle):
return i - len(needle)
return -1Where is the length of the string and is the length of the string .
The Z-algorithm computes, for each position in a string, the length of the longest substring starting from that position that matches a prefix of the string. By concatenating the needle, a separator character, and the haystack, any position where the Z-value equals the needle length indicates a match. The algorithm uses a "Z-box" to track previously computed matches and skip redundant comparisons.
$), and the haystack into a single string.[l, r] representing the rightmost substring that matches a prefix.i, use the Z-box to initialize z[i] when possible.z[i] by comparing characters directly.z[i] equals the needle length, return the corresponding position in the haystack.-1 if no match is found.class Solution:
def strStr(self, haystack: str, needle: str) -> int:
if not needle:
return 0
s = needle + "$" + haystack
n = len(s)
z = [0] * n
l, r = 0, 0
for i in range(1, n):
if i <= r:
z[i] = min(r - i + 1, z[i - l])
while i + z[i] < n and s[z[i]] == s[i + z[i]]:
z[i] += 1
if i + z[i] - 1 > r:
l, r = i, i + z[i] - 1
for i in range(len(needle) + 1, n):
if z[i] == len(needle):
return i - len(needle) - 1
return -1Where is the length of the string and is the length of the string .
Instead of comparing characters one by one, we can compare hash values of substrings. If we compute the hash of the needle and the hash of each window in the haystack, matching hashes suggest a potential match. The key insight is using a rolling hash: when sliding the window by one position, we can update the hash in O(1) time by removing the contribution of the outgoing character and adding the incoming one. Using two different hash functions reduces false positives.
-1 if no match is found.class Solution:
def strStr(self, haystack: str, needle: str) -> int:
if not needle:
return 0
base1, mod1 = 31, 768258391
base2, mod2 = 37, 685683731
n, m = len(haystack), len(needle)
if m > n:
return -1
power1, power2 = 1, 1
for _ in range(m):
power1 = (power1 * base1) % mod1
power2 = (power2 * base2) % mod2
needle_hash1, needle_hash2 = 0, 0
haystack_hash1, haystack_hash2 = 0, 0
for i in range(m):
needle_hash1 = (needle_hash1 * base1 + ord(needle[i])) % mod1
needle_hash2 = (needle_hash2 * base2 + ord(needle[i])) % mod2
haystack_hash1 = (haystack_hash1 * base1 + ord(haystack[i])) % mod1
haystack_hash2 = (haystack_hash2 * base2 + ord(haystack[i])) % mod2
for i in range(n - m + 1):
if haystack_hash1 == needle_hash1 and haystack_hash2 == needle_hash2:
return i
if i + m < n:
haystack_hash1 = (haystack_hash1 * base1 - ord(haystack[i]) * power1 + ord(haystack[i + m])) % mod1
haystack_hash2 = (haystack_hash2 * base2 - ord(haystack[i]) * power2 + ord(haystack[i + m])) % mod2
haystack_hash1 = (haystack_hash1 + mod1) % mod1
haystack_hash2 = (haystack_hash2 + mod2) % mod2
return -1Where is the length of the string and is the length of the string .
When iterating through the haystack, the loop should run from 0 to n - m inclusive. A common mistake is using n - m - 1 or n - 1 as the upper bound. The former misses the case where the needle appears at the very end, while the latter causes out-of-bounds access when comparing characters.
When the needle is an empty string, the expected return value is 0 (the empty string is found at the beginning of any string). Forgetting to check for this edge case can lead to incorrect behavior or infinite loops in some implementations.