187. Repeated DNA Sequences - Explanation

Prerequisites

Before attempting this problem, you should be comfortable with:

Sliding Window - Maintaining a fixed-size window that slides across the string to extract substrings
Hash Set / Hash Map - Tracking seen sequences and counting occurrences for duplicate detection
String Manipulation - Extracting substrings efficiently for comparison and storage

1. Hash Set

Intuition

We need to find all 10-letter sequences that appear more than once. A hash set naturally tracks what we have seen before. As we slide a window of length 10 across the string with index l, we check if the current substring cur was already encountered. If so, it is a repeated sequence. Using two sets (one for seen sequences and one for res results) avoids adding duplicates to our answer.

Algorithm

Initialize two sets: seen to track encountered sequences and res to store repeated ones.
Iterate through the string with a sliding window of size 10 using index l.
For each position, extract the 10-character substring cur.
If it exists in seen, add it to res.
Add the current substring to seen.
Return res as a list.

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        seen, res = set(), set()

        for l in range(len(s) - 9):
            cur = s[l: l + 10]
            if cur in seen:
                res.add(cur)
            seen.add(cur)
        return list(res)

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> res = new HashSet<>();

        for (int l = 0; l < s.length() - 9; l++) {
            String cur = s.substring(l, l + 10);
            if (seen.contains(cur)) {
                res.add(cur);
            }
            seen.add(cur);
        }
        return new ArrayList<>(res);
    }
}

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        if (s.size() < 10) return {};
        unordered_set<string> seen, res;

        for (int l = 0; l < s.size() - 9; l++) {
            string cur = s.substr(l, 10);
            if (seen.count(cur)) {
                res.insert(cur);
            }
            seen.insert(cur);
        }
        return vector<string>(res.begin(), res.end());
    }
};

class Solution {
    /**
     * @param {string} s
     * @return {string[]}
     */
    findRepeatedDnaSequences(s) {
        const seen = new Set();
        const res = new Set();

        for (let l = 0; l < s.length - 9; l++) {
            const cur = s.substring(l, l + 10);
            if (seen.has(cur)) {
                res.add(cur);
            }
            seen.add(cur);
        }
        return Array.from(res);
    }
}

public class Solution {
    public IList<string> FindRepeatedDnaSequences(string s) {
        var seen = new HashSet<string>();
        var res = new HashSet<string>();

        for (int l = 0; l < s.Length - 9; l++) {
            string cur = s.Substring(l, 10);
            if (seen.Contains(cur)) {
                res.Add(cur);
            }
            seen.Add(cur);
        }
        return res.ToList();
    }
}

func findRepeatedDnaSequences(s string) []string {
    seen := make(map[string]bool)
    res := make(map[string]bool)

    for l := 0; l < len(s)-9; l++ {
        cur := s[l : l+10]
        if seen[cur] {
            res[cur] = true
        }
        seen[cur] = true
    }

    result := []string{}
    for k := range res {
        result = append(result, k)
    }
    return result
}

class Solution {
    fun findRepeatedDnaSequences(s: String): List<String> {
        val seen = HashSet<String>()
        val res = HashSet<String>()

        for (l in 0 until s.length - 9) {
            val cur = s.substring(l, l + 10)
            if (cur in seen) {
                res.add(cur)
            }
            seen.add(cur)
        }
        return res.toList()
    }
}

class Solution {
    func findRepeatedDnaSequences(_ s: String) -> [String] {
        var seen = Set<String>()
        var res = Set<String>()
        let chars = Array(s)

        for l in 0..<(chars.count - 9) {
            let cur = String(chars[l..<(l + 10)])
            if seen.contains(cur) {
                res.insert(cur)
            }
            seen.insert(cur)
        }
        return Array(res)
    }
}

impl Solution {
    pub fn find_repeated_dna_sequences(s: String) -> Vec<String> {
        let s = s.as_bytes();
        let mut seen = HashSet::new();
        let mut res = HashSet::new();

        if s.len() < 10 {
            return vec![];
        }

        for l in 0..=s.len() - 10 {
            let cur = &s[l..l + 10];
            if !seen.insert(cur) {
                res.insert(cur);
            }
        }

        res.into_iter()
            .map(|b| String::from_utf8_lossy(b).into_owned())
            .collect()
    }
}

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(n)$

2. Hash Map

Intuition

Instead of using two sets, we can use a hash map mp to count occurrences of each sequence. This lets us add a sequence to the res exactly when its count reaches 2, ensuring we only report it once regardless of how many additional times it appears.

Algorithm

If the string length is less than 10, return an empty list.
Initialize a hash map mp to count occurrences and a result list res.
Slide a window of size 10 across the string using index l.
For each substring cur, increment its count in mp.
When the count becomes exactly 2, add the substring to res.
Return the res list.

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        if len(s) < 10:
            return []

        mp = defaultdict(int)
        res = []
        for l in range(len(s) - 9):
            cur = s[l: l + 10]
            mp[cur] += 1
            if mp[cur] == 2:
                res.append(cur)

        return res

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        if (s.length() < 10) {
            return new ArrayList<>();
        }

        Map<String, Integer> mp = new HashMap<>();
        List<String> res = new ArrayList<>();

        for (int l = 0; l < s.length() - 9; l++) {
            String cur = s.substring(l, l + 10);
            mp.put(cur, mp.getOrDefault(cur, 0) + 1);
            if (mp.get(cur) == 2) {
                res.add(cur);
            }
        }

        return res;
    }
}

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        if (s.size() < 10) {
            return {};
        }

        unordered_map<string, int> mp;
        vector<string> res;

        for (int l = 0; l < s.size() - 9; l++) {
            string cur = s.substr(l, 10);
            mp[cur]++;
            if (mp[cur] == 2) {
                res.push_back(cur);
            }
        }

        return res;
    }
};

class Solution {
    /**
     * @param {string} s
     * @return {string[]}
     */
    findRepeatedDnaSequences(s) {
        if (s.length < 10) {
            return [];
        }

        const mp = new Map();
        const res = [];

        for (let l = 0; l <= s.length - 10; l++) {
            const cur = s.substring(l, l + 10);
            mp.set(cur, (mp.get(cur) || 0) + 1);
            if (mp.get(cur) === 2) {
                res.push(cur);
            }
        }

        return res;
    }
}

public class Solution {
    public IList<string> FindRepeatedDnaSequences(string s) {
        if (s.Length < 10) {
            return new List<string>();
        }

        var mp = new Dictionary<string, int>();
        var res = new List<string>();

        for (int l = 0; l <= s.Length - 10; l++) {
            string cur = s.Substring(l, 10);
            if (!mp.ContainsKey(cur)) mp[cur] = 0;
            mp[cur]++;
            if (mp[cur] == 2) {
                res.Add(cur);
            }
        }

        return res;
    }
}

func findRepeatedDnaSequences(s string) []string {
    if len(s) < 10 {
        return []string{}
    }

    mp := make(map[string]int)
    res := []string{}

    for l := 0; l <= len(s)-10; l++ {
        cur := s[l : l+10]
        mp[cur]++
        if mp[cur] == 2 {
            res = append(res, cur)
        }
    }

    return res
}

class Solution {
    fun findRepeatedDnaSequences(s: String): List<String> {
        if (s.length < 10) {
            return emptyList()
        }

        val mp = HashMap<String, Int>()
        val res = mutableListOf<String>()

        for (l in 0..s.length - 10) {
            val cur = s.substring(l, l + 10)
            mp[cur] = mp.getOrDefault(cur, 0) + 1
            if (mp[cur] == 2) {
                res.add(cur)
            }
        }

        return res
    }
}

class Solution {
    func findRepeatedDnaSequences(_ s: String) -> [String] {
        if s.count < 10 {
            return []
        }

        var mp = [String: Int]()
        var res = [String]()
        let chars = Array(s)

        for l in 0...(chars.count - 10) {
            let cur = String(chars[l..<(l + 10)])
            mp[cur, default: 0] += 1
            if mp[cur] == 2 {
                res.append(cur)
            }
        }

        return res
    }
}

impl Solution {
    pub fn find_repeated_dna_sequences(s: String) -> Vec<String> {
        let s = s.as_bytes();
        if s.len() < 10 {
            return vec![];
        }

        let mut mp = HashMap::new();
        let mut res = Vec::new();

        for l in 0..=s.len() - 10 {
            let cur = &s[l..l + 10];
            let count = mp.entry(cur).or_insert(0);
            *count += 1;
            if *count == 2 {
                res.push(String::from_utf8_lossy(cur).into_owned());
            }
        }

        res
    }
}

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(n)$

3. Rabin-Karp Algorithm (Double Hashing)

Intuition

Storing full 10-character strings as keys can be memory intensive. The Rabin-Karp algorithm computes a rolling hash for each window, allowing us to represent each sequence as a number instead. Double hashing (using two different hash bases) minimizes collision probability, making numeric comparisons reliable. As the window slides with index i, we efficiently update the hashes hash1 and hash2 by removing the contribution of the outgoing character and adding the incoming one.

Algorithm

If the string length is less than 10, return an empty list.
Precompute the power values power1 and power2 for both hash bases (raised to the 9th power for the leading digit).
Initialize two rolling hashes hash1 and hash2 to zero.
Iterate through the string with index i, updating both hashes for each character.
Once the window reaches size 10, combine both hashes into a single key.
Use a hash map cnt to count occurrences of each key.
When a key appears the second time, add the corresponding substring to res.
Slide the window by subtracting the outgoing character's hash contribution.
Return the res list.

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        n = len(s)
        if n < 10:
            return []

        cnt = defaultdict(int)
        res = []
        base1, base2 = 31, 37
        hash1 = hash2 = 0
        power1, power2 = 1, 1
        MOD1, MOD2 = 685683731, 768258391

        for i in range(9):
            power1 *= base1
            power2 *= base2
            power1 %= MOD1
            power2 %= MOD2

        for i in range(n):
            hash1 = (hash1 * base1 + ord(s[i])) % MOD1
            hash2 = (hash2 * base2 + ord(s[i])) % MOD2

            if i >= 9:
                key = (hash1 << 31) | hash2
                cnt[key] += 1
                if cnt[key] == 2:
                    res.append(s[i - 9 : i + 1])

                hash1 = (hash1 - power1 * ord(s[i - 9]) % MOD1 + MOD1) % MOD1
                hash2 = (hash2 - power2 * ord(s[i - 9]) % MOD2 + MOD2)

        return res

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        int n = s.length();
        if (n < 10) return new ArrayList<>();

        Map<Long, Integer> cnt = new HashMap<>();
        List<String> res = new ArrayList<>();
        int base1 = 31, base2 = 37;
        long hash1 = 0, hash2 = 0, power1 = 1, power2 = 1;
        int MOD1 = 685683731, MOD2 = 768258391;

        for (int i = 0; i < 9; i++) {
            power1 = (power1 * base1) % MOD1;
            power2 = (power2 * base2) % MOD2;
        }

        for (int i = 0; i < n; i++) {
            hash1 = (hash1 * base1 + s.charAt(i)) % MOD1;
            hash2 = (hash2 * base2 + s.charAt(i)) % MOD2;

            if (i >= 9) {
                long key = (hash1 << 31) | hash2;
                cnt.put(key, cnt.getOrDefault(key, 0) + 1);
                if (cnt.get(key) == 2) {
                    res.add(s.substring(i - 9, i + 1));
                }

                hash1 = (hash1 - power1 * s.charAt(i - 9) % MOD1 + MOD1) % MOD1;
                hash2 = (hash2 - power2 * s.charAt(i - 9) % MOD2 + MOD2) % MOD2;
            }
        }

        return res;
    }
}

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        int n = s.length();
        if (n < 10) return {};

        unordered_map<long long, int> cnt;
        vector<string> res;
        int base1 = 31, base2 = 37;
        long long hash1 = 0, hash2 = 0, power1 = 1, power2 = 1;
        int MOD1 = 685683731, MOD2 = 768258391;

        for (int i = 0; i < 9; i++) {
            power1 = (power1 * base1) % MOD1;
            power2 = (power2 * base2) % MOD2;
        }

        for (int i = 0; i < n; i++) {
            hash1 = (hash1 * base1 + s[i]) % MOD1;
            hash2 = (hash2 * base2 + s[i]) % MOD2;

            if (i >= 9) {
                long long key = (hash1 << 31) | hash2;
                cnt[key]++;
                if (cnt[key] == 2) {
                    res.push_back(s.substr(i - 9, 10));
                }

                hash1 = (hash1 - power1 * s[i - 9] % MOD1 + MOD1) % MOD1;
                hash2 = (hash2 - power2 * s[i - 9] % MOD2 + MOD2) % MOD2;
            }
        }

        return res;
    }
};

class Solution {
    /**
     * @param {string} s
     * @return {string[]}
     */
    findRepeatedDnaSequences(s) {
        const n = s.length;
        if (n < 10) return [];

        const cnt = new Map();
        const res = [];
        const base1 = 31,
            base2 = 37;
        let hash1 = 0,
            hash2 = 0,
            power1 = 1,
            power2 = 1;
        const MOD1 = 685683731,
            MOD2 = 768258391;

        for (let i = 0; i < 9; i++) {
            power1 = (power1 * base1) % MOD1;
            power2 = (power2 * base2) % MOD2;
        }

        for (let i = 0; i < n; i++) {
            hash1 = (hash1 * base1 + s.charCodeAt(i)) % MOD1;
            hash2 = (hash2 * base2 + s.charCodeAt(i)) % MOD2;

            if (i >= 9) {
                const key = `${hash1},${hash2}`;
                cnt.set(key, (cnt.get(key) || 0) + 1);
                if (cnt.get(key) === 2) {
                    res.push(s.substring(i - 9, i + 1));
                }

                hash1 =
                    (hash1 - ((power1 * s.charCodeAt(i - 9)) % MOD1) + MOD1) %
                    MOD1;
                hash2 =
                    (hash2 - ((power2 * s.charCodeAt(i - 9)) % MOD2) + MOD2) %
                    MOD2;
            }
        }

        return res;
    }
}

public class Solution {
    public IList<string> FindRepeatedDnaSequences(string s) {
        int n = s.Length;
        if (n < 10) return new List<string>();

        var cnt = new Dictionary<long, int>();
        var res = new List<string>();
        int base1 = 31, base2 = 37;
        long hash1 = 0, hash2 = 0, power1 = 1, power2 = 1;
        int MOD1 = 685683731, MOD2 = 768258391;

        for (int i = 0; i < 9; i++) {
            power1 = (power1 * base1) % MOD1;
            power2 = (power2 * base2) % MOD2;
        }

        for (int i = 0; i < n; i++) {
            hash1 = (hash1 * base1 + s[i]) % MOD1;
            hash2 = (hash2 * base2 + s[i]) % MOD2;

            if (i >= 9) {
                long key = (hash1 << 31) | hash2;
                if (!cnt.ContainsKey(key)) cnt[key] = 0;
                cnt[key]++;
                if (cnt[key] == 2) {
                    res.Add(s.Substring(i - 9, 10));
                }

                hash1 = (hash1 - power1 * s[i - 9] % MOD1 + MOD1) % MOD1;
                hash2 = (hash2 - power2 * s[i - 9] % MOD2 + MOD2) % MOD2;
            }
        }

        return res;
    }
}

func findRepeatedDnaSequences(s string) []string {
    n := len(s)
    if n < 10 {
        return []string{}
    }

    cnt := make(map[int64]int)
    res := []string{}
    base1, base2 := int64(31), int64(37)
    var hash1, hash2, power1, power2 int64 = 0, 0, 1, 1
    MOD1, MOD2 := int64(685683731), int64(768258391)

    for i := 0; i < 9; i++ {
        power1 = (power1 * base1) % MOD1
        power2 = (power2 * base2) % MOD2
    }

    for i := 0; i < n; i++ {
        hash1 = (hash1*base1 + int64(s[i])) % MOD1
        hash2 = (hash2*base2 + int64(s[i])) % MOD2

        if i >= 9 {
            key := (hash1 << 31) | hash2
            cnt[key]++
            if cnt[key] == 2 {
                res = append(res, s[i-9:i+1])
            }

            hash1 = (hash1 - power1*int64(s[i-9])%MOD1 + MOD1) % MOD1
            hash2 = (hash2 - power2*int64(s[i-9])%MOD2 + MOD2) % MOD2
        }
    }

    return res
}

class Solution {
    fun findRepeatedDnaSequences(s: String): List<String> {
        val n = s.length
        if (n < 10) return emptyList()

        val cnt = HashMap<Long, Int>()
        val res = mutableListOf<String>()
        val base1 = 31L
        val base2 = 37L
        var hash1 = 0L
        var hash2 = 0L
        var power1 = 1L
        var power2 = 1L
        val MOD1 = 685683731L
        val MOD2 = 768258391L

        for (i in 0 until 9) {
            power1 = (power1 * base1) % MOD1
            power2 = (power2 * base2) % MOD2
        }

        for (i in 0 until n) {
            hash1 = (hash1 * base1 + s[i].code) % MOD1
            hash2 = (hash2 * base2 + s[i].code) % MOD2

            if (i >= 9) {
                val key = (hash1 shl 31) or hash2
                cnt[key] = cnt.getOrDefault(key, 0) + 1
                if (cnt[key] == 2) {
                    res.add(s.substring(i - 9, i + 1))
                }

                hash1 = (hash1 - power1 * s[i - 9].code % MOD1 + MOD1) % MOD1
                hash2 = (hash2 - power2 * s[i - 9].code % MOD2 + MOD2) % MOD2
            }
        }

        return res
    }
}

class Solution {
    func findRepeatedDnaSequences(_ s: String) -> [String] {
        let chars = Array(s)
        let n = chars.count
        if n < 10 { return [] }

        var cnt = [Int64: Int]()
        var res = [String]()
        let base1: Int64 = 31
        let base2: Int64 = 37
        var hash1: Int64 = 0
        var hash2: Int64 = 0
        var power1: Int64 = 1
        var power2: Int64 = 1
        let MOD1: Int64 = 685683731
        let MOD2: Int64 = 768258391

        for _ in 0..<9 {
            power1 = (power1 * base1) % MOD1
            power2 = (power2 * base2) % MOD2
        }

        for i in 0..<n {
            hash1 = (hash1 * base1 + Int64(chars[i].asciiValue!)) % MOD1
            hash2 = (hash2 * base2 + Int64(chars[i].asciiValue!)) % MOD2

            if i >= 9 {
                let key = (hash1 << 31) | hash2
                cnt[key, default: 0] += 1
                if cnt[key] == 2 {
                    res.append(String(chars[(i - 9)...(i)]))
                }

                hash1 = (hash1 - power1 * Int64(chars[i - 9].asciiValue!) % MOD1 + MOD1) % MOD1
                hash2 = (hash2 - power2 * Int64(chars[i - 9].asciiValue!) % MOD2 + MOD2) % MOD2
            }
        }

        return res
    }
}

impl Solution {
    pub fn find_repeated_dna_sequences(s: String) -> Vec<String> {
        let s = s.as_bytes();
        let n = s.len();
        if n < 10 {
            return vec![];
        }

        let mut cnt: HashMap<i64, i32> = HashMap::new();
        let mut res = Vec::new();
        let base1: i64 = 31;
        let base2: i64 = 37;
        let mut hash1: i64 = 0;
        let mut hash2: i64 = 0;
        let mut power1: i64 = 1;
        let mut power2: i64 = 1;
        let mod1: i64 = 685683731;
        let mod2: i64 = 768258391;

        for _ in 0..9 {
            power1 = (power1 * base1) % mod1;
            power2 = (power2 * base2) % mod2;
        }

        for i in 0..n {
            hash1 = (hash1 * base1 + s[i] as i64) % mod1;
            hash2 = (hash2 * base2 + s[i] as i64) % mod2;

            if i >= 9 {
                let key = (hash1 << 31) | hash2;
                let count = cnt.entry(key).or_insert(0);
                *count += 1;
                if *count == 2 {
                    res.push(
                        String::from_utf8_lossy(&s[i - 9..=i]).into_owned(),
                    );
                }

                hash1 = (hash1 - power1 * s[i - 9] as i64 % mod1 + mod1) % mod1;
                hash2 = (hash2 - power2 * s[i - 9] as i64 % mod2 + mod2) % mod2;
            }
        }

        res
    }
}

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(n)$

4. Bit Mask

Intuition

DNA sequences use only four characters (A, C, G, T), each of which can be encoded with just 2 bits. A 10-character sequence therefore fits in 20 bits, well within a single integer. By treating each sequence as a mask, we avoid storing strings entirely and get fast integer operations for comparisons and hashing.

Algorithm

If the string length is less than 10, return an empty list.
Map each nucleotide to a 2-bit value: A=0, C=1, G=2, T=3 in map mp.
Initialize a mask to zero and a hash map cnt for counting.
For each character at index i, shift the mask left by 2 bits and OR with the character's value.
Mask the result with 0xFFFFF to keep only the rightmost 20 bits (10 characters).
Once the window reaches size 10, increment the count for this mask.
When a mask appears the second time, add the corresponding substring to res.
Return the res list.

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> list[str]:
        if len(s) < 10:
            return []

        mp = {'A': 0, 'C': 1, 'G': 2, 'T': 3}
        seen, res = set(), set()
        mask = 0
        for i in range(len(s)):
            mask = ((mask << 2) | mp[s[i]]) & 0xFFFFF
            if i >= 9:
                if mask in seen:
                    res.add(s[i - 9: i + 1])
                else:
                    seen.add(mask)

        return list(res)

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        if (s.length() < 10) return new ArrayList<>();

        Map<Character, Integer> mp = Map.of('A', 0, 'C', 1, 'G', 2, 'T', 3);
        Map<Integer, Integer> cnt = new HashMap<>();
        List<String> res = new ArrayList<>();
        int mask = 0;

        for (int i = 0; i < s.length(); i++) {
            mask = ((mask << 2) | mp.get(s.charAt(i))) & 0xFFFFF;
            if (i >= 9) {
                cnt.put(mask, cnt.getOrDefault(mask, 0) + 1);
                if (cnt.get(mask) == 2) {
                    res.add(s.substring(i - 9, i + 1));
                }
            }
        }

        return res;
    }
}

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        if (s.length() < 10) return {};

        unordered_map<char, int> mp = {{'A', 0}, {'C', 1},
                                       {'G', 2}, {'T', 3}};
        unordered_map<int, int> cnt;
        vector<string> res;
        int mask = 0;

        for (int i = 0; i < s.size(); i++) {
            mask = ((mask << 2) | mp[s[i]]) & 0xFFFFF;

            if (i >= 9) {
                cnt[mask]++;
                if (cnt[mask] == 2) {
                    res.push_back(s.substr(i - 9, 10));
                }
            }
        }

        return res;
    }
};

class Solution {
    /**
     * @param {string} s
     * @return {string[]}
     */
    findRepeatedDnaSequences(s) {
        if (s.length < 10) return [];

        const mp = { A: 0, C: 1, G: 2, T: 3 };
        const cnt = new Map();
        const res = [];
        let mask = 0;

        for (let i = 0; i < s.length; i++) {
            mask = ((mask << 2) | mp[s[i]]) & 0xfffff;

            if (i >= 9) {
                cnt.set(mask, (cnt.get(mask) || 0) + 1);
                if (cnt.get(mask) === 2) {
                    res.push(s.substring(i - 9, i + 1));
                }
            }
        }

        return res;
    }
}

public class Solution {
    public IList<string> FindRepeatedDnaSequences(string s) {
        if (s.Length < 10) return new List<string>();

        var mp = new Dictionary<char, int> {
            {'A', 0}, {'C', 1}, {'G', 2}, {'T', 3}
        };
        var cnt = new Dictionary<int, int>();
        var res = new List<string>();
        int mask = 0;

        for (int i = 0; i < s.Length; i++) {
            mask = ((mask << 2) | mp[s[i]]) & 0xFFFFF;

            if (i >= 9) {
                if (!cnt.ContainsKey(mask)) cnt[mask] = 0;
                cnt[mask]++;
                if (cnt[mask] == 2) {
                    res.Add(s.Substring(i - 9, 10));
                }
            }
        }

        return res;
    }
}

class Solution {
    fun findRepeatedDnaSequences(s: String): List<String> {
        if (s.length < 10) return emptyList()

        val mp = mapOf('A' to 0, 'C' to 1, 'G' to 2, 'T' to 3)
        val cnt = HashMap<Int, Int>()
        val res = mutableListOf<String>()
        var mask = 0

        for (i in s.indices) {
            mask = ((mask shl 2) or mp[s[i]]!!) and 0xFFFFF

            if (i >= 9) {
                cnt[mask] = cnt.getOrDefault(mask, 0) + 1
                if (cnt[mask] == 2) {
                    res.add(s.substring(i - 9, i + 1))
                }
            }
        }

        return res
    }
}

class Solution {
    func findRepeatedDnaSequences(_ s: String) -> [String] {
        let chars = Array(s)
        if chars.count < 10 { return [] }

        let mp: [Character: Int] = ["A": 0, "C": 1, "G": 2, "T": 3]
        var cnt = [Int: Int]()
        var res = [String]()
        var mask = 0

        for i in 0..<chars.count {
            mask = ((mask << 2) | mp[chars[i]]!) & 0xFFFFF

            if i >= 9 {
                cnt[mask, default: 0] += 1
                if cnt[mask] == 2 {
                    res.append(String(chars[(i - 9)...(i)]))
                }
            }
        }

        return res
    }
}

impl Solution {
    pub fn find_repeated_dna_sequences(s: String) -> Vec<String> {
        let s = s.as_bytes();
        if s.len() < 10 {
            return vec![];
        }

        let mp = |c: u8| -> i32 {
            match c {
                b'A' => 0,
                b'C' => 1,
                b'G' => 2,
                b'T' => 3,
                _ => 0,
            }
        };

        let mut cnt: HashMap<i32, i32> = HashMap::new();
        let mut res = Vec::new();
        let mut mask: i32 = 0;

        for i in 0..s.len() {
            mask = ((mask << 2) | mp(s[i])) & 0xFFFFF;

            if i >= 9 {
                let count = cnt.entry(mask).or_insert(0);
                *count += 1;
                if *count == 2 {
                    res.push(
                        String::from_utf8_lossy(&s[i - 9..=i]).into_owned(),
                    );
                }
            }
        }

        res
    }
}

Time & Space Complexity

Time complexity: $O(n)$
Space complexity: $O(n)$

Common Pitfalls

Adding Duplicates to the Result

When a sequence appears three or more times, it should only be added to the result once. Using a single set without tracking whether the sequence was already added to the result causes duplicates in the output.

Off-by-One Errors in Window Boundaries

The sliding window must extract exactly 10 characters. Common mistakes include using s[l:l+9] instead of s[l:l+10], or iterating with range(len(s) - 10) instead of range(len(s) - 9), which skips valid windows or causes index errors.

Incorrect Bit Mask Width in the Optimized Approach

When using bit manipulation, each nucleotide requires 2 bits, so a 10-character sequence needs 20 bits. Using the wrong mask value (such as 0xFFFF for 16 bits instead of 0xFFFFF for 20 bits) causes hash collisions between different sequences.