Binary Search Trees - Understanding BST operations for an optimized collision resolution approach
Bit Manipulation - Using bitwise operations (OR, AND, XOR) for space-efficient storage
1. Brute Force
Intuition
The simplest implementation uses a dynamic array to store all keys. For each operation, we search through the array linearly. This works correctly but is inefficient since every operation requires scanning potentially all stored elements.
Algorithm
Initialize an empty array data.
For add(key): If the key is not already in the array, append it.
For remove(key): If the key exists in the array, remove it.
For contains(key): Return true if the key exists in the array.
classMyHashSet{privatevar data:[Int]init(){
data =[]}funcadd(_ key:Int){if!contains(key){
data.append(key)}}funcremove(_ key:Int){iflet index = data.firstIndex(of: key){
data.remove(at: index)}}funccontains(_ key:Int)->Bool{return data.contains(key)}}
Time & Space Complexity
Time complexity: O(n) for each function call.
Space complexity: O(n)
2. Boolean Array
Intuition
Since keys are constrained to [0, 1000000], we can use direct addressing with a boolean array. The index represents the key, and the boolean value indicates presence. This provides O(1) operations but uses fixed memory regardless of how many keys are stored.
Algorithm
Initialize a boolean array of size 1000001, all set to false.
Space complexity: O(1000000) since the key is in the range [0,1000000].
3. Linked List
Intuition
To reduce memory while handling collisions, we use separate chaining. An array of buckets stores linked lists, and keys are assigned to buckets using a hash function. Each operation traverses only the linked list in the relevant bucket, making average-case operations faster than the brute force approach.
Algorithm
Initialize an array of 10000 buckets, each with a dummy head node.
Define hash(key) as key % 10000.
For add(key): Traverse the list at hash(key). If the key already exists, return. Otherwise, append a new node with the key.
For remove(key): Traverse the list at hash(key). If a node with the matching key is found, remove it by updating the previous node's next pointer.
For contains(key): Traverse the list at hash(key). Return true if the key is found, false otherwise.
publicclassListNode{publicint Key;publicListNode Next;publicListNode(int key){
Key = key;
Next =null;}}publicclassMyHashSet{privateListNode[]set;publicMyHashSet(){set=newListNode[10000];for(int i =0; i <set.Length; i++){set[i]=newListNode(0);// Dummy head}}publicvoidAdd(int key){ListNode cur =set[key %set.Length];while(cur.Next !=null){if(cur.Next.Key == key)return;
cur = cur.Next;}
cur.Next =newListNode(key);}publicvoidRemove(int key){ListNode cur =set[key %set.Length];while(cur.Next !=null){if(cur.Next.Key == key){
cur.Next = cur.Next.Next;return;}
cur = cur.Next;}}publicboolContains(int key){ListNode cur =set[key %set.Length];while(cur.Next !=null){if(cur.Next.Key == key)returntrue;
cur = cur.Next;}returnfalse;}}
type ListNode struct{
key int
next *ListNode
}type MyHashSet struct{
set []*ListNode
}funcConstructor() MyHashSet {
set :=make([]*ListNode,10000)for i :=range set {
set[i]=&ListNode{key:0}}return MyHashSet{set: set}}func(this *MyHashSet)hash(key int)int{return key %len(this.set)}func(this *MyHashSet)Add(key int){
cur := this.set[this.hash(key)]for cur.next !=nil{if cur.next.key == key {return}
cur = cur.next
}
cur.next =&ListNode{key: key}}func(this *MyHashSet)Remove(key int){
cur := this.set[this.hash(key)]for cur.next !=nil{if cur.next.key == key {
cur.next = cur.next.next
return}
cur = cur.next
}}func(this *MyHashSet)Contains(key int)bool{
cur := this.set[this.hash(key)]for cur.next !=nil{if cur.next.key == key {returntrue}
cur = cur.next
}returnfalse}
classListNode(var key: Int,var next: ListNode?=null)classMyHashSet(){privatevalset=Array(10000){ListNode(0)}privatefunhash(key: Int): Int = key %set.size
funadd(key: Int){var cur =set[hash(key)]while(cur.next !=null){if(cur.next!!.key == key)return
cur = cur.next!!}
cur.next =ListNode(key)}funremove(key: Int){var cur =set[hash(key)]while(cur.next !=null){if(cur.next!!.key == key){
cur.next = cur.next!!.next
return}
cur = cur.next!!}}funcontains(key: Int): Boolean {var cur =set[hash(key)]while(cur.next !=null){if(cur.next!!.key == key)returntrue
cur = cur.next!!}returnfalse}}
classListNode{var key:Intvar next:ListNode?init(_ key:Int){self.key = key
self.next =nil}}classMyHashSet{privatevarset:[ListNode]init(){set=(0..<10000).map {_inListNode(0)}}privatefunchash(_ key:Int)->Int{return key %set.count
}funcadd(_ key:Int){var cur =set[hash(key)]while cur.next !=nil{if cur.next!.key == key {return}
cur = cur.next!}
cur.next =ListNode(key)}funcremove(_ key:Int){var cur =set[hash(key)]while cur.next !=nil{if cur.next!.key == key {
cur.next = cur.next!.next
return}
cur = cur.next!}}funccontains(_ key:Int)->Bool{var cur =set[hash(key)]while cur.next !=nil{if cur.next!.key == key {returntrue}
cur = cur.next!}returnfalse}}
Time & Space Complexity
Time complexity: O(kn) for each function call.
Space complexity: O(k+m)
Where n is the number of keys, k is the size of the set (10000) and m is the number of unique keys.
4. Binary Search Tree
Intuition
Instead of linked lists for collision handling, we can use binary search trees (BSTs) in each bucket. This improves the worst-case time complexity from O(n/k) to O(log(n/k)) for each bucket, since BST operations are logarithmic in the number of nodes. The tradeoff is slightly more complex implementation.
Algorithm
Initialize an array of 10000 buckets, each containing an empty BST.
Define hash(key) as key % 10000.
For add(key): If the key is not already in the BST at hash(key), insert it using standard BST insertion.
For remove(key): Delete the key from the BST at hash(key) using standard BST deletion (finding in-order successor when needed).
For contains(key): Search the BST at hash(key) and return true if the key is found.
Time complexity: O(log(kn)) in average case, O(kn) in worst case for each function call.
Space complexity: O(k+m)
Where n is the number of keys, k is the size of the set (10000) and m is the number of unique keys.
5. Bit Manipulation
Intuition
We can compress the boolean array approach by using individual bits instead of booleans. Each integer stores 32 bits, so we need only about 31251 integers to cover 1000000+ keys. We use bit operations to set, clear, and check individual bits. This reduces memory usage by a factor of 32 compared to a boolean array.
Algorithm
Initialize an integer array of size 31251 (since 31251 * 32 = 1000032 covers all keys).
Define getMask(key) as 1 << (key % 32) to create a bitmask for the key's position within its integer.
For add(key): Set the bit using set[key / 32] |= getMask(key).
For remove(key): If the key exists, toggle the bit using set[key / 32] ^= getMask(key).
For contains(key): Return true if set[key / 32] & getMask(key) is non-zero.
publicclassMyHashSet{privateint[]set;publicMyHashSet(){// key is in the range [1, 1000000]// 31251 * 32 = 1000032set=newint[31251];}publicvoidAdd(int key){set[key /32]|=GetMask(key);}publicvoidRemove(int key){if(Contains(key)){set[key /32]^=GetMask(key);}}publicboolContains(int key){return(set[key /32]&GetMask(key))!=0;}privateintGetMask(int key){return1<<(key %32);}}
type MyHashSet struct{
set []int}funcConstructor() MyHashSet {// key is in the range [1, 1000000]// 31251 * 32 = 1000032return MyHashSet{set:make([]int,31251)}}func(this *MyHashSet)getMask(key int)int{return1<<(key %32)}func(this *MyHashSet)Add(key int){
this.set[key/32]|= this.getMask(key)}func(this *MyHashSet)Remove(key int){if this.Contains(key){
this.set[key/32]^= this.getMask(key)}}func(this *MyHashSet)Contains(key int)bool{return(this.set[key/32]& this.getMask(key))!=0}
classMyHashSet(){// key is in the range [1, 1000000]// 31251 * 32 = 1000032privatevalset=IntArray(31251)privatefungetMask(key: Int): Int =1shl(key %32)funadd(key: Int){set[key /32]=set[key /32]orgetMask(key)}funremove(key: Int){if(contains(key)){set[key /32]=set[key /32]xorgetMask(key)}}funcontains(key: Int): Boolean {return(set[key /32]andgetMask(key))!=0}}
classMyHashSet{// key is in the range [1, 1000000]// 31251 * 32 = 1000032privatevarset:[Int]init(){set=[Int](repeating:0, count:31251)}privatefuncgetMask(_ key:Int)->Int{return1<<(key %32)}funcadd(_ key:Int){set[key /32]|=getMask(key)}funcremove(_ key:Int){ifcontains(key){set[key /32]^=getMask(key)}}funccontains(_ key:Int)->Bool{return(set[key /32]&getMask(key))!=0}}
Time & Space Complexity
Time complexity: O(1) for each function call.
Space complexity: O(k)
Where k is the size of the set (31251).
Common Pitfalls
Adding Duplicate Keys
The add() operation should be idempotent - adding an existing key should have no effect. A common mistake is not checking for duplicates before insertion.
# Wrong - allows duplicatesdefadd(self, key:int)->None:
cur = self.set[key %len(self.set)]while cur.next:
cur = cur.next
cur.next= ListNode(key)# Always adds, even if exists!# Correct - check for existence firstdefadd(self, key:int)->None:
cur = self.set[key %len(self.set)]while cur.next:if cur.next.key == key:return# Already exists
cur = cur.next
cur.next= ListNode(key)
Using XOR for Remove Without Checking Existence
In the bit manipulation approach, using XOR to remove a key that doesn't exist will incorrectly add it instead.
# Wrong - XOR toggles the bit regardlessdefremove(self, key:int)->None:
self.set[key //32]^= self.getMask(key)# Adds if not present!# Correct - only toggle if presentdefremove(self, key:int)->None:if self.contains(key):
self.set[key //32]^= self.getMask(key)