The most straightforward way to find duplicate subtrees is to compare every subtree with every other subtree. We first collect all nodes using DFS, then for each pair of nodes, we check if their subtrees are structurally identical with matching values. When we find a match, we add one representative to the result and mark both as seen to avoid duplicates.
# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right
from typing import List, Optional

class Solution:
    def findDuplicateSubtrees(self, root: Optional[TreeNode]) -> List[Optional[TreeNode]]:
        def same(node1, node2):
            # Two subtrees match iff both are empty, or both are non-empty
            # with equal values and structurally identical children.
            if not node1 and not node2:
                return True
            if not node1 or not node2:
                return False
            return (node1.val == node2.val and
                    same(node1.left, node2.left) and
                    same(node1.right, node2.right))

        subTree = []
        def dfs(root):
            # Collect every node; each node is the root of one subtree.
            if not root:
                return
            subTree.append(root)
            dfs(root.left)
            dfs(root.right)
        dfs(root)

        res = []
        seen = set()
        for i in range(len(subTree)):
            if subTree[i] in seen:
                continue
            for j in range(i + 1, len(subTree)):
                if subTree[j] in seen:
                    continue
                if same(subTree[i], subTree[j]):
                    # Append the representative only on the first match.
                    if subTree[i] not in seen:
                        res.append(subTree[i])
                        seen.add(subTree[i])
                    seen.add(subTree[j])
        return res

Instead of comparing subtrees pairwise, we can serialize each subtree into a unique string representation. Two subtrees are identical if and only if they produce the same serialization. By storing these strings in a hash map, we can detect duplicates in a single pass through the tree. When we see the same serialization for the second time, we know we have found a duplicate.
Serialize each subtree as "value,left_serialization,right_serialization", using "null" for empty nodes. When a serialization is seen for the second time (its count reaches 2), add the current node to the result.

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right
from collections import defaultdict
from typing import List, Optional

class Solution:
    def findDuplicateSubtrees(self, root: Optional[TreeNode]) -> List[Optional[TreeNode]]:
        subtrees = defaultdict(list)  # serialization -> nodes with that subtree
        res = []
        def dfs(node):
            if not node:
                return "null"
            s = ",".join([str(node.val), dfs(node.left), dfs(node.right)])
            # Append a representative exactly once, on the second occurrence.
            if len(subtrees[s]) == 1:
                res.append(node)
            subtrees[s].append(node)
            return s
        dfs(root)
        return res

The serialization approach has quadratic space complexity because string concatenation creates long strings for large trees. We can optimize by assigning a unique integer ID to each distinct subtree structure. Instead of storing full serialization strings, we represent each subtree as a tuple of (left_id, value, right_id) and map this tuple to a unique ID. This keeps the key size constant regardless of subtree depth.
Use -1 as the ID for null nodes, and key each subtree by the tuple (left_child_id, node_value, right_child_id).

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right
from collections import defaultdict
from typing import List, Optional

class Solution:
    def findDuplicateSubtrees(self, root: Optional[TreeNode]) -> List[Optional[TreeNode]]:
        id_map = {}               # (left_id, val, right_id) -> unique subtree ID
        count = defaultdict(int)  # subtree ID -> occurrences seen so far
        res = []
        def dfs(node):
            if not node:
                return -1
            cur = (dfs(node.left), node.val, dfs(node.right))
            if cur not in id_map:
                id_map[cur] = len(id_map) + 1
            curId = id_map[cur]
            # Append a representative exactly once, on the second occurrence.
            if count[curId] == 1:
                res.append(node)
            count[curId] += 1
            return curId
        dfs(root)
        return res

A serialization like "1,2,null" is ambiguous without structure markers: the value 2 could be a left child or a right child. Always use a consistent format that distinguishes left from right, such as "value,left_serialization,right_serialization" with explicit null markers for missing children.
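To see the ambiguity concretely, here is a small sketch (the helper names `naive` and `explicit` are illustrative, not part of the solutions above) comparing a preorder serialization without null markers against the explicit format:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def naive(node):
    # Preorder without null markers: structure is lost.
    if not node:
        return ""
    parts = [str(node.val), naive(node.left), naive(node.right)]
    return ",".join(p for p in parts if p)

def explicit(node):
    # Explicit "null" markers distinguish left from right.
    if not node:
        return "null"
    return ",".join([str(node.val), explicit(node.left), explicit(node.right)])

left_child = TreeNode(1, left=TreeNode(2))    # 2 is a left child
right_child = TreeNode(1, right=TreeNode(2))  # 2 is a right child

print(naive(left_child) == naive(right_child))        # True: false duplicate
print(explicit(left_child))                           # 1,2,null,null,null
print(explicit(right_child))                          # 1,null,2,null,null
```

Both trees naively serialize to "1,2", so they would be wrongly reported as duplicates; the explicit format keeps them distinct.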
When the same subtree structure appears three or more times, you should only add one representative to the result list. Track how many times each serialization has been seen and only add to the result on the second occurrence, not subsequent ones.
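That bookkeeping can be sketched in isolation (the function name is illustrative): append a key to the result only when its count reaches exactly 2, so a key seen three or more times still yields one representative.

```python
from collections import defaultdict

def representatives(keys):
    # Report each duplicated key exactly once, at its second occurrence.
    count = defaultdict(int)
    res = []
    for k in keys:
        count[k] += 1
        if count[k] == 2:
            res.append(k)
    return res

# "a" appears three times but is reported only once.
print(representatives(["a", "b", "a", "a", "c", "b"]))  # ['a', 'b']
```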
Naive string serialization creates strings proportional to subtree size, leading to O(n^2) total space for all serializations. The optimized approach assigns unique integer IDs to each subtree structure, keeping the representation size constant regardless of subtree depth.
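A quick illustration of the difference, assuming a degenerate right-skewed tree (the helper names are ours, mirroring the two approaches above): the string key for the root grows with subtree depth, while every tuple key used for ID assignment has exactly three fields.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Build a right-skewed tree of depth n.
n = 100
root = None
for v in range(n):
    root = TreeNode(v, right=root)

def serialize(node):
    # String keys: length grows with subtree size.
    if not node:
        return "null"
    return ",".join([str(node.val), serialize(node.left), serialize(node.right)])

id_map = {}
def subtree_id(node):
    # Tuple keys: always (left_id, val, right_id), regardless of depth.
    if not node:
        return -1
    key = (subtree_id(node.left), node.val, subtree_id(node.right))
    if key not in id_map:
        id_map[key] = len(id_map) + 1
    return id_map[key]

subtree_id(root)
print(len(serialize(root)))             # grows roughly linearly with n
print(max(len(k) for k in id_map))      # 3
```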