You are given a list of accounts, where each element accounts[i] is a list of strings: the first element accounts[i][0] is a name, and the remaining elements are the emails of that account.
Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.
After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order. The accounts themselves can be returned in any order.
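Because the merged accounts may be returned in any order, comparing a result against an expected answer is easier after putting both into one canonical order. A minimal sketch, assuming a hypothetical helper named `canonical` (it is not part of the problem statement):

```python
from typing import List


def canonical(accounts: List[List[str]]) -> List[List[str]]:
    """Return accounts in a canonical order so two answers can be compared.

    `canonical` is a hypothetical helper introduced for illustration only.
    """
    # Keep the name first, sort the emails of each account,
    # then sort the accounts themselves.
    return sorted([acc[0]] + sorted(acc[1:]) for acc in accounts)


expected = [
    ["neet", "bob@gmail.com", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]
# The same answer with the accounts listed in a different order.
shuffled = [expected[2], expected[0], expected[1]]
print(canonical(shuffled) == canonical(expected))  # True
```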
Example 1:
Input: accounts = [
["neet","neet@gmail.com","neet_dsa@gmail.com"],
["alice","alice@gmail.com"],
["neet","bob@gmail.com","neet@gmail.com"],
["neet","neetcode@gmail.com"]
]
Output: [["neet","bob@gmail.com","neet@gmail.com","neet_dsa@gmail.com"],["alice","alice@gmail.com"],["neet","neetcode@gmail.com"]]
Example 2:
Input: accounts = [
["James","james@mail.com"],
["James","james@mail.co"]
]
Output: [["James","james@mail.com"],["James","james@mail.co"]]
Constraints:
1 <= accounts.length <= 1000
2 <= accounts[i].length <= 10
1 <= accounts[i][j].length <= 30
accounts[i][0] consists of English letters.
accounts[i][j] (for j > 0) is a valid email.
Before attempting this problem, you should be comfortable with graph traversal (DFS and BFS) and Union-Find, which the solutions below rely on.
This is a graph connectivity problem in disguise. If two accounts share an email, they belong to the same person and should be merged. We can model this as a graph where emails are nodes, and emails within the same account are connected by edges. Finding all emails belonging to one person then becomes finding all nodes in a connected component. DFS naturally explores an entire component, collecting every connected email along the way.
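To make the graph model concrete, here is a small sketch that builds the email graph for Example 1 with emails as dictionary keys. The function name `build_email_graph` is an assumption for illustration; it chains consecutive emails of an account, which is enough to connect all of them.

```python
from typing import Dict, List, Set


def build_email_graph(accounts: List[List[str]]) -> Dict[str, Set[str]]:
    """Map each email to the set of emails it shares an edge with."""
    graph: Dict[str, Set[str]] = {}
    for account in accounts:
        emails = account[1:]
        # Every email becomes a node, even if it ends up with no edges.
        for email in emails:
            graph.setdefault(email, set())
        # Chain consecutive emails; a chain connects the whole account.
        for prev, cur in zip(emails, emails[1:]):
            graph[prev].add(cur)
            graph[cur].add(prev)
    return graph


accounts = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]
graph = build_email_graph(accounts)
# neet@gmail.com appears in two accounts, so it links their emails together.
print(sorted(graph["neet@gmail.com"]))  # ['bob@gmail.com', 'neet_dsa@gmail.com']
```

The shared email is the only bridge between the first and third accounts, which is why they end up in one connected component.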
For each unvisited email, run DFS to collect all emails in that connected component.

from collections import defaultdict
from typing import List


class Solution:
    def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
        n = len(accounts)
        emailIdx = {}  # email -> id
        emails = []  # id -> email (unique emails across all accounts)
        emailToAcc = {}  # email id -> id of the first account containing it
        m = 0

        # Assign an id to every distinct email.
        for accId, a in enumerate(accounts):
            for i in range(1, len(a)):
                email = a[i]
                if email in emailIdx:
                    continue
                emails.append(email)
                emailIdx[email] = m
                emailToAcc[m] = accId
                m += 1

        # Connect consecutive emails within each account.
        adj = [[] for _ in range(m)]
        for a in accounts:
            for i in range(2, len(a)):
                id1 = emailIdx[a[i]]
                id2 = emailIdx[a[i - 1]]
                adj[id1].append(id2)
                adj[id2].append(id1)

        emailGroup = defaultdict(list)  # index of acc -> list of emails
        visited = [False] * m

        def dfs(node, accId):
            visited[node] = True
            emailGroup[accId].append(emails[node])
            for nei in adj[node]:
                if not visited[nei]:
                    dfs(nei, accId)

        for i in range(m):
            if not visited[i]:
                dfs(i, emailToAcc[i])

        res = []
        for accId in emailGroup:
            name = accounts[accId][0]
            res.append([name] + sorted(emailGroup[accId]))
        return res

Where n is the number of accounts and m is the number of emails.
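A recursive dfs can go as deep as the number of emails in one component, which may exceed CPython's default recursion limit (1000 unless changed) on long chains of linked accounts. The sketch below is a variant, not the solution above: it swaps the recursion for an explicit stack and keys the graph by email string. The name `accounts_merge_iterative` is hypothetical.

```python
from typing import Dict, List


def accounts_merge_iterative(accounts: List[List[str]]) -> List[List[str]]:
    graph: Dict[str, List[str]] = {}  # email -> neighbouring emails
    owner: Dict[str, str] = {}        # email -> name of the first account seen
    for acc in accounts:
        name, emails = acc[0], acc[1:]
        for email in emails:
            graph.setdefault(email, [])
            owner.setdefault(email, name)
        for prev, cur in zip(emails, emails[1:]):
            graph[prev].append(cur)
            graph[cur].append(prev)

    visited = set()
    res = []
    for start in graph:
        if start in visited:
            continue
        # Explicit stack instead of recursive calls.
        visited.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for nei in graph[node]:
                if nei not in visited:
                    visited.add(nei)
                    stack.append(nei)
        res.append([owner[start]] + sorted(component))
    return res


example = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]
for merged in accounts_merge_iterative(example):
    print(merged)
```

The order in which emails are visited differs from the recursive version, but each component is sorted before it is returned, so the merged accounts are the same.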
BFS provides an alternative way to explore connected components. Starting from any unvisited email, we use a queue to visit all reachable emails level by level. Each email we dequeue is added to the current component, and its unvisited neighbors are enqueued. The result is the same as with DFS, but BFS iterates with a queue instead of recursing.
For each unvisited email, run BFS to collect all emails in that connected component:

from collections import defaultdict, deque
from typing import List


class Solution:
    def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
        n = len(accounts)
        emailIdx = {}  # email -> id
        emails = []  # id -> email (unique emails across all accounts)
        emailToAcc = {}  # email id -> id of the first account containing it
        m = 0

        # Assign an id to every distinct email.
        for accId, a in enumerate(accounts):
            for i in range(1, len(a)):
                email = a[i]
                if email in emailIdx:
                    continue
                emails.append(email)
                emailIdx[email] = m
                emailToAcc[m] = accId
                m += 1

        # Connect consecutive emails within each account.
        adj = [[] for _ in range(m)]
        for a in accounts:
            for i in range(2, len(a)):
                id1 = emailIdx[a[i]]
                id2 = emailIdx[a[i - 1]]
                adj[id1].append(id2)
                adj[id2].append(id1)

        emailGroup = defaultdict(list)  # index of acc -> list of emails
        visited = [False] * m

        def bfs(start, accId):
            queue = deque([start])
            visited[start] = True
            while queue:
                node = queue.popleft()
                emailGroup[accId].append(emails[node])
                for nei in adj[node]:
                    if not visited[nei]:
                        visited[nei] = True
                        queue.append(nei)

        for i in range(m):
            if not visited[i]:
                bfs(i, emailToAcc[i])

        res = []
        for accId in emailGroup:
            name = accounts[accId][0]
            res.append([name] + sorted(emailGroup[accId]))
        return res

Where n is the number of accounts and m is the number of emails.
Union-Find (Disjoint Set Union) is designed for exactly this type of problem: grouping elements into disjoint sets and merging sets efficiently. Instead of building a graph and traversing it, we assign each account an ID and union accounts that share an email. When we see an email for the first time, we record which account it belongs to. If we see it again, we union the current account with the one that first owned it. After processing all accounts, we group emails by their account's root representative.
from collections import defaultdict
from typing import List


class UnionFind:
    def __init__(self, n):
        self.par = [i for i in range(n)]
        self.rank = [1] * n  # size of the set rooted at each index

    def find(self, x):
        # Path compression: point each visited node at its grandparent.
        while x != self.par[x]:
            self.par[x] = self.par[self.par[x]]
            x = self.par[x]
        return x

    def union(self, x1, x2):
        p1, p2 = self.find(x1), self.find(x2)
        if p1 == p2:
            return False
        # Attach the smaller set under the larger one.
        if self.rank[p1] > self.rank[p2]:
            self.par[p2] = p1
            self.rank[p1] += self.rank[p2]
        else:
            self.par[p1] = p2
            self.rank[p2] += self.rank[p1]
        return True


class Solution:
    def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
        uf = UnionFind(len(accounts))
        emailToAcc = {}  # email -> index of acc

        for i, a in enumerate(accounts):
            for e in a[1:]:
                if e in emailToAcc:
                    uf.union(i, emailToAcc[e])
                else:
                    emailToAcc[e] = i

        emailGroup = defaultdict(list)  # index of acc -> list of emails
        for e, i in emailToAcc.items():
            leader = uf.find(i)
            emailGroup[leader].append(e)

        res = []
        for i, emails in emailGroup.items():
            name = accounts[i][0]
            res.append([name] + sorted(emails))
        return res

Where n is the number of accounts and m is the number of emails.
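To see the first-owner rule in action, the following sketch traces Example 1 with a stripped-down Union-Find. It is an illustration under simplifying assumptions: it uses a bare `parent` list with path compression and skips union by size, and the names `parent`, `find`, and `first_owner` are chosen here for the example.

```python
accounts = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]

parent = list(range(len(accounts)))  # each account starts as its own root


def find(x: int) -> int:
    # Follow parent pointers to the root, compressing the path on the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


first_owner = {}  # email -> index of the first account that listed it
for i, acc in enumerate(accounts):
    for email in acc[1:]:
        if email in first_owner:
            # Seen before: merge this account with the first owner's set.
            parent[find(i)] = find(first_owner[email])
        else:
            first_owner[email] = i

roots = [find(i) for i in range(len(accounts))]
print(roots)  # [0, 1, 0, 3]
```

Accounts 0 and 2 end up under the same root because both list neet@gmail.com; accounts 1 and 3 share no email with anyone and stay separate.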
Two accounts with the same name are NOT necessarily the same person. They are only the same person if they share at least one email. Merging accounts based solely on name will produce incorrect results.
# Wrong: merging by name
if accounts[i][0] == accounts[j][0]:
    merge(i, j)

# Correct: merge only when emails overlap
if email in emailToAccount:
    union(i, emailToAccount[email])

The problem requires the emails in each merged account to be sorted in lexicographical order. Forgetting this step produces emails in arbitrary order, which may fail validation.
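A short sketch of the sorting step, using the merged group from Example 1. Only the emails are sorted; the name is prepended afterwards. Python's `sorted` compares strings by code point, so `@` sorts before `_`.

```python
group = ["neet_dsa@gmail.com", "neet@gmail.com", "bob@gmail.com"]

# Sort the emails only, then put the name in front.
merged = ["neet"] + sorted(group)
print(merged)
# ['neet', 'bob@gmail.com', 'neet@gmail.com', 'neet_dsa@gmail.com']

# Sorting the name together with the emails can push it out of first place.
wrong = sorted(["neet"] + group)
print(wrong[0])  # bob@gmail.com
```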
When merging accounts, you must use the name from one of the original accounts in the merged group. A common mistake is losing track of which account's name to use, especially in Union-Find where you need to get the name from the representative's original account.
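As a small illustration of the name lookup, the sketch below assumes a `roots` list giving each account's representative (the values are an assumption matching Example 1, as a Union-Find might report them) and reads the name from the representative's original account.

```python
accounts = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]
# Assumed representatives: accounts 0 and 2 were merged under root 0.
roots = [0, 1, 0, 3]

# Index into `accounts` with the root, not with an email id or a loop counter.
group_names = {root: accounts[root][0] for root in set(roots)}
print(group_names[0], group_names[1], group_names[3])  # neet alice neet

# Every account in a group carries the same name as its representative.
for i, root in enumerate(roots):
    assert accounts[i][0] == group_names[root]
```

Since all accounts of one person share a name, any member of the group would give the same answer; using the root is simply the index that is already at hand.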