721. Accounts Merge - Explanation

Description

You are given a list of accounts, where each element accounts[i] is a list of strings: the first element accounts[i][0] is a name, and the remaining elements are emails belonging to that account.

Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.

After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order. The accounts themselves can be returned in any order.

Example 1:

Input: accounts = [
    ["neet","neet@gmail.com","neet_dsa@gmail.com"],
    ["alice","alice@gmail.com"],
    ["neet","bob@gmail.com","neet@gmail.com"],
    ["neet","neetcode@gmail.com"]
]

Output: [["neet","bob@gmail.com","neet@gmail.com","neet_dsa@gmail.com"],["alice","alice@gmail.com"],["neet","neetcode@gmail.com"]]

Example 2:

Input: accounts = [
    ["James","james@mail.com"],
    ["James","james@mail.co"]
]

Output: [["James","james@mail.com"],["James","james@mail.co"]]
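
Because the merged accounts may be returned in any order, comparing a result directly against the expected output can fail even when the answer is correct. A minimal sketch of an order-insensitive check, assuming a hypothetical helper named same_accounts (it is not part of the problem):

```python
from typing import List

def same_accounts(actual: List[List[str]], expected: List[List[str]]) -> bool:
    # Sorting the outer lists ignores the order of the accounts; the inner
    # order (name first, then sorted emails) is still compared exactly.
    return sorted(actual) == sorted(expected)

# Example 2 with the two accounts swapped
expected = [["James", "james@mail.com"], ["James", "james@mail.co"]]
actual = [["James", "james@mail.co"], ["James", "james@mail.com"]]
print(same_accounts(actual, expected))  # True
```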

Constraints:

  • 1 <= accounts.length <= 1000
  • 2 <= accounts[i].length <= 10
  • 1 <= accounts[i][j].length <= 30
  • accounts[i][0] consists of English letters.
  • accounts[i][j] (for j > 0) is a valid email.


Prerequisites

Before attempting this problem, you should be comfortable with:

  • Hash Maps - Using dictionaries to map keys to values and track relationships
  • Graph Representation - Building adjacency lists from problem constraints
  • DFS/BFS Traversal - Exploring all nodes in a connected component
  • Union-Find (Disjoint Set Union) - Understanding how to efficiently merge and query sets (for optimal solution)
  • Connected Components - Recognizing when a problem is about grouping related elements

1. Depth First Search

Intuition

This is a graph connectivity problem in disguise. If two accounts share an email, they belong to the same person and should be merged. We can model this as a graph where emails are nodes, and emails within the same account are connected by edges. Finding all emails belonging to one person then becomes finding all nodes in a connected component. DFS naturally explores an entire component, collecting every connected email along the way.
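
To make the graph model concrete, here is a small sketch that builds the email graph for Example 1 and collects one connected component. It keys the graph by email string rather than by integer index, and the names build_graph and collect are hypothetical helpers chosen for illustration.

```python
accounts = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]

def build_graph(accounts):
    graph = {}  # email -> set of neighboring emails
    for account in accounts:
        emails = account[1:]
        for email in emails:
            graph.setdefault(email, set())  # single-email accounts still get a node
        # Linking consecutive emails is enough to connect the whole account.
        for prev, cur in zip(emails, emails[1:]):
            graph[prev].add(cur)
            graph[cur].add(prev)
    return graph

def collect(graph, node, seen):
    # Recursive DFS: gather every email reachable from node.
    seen.add(node)
    for nei in graph[node]:
        if nei not in seen:
            collect(graph, nei, seen)
    return seen

graph = build_graph(accounts)
print(sorted(collect(graph, "bob@gmail.com", set())))
# ['bob@gmail.com', 'neet@gmail.com', 'neet_dsa@gmail.com']
```

Starting from bob@gmail.com, the traversal reaches neet_dsa@gmail.com through the shared email neet@gmail.com, even though the two never appear in the same account.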

Algorithm

  1. Assign a unique index to each email and track which account it first appeared in.
  2. Build an adjacency list connecting consecutive emails within each account.
  3. For each unvisited email, run DFS to collect all emails in that connected component.
  4. Group the collected emails by the account index of the starting email.
  5. For each group, sort the emails and prepend the account name.
  6. Return the merged accounts.

from collections import defaultdict
from typing import List

class Solution:
    def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
        emailIdx = {}  # email -> id
        emails = []  # id -> email (each unique email stored once)
        emailToAcc = {}  # email id -> index of the first account containing it

        m = 0
        for accId, a in enumerate(accounts):
            for i in range(1, len(a)):
                email = a[i]
                if email in emailIdx:
                    continue
                emails.append(email)
                emailIdx[email] = m
                emailToAcc[m] = accId
                m += 1

        adj = [[] for _ in range(m)]
        for a in accounts:
            for i in range(2, len(a)):
                id1 = emailIdx[a[i]]
                id2 = emailIdx[a[i - 1]]
                adj[id1].append(id2)
                adj[id2].append(id1)

        emailGroup = defaultdict(list) # index of acc -> list of emails
        visited = [False] * m
        def dfs(node, accId):
            visited[node] = True
            emailGroup[accId].append(emails[node])
            for nei in adj[node]:
                if not visited[nei]:
                    dfs(nei, accId)

        for i in range(m):
            if not visited[i]:
                dfs(i, emailToAcc[i])

        res = []
        for accId in emailGroup:
            name = accounts[accId][0]
            res.append([name] + sorted(emailGroup[accId]))

        return res
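
One caveat with the recursive version: when many accounts chain together through shared emails, the recursion can become deeper than CPython's default recursion limit (1000 frames unless changed). A possible workaround, sketched here under the assumption that adj and visited have the same shape as in the solution above, is an explicit stack. The name collect_component is hypothetical.

```python
from typing import List

def collect_component(adj: List[List[int]], start: int, visited: List[bool]) -> List[int]:
    # Iterative DFS: an explicit stack replaces the call stack.
    visited[start] = True
    stack = [start]
    component = []
    while stack:
        node = stack.pop()
        component.append(node)
        for nei in adj[node]:
            if not visited[nei]:
                visited[nei] = True
                stack.append(nei)
    return component

# A path graph 0 - 1 - ... - 4999, deeper than the default recursion limit
size = 5000
adj = [[] for _ in range(size)]
for i in range(1, size):
    adj[i].append(i - 1)
    adj[i - 1].append(i)

visited = [False] * size
print(len(collect_component(adj, 0, visited)))  # 5000
```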

Time & Space Complexity

  • Time complexity: O((n * m) log(n * m))
  • Space complexity: O(n * m)

Where n is the number of accounts and m is the maximum number of emails in a single account.


2. Breadth First Search

Intuition

BFS provides an alternative way to explore connected components. Starting from any unvisited email, we use a queue to visit all reachable emails level by level. Each email we dequeue is added to the current component, and its unvisited neighbors are enqueued. The result is the same as with DFS, but BFS iterates with a queue instead of recursing.

Algorithm

  1. Assign a unique index to each email and track which account it first appeared in.
  2. Build an adjacency list connecting consecutive emails within each account.
  3. For each unvisited email, start BFS:
    • Initialize a queue with the starting email and mark it visited.
    • While the queue is not empty, dequeue an email, add it to the current group, and enqueue its unvisited neighbors.
  4. Group the collected emails by the account index of the starting email.
  5. For each group, sort the emails and prepend the account name.
  6. Return the merged accounts.

from collections import defaultdict, deque
from typing import List

class Solution:
    def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
        emailIdx = {}  # email -> id
        emails = []  # id -> email (each unique email stored once)
        emailToAcc = {}  # email id -> index of the first account containing it

        m = 0
        for accId, a in enumerate(accounts):
            for i in range(1, len(a)):
                email = a[i]
                if email in emailIdx:
                    continue
                emails.append(email)
                emailIdx[email] = m
                emailToAcc[m] = accId
                m += 1

        adj = [[] for _ in range(m)]
        for a in accounts:
            for i in range(2, len(a)):
                id1 = emailIdx[a[i]]
                id2 = emailIdx[a[i - 1]]
                adj[id1].append(id2)
                adj[id2].append(id1)

        emailGroup = defaultdict(list) # index of acc -> list of emails
        visited = [False] * m

        def bfs(start, accId):
            queue = deque([start])
            visited[start] = True
            while queue:
                node = queue.popleft()
                emailGroup[accId].append(emails[node])
                for nei in adj[node]:
                    if not visited[nei]:
                        visited[nei] = True
                        queue.append(nei)

        for i in range(m):
            if not visited[i]:
                bfs(i, emailToAcc[i])

        res = []
        for accId in emailGroup:
            name = accounts[accId][0]
            res.append([name] + sorted(emailGroup[accId]))

        return res

Time & Space Complexity

  • Time complexity: O((n * m) log(n * m))
  • Space complexity: O(n * m)

Where n is the number of accounts and m is the maximum number of emails in a single account.


3. Disjoint Set Union

Intuition

Union-Find (Disjoint Set Union) is designed for exactly this type of problem: grouping elements into disjoint sets and merging sets efficiently. Instead of building a graph and traversing it, we assign each account an ID and union accounts that share an email. When we see an email for the first time, we record which account it belongs to. If we see it again, we union the current account with the one that first owned it. After processing all accounts, we group emails by their account's root representative.
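
As a rough illustration of this idea, the sketch below traces the unions on Example 1 and prints the root of every account. It uses a bare parent list with path compression and no rank, and the names parent, owner, and roots are assumptions made for this example only.

```python
accounts = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]

parent = list(range(len(accounts)))  # every account starts as its own root

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

owner = {}  # email -> index of the first account that contained it
for i, account in enumerate(accounts):
    for email in account[1:]:
        if email in owner:
            # Shared email: attach this account's set to the owner's set.
            parent[find(i)] = find(owner[email])
        else:
            owner[email] = i

roots = [find(i) for i in range(len(accounts))]
print(roots)  # [0, 1, 0, 3]
```

Accounts 0 and 2 end up with the same root because both contain neet@gmail.com, while account 3 stays separate despite having the same name.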

Algorithm

  1. Initialize Union-Find with one node per account.
  2. Create a map from email to the first account index that contains it.
  3. For each account's emails:
    • If the email was seen before, union the current account with the previous owner.
    • Otherwise, record the current account as the owner.
  4. For each email, find the root of its owning account and group emails by root.
  5. For each group, sort the emails and prepend the account name.
  6. Return the merged accounts.

from collections import defaultdict
from typing import List

class UnionFind:
    def __init__(self, n):
        self.par = list(range(n))
        self.rank = [1] * n  # size of the set rooted at each node

    def find(self, x):
        while x != self.par[x]:
            self.par[x] = self.par[self.par[x]]
            x = self.par[x]
        return x

    def union(self, x1, x2):
        p1, p2 = self.find(x1), self.find(x2)
        if p1 == p2:
            return False
        if self.rank[p1] > self.rank[p2]:
            self.par[p2] = p1
            self.rank[p1] += self.rank[p2]
        else:
            self.par[p1] = p2
            self.rank[p2] += self.rank[p1]
        return True

class Solution:
    def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
        uf = UnionFind(len(accounts))
        emailToAcc = {}  # email -> index of acc

        for i, a in enumerate(accounts):
            for e in a[1:]:
                if e in emailToAcc:
                    uf.union(i, emailToAcc[e])
                else:
                    emailToAcc[e] = i

        emailGroup = defaultdict(list)  # index of acc -> list of emails
        for e, i in emailToAcc.items():
            leader = uf.find(i)
            emailGroup[leader].append(e)

        res = []
        for leader, emails in emailGroup.items():
            name = accounts[leader][0]
            res.append([name] + sorted(emails))
        return res

Time & Space Complexity

  • Time complexity: O((n * m) log(n * m))
  • Space complexity: O(n * m)

Where n is the number of accounts and m is the maximum number of emails in a single account.


Common Pitfalls

Treating Same Name as Same Person

Two accounts with the same name are NOT necessarily the same person. They are only the same person if they share at least one email. Merging accounts based solely on name will produce incorrect results.

# Wrong: merging by name
if accounts[i][0] == accounts[j][0]:
    merge(i, j)
# Correct: merge only when emails overlap
if email in emailToAccount:
    union(i, emailToAccount[email])

Forgetting to Sort Emails in Output

The problem requires emails in each merged account to be sorted in lexicographical order. Forgetting this step will produce results in arbitrary order that may fail validation.
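
A short sketch of the final step, using the merged emails from Example 1. Note that only the emails are sorted; the name is prepended afterwards so it always stays in the first position.

```python
emails = ["neet_dsa@gmail.com", "neet@gmail.com", "bob@gmail.com"]

# Sort the emails only, then put the name in front.
merged = ["neet"] + sorted(emails)
print(merged)
# ['neet', 'bob@gmail.com', 'neet@gmail.com', 'neet_dsa@gmail.com']
```

Python compares strings by character code, so neet@gmail.com sorts before neet_dsa@gmail.com because '@' comes before '_'.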

Using Wrong Account Name After Merging

When merging accounts, you must use the name from one of the original accounts in the merged group. A common mistake is losing track of which account's name to use, especially in Union-Find where you need to get the name from the representative's original account.
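
One way to keep the name consistent, sketched below, is to key every group by the root account index and read the name from accounts[root]. The roots list here is assumed to be the result of the union step on Example 1 and is hard-coded for illustration.

```python
accounts = [
    ["neet", "neet@gmail.com", "neet_dsa@gmail.com"],
    ["alice", "alice@gmail.com"],
    ["neet", "bob@gmail.com", "neet@gmail.com"],
    ["neet", "neetcode@gmail.com"],
]
roots = [0, 1, 0, 3]  # assumed: root account index for each account

groups = {}  # root account index -> set of emails in that group
for i, account in enumerate(accounts):
    groups.setdefault(roots[i], set()).update(account[1:])

# The name always comes from the root's original account.
res = [[accounts[root][0]] + sorted(emails) for root, emails in groups.items()]
print(res)
```

Every account in a group has the same name by the problem's guarantee, so reading it from the root is safe.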