You are given a 2-D integer array points, where points[i] = [xi, yi]. Each points[i] represents a distinct point on a 2-D plane.
The cost of connecting two points [xi, yi] and [xj, yj] is the manhattan distance between the two points, i.e. |xi - xj| + |yi - yj|.
Return the minimum cost to connect all points together, such that there exists exactly one path between each pair of points.
Example 1:
Input: points = [[0,0],[2,2],[3,3],[2,4],[4,2]]
Output: 10Constraints:
1 <= points.length <= 1000-1,000,000 <= xi, yi <= 1,000,000(xi, yi) are distinct.
You should aim for a solution as good or better than O((n^2)logn) time and O(n^2) space, where n is the number of points.
Think of this problem as a graph, where the given points represent nodes. We need to connect these nodes into a single component by creating edges. Can you think of an advanced graph algorithm that can be used to connect all points into one component?
We use Kruskal's algorithm along with Union-Find (DSU) to connect nodes into components. The final component forms the minimum spanning tree (MST), where the edges between nodes are weighted by the Manhattan distance, and the total weight of the tree is minimized. How would you implement this?
We create the possible edges by iterating through every pair of points and calculating the weights as the Manhattan distance between them. Next, we sort the edges in ascending order based on their weights, as we aim to minimize the cost. Then, we traverse through these edges, connecting the nodes and adding the weight of the edge to the total cost if the edge is successfully added. The final result will be the minimum cost.
Before attempting this problem, you should be comfortable with:
We want to connect all points such that the total cost is minimum, where the cost to connect two points is their Manhattan distance.
This is exactly the definition of a Minimum Spanning Tree (MST).
Kruskal's Algorithm fits naturally:
To efficiently check whether two points are already connected, we use Disjoint Set Union (Union-Find).
class DSU:
def __init__(self, n):
self.Parent = list(range(n + 1))
self.Size = [1] * (n + 1)
def find(self, node):
if self.Parent[node] != node:
self.Parent[node] = self.find(self.Parent[node])
return self.Parent[node]
def union(self, u, v):
pu = self.find(u)
pv = self.find(v)
if pu == pv:
return False
if self.Size[pu] < self.Size[pv]:
pu, pv = pv, pu
self.Size[pu] += self.Size[pv]
self.Parent[pv] = pu
return True
class Solution:
def minCostConnectPoints(self, points: List[List[int]]) -> int:
n = len(points)
dsu = DSU(n)
edges = []
for i in range(n):
x1, y1 = points[i]
for j in range(i + 1, n):
x2, y2 = points[j]
dist = abs(x1 - x2) + abs(y1 - y2)
edges.append((dist, i, j))
edges.sort()
res = 0
for dist, u, v in edges:
if dsu.union(u, v):
res += dist
return resWe still want a Minimum Spanning Tree (MST): connect all points with minimum total Manhattan distance.
Prim's Algorithm grows the MST one node at a time:
0).A min-heap (priority queue) is used to always get the next cheapest edge quickly.
0.(0, 0) into a min-heap (cost to add starting node is 0).visited set for nodes already in the MST.visited size < N:(cost, node) from heap.node is already visited, skip it.cost to answer.node as visited.(edgeCost, neighbor) to heap for unvisited neighbors.class Solution:
def minCostConnectPoints(self, points: List[List[int]]) -> int:
N = len(points)
adj = {i: [] for i in range(N)}
for i in range(N):
x1, y1 = points[i]
for j in range(i + 1, N):
x2, y2 = points[j]
dist = abs(x1 - x2) + abs(y1 - y2)
adj[i].append([dist, j])
adj[j].append([dist, i])
res = 0
visit = set()
minH = [[0, 0]]
while len(visit) < N:
cost, i = heapq.heappop(minH)
if i in visit:
continue
res += cost
visit.add(i)
for neiCost, nei in adj[i]:
if nei not in visit:
heapq.heappush(minH, [neiCost, nei])
return resWe want to connect all points so that the total cost is minimum, where the cost to connect two points is their Manhattan distance.
This is exactly a Minimum Spanning Tree (MST) problem:
This solution uses Prim’s Algorithm (greedy MST building):
0).Instead of building all edges (which would be too many), we compute distances on the fly and update the best known connection cost for each node.
n be the number of points.visit[i] → whether point i is already included in the MSTdist[i] → the minimum cost to connect point i to the current MSTnode = 0 with total cost res = 0 and connected edges edges = 0.n - 1 edges:node as visited (added to MST).i:i from the current nodedist[i] = min(dist[i], cost)nextNode as the unvisited point with the smallest dist[i].dist[nextNode] to res (we are connecting nextNode to the MST).node = nextNode and increment edges.res as the total minimum cost.class Solution:
def minCostConnectPoints(self, points: List[List[int]]) -> int:
n, node = len(points), 0
dist = [100000000] * n
visit = [False] * n
edges, res = 0, 0
while edges < n - 1:
visit[node] = True
nextNode = -1
for i in range(n):
if visit[i]:
continue
curDist = (abs(points[i][0] - points[node][0]) +
abs(points[i][1] - points[node][1]))
dist[i] = min(dist[i], curDist)
if nextNode == -1 or dist[i] < dist[nextNode]:
nextNode = i
res += dist[nextNode]
node = nextNode
edges += 1
return resThe problem specifies Manhattan distance |x1 - x2| + |y1 - y2|, not Euclidean distance. Using the wrong distance formula will produce incorrect edge weights and ultimately a wrong MST cost.
When generating all edges between points, ensure you only create edges between distinct points (i.e., i != j). Including self-loops with zero cost can corrupt the Union-Find structure or cause infinite loops in some implementations.
A spanning tree with n nodes has exactly n-1 edges. Continuing to add edges after the MST is complete wastes time and, in some implementations, can cause issues. Always track the number of edges added and stop once you reach n-1.