Minimum Spanning Trees: Kruskal's and Prim's Algorithms Explained
Learn how Kruskal's and Prim's algorithms find minimum spanning trees — step-by-step explanations, Python implementations, and when to use each.
Imagine you need to connect every city in a region with roads, and you want to minimize the total construction cost. Every city must be reachable from every other city, but you don't need multiple routes between cities — just one connected network with minimum total cost. That's the Minimum Spanning Tree problem.
It sounds like a niche problem until you realize it shows up everywhere: network design (connecting computers with minimum cable), circuit board layout (minimum wire length), clustering algorithms, and approximation algorithms for harder problems like the Traveling Salesman.
Two algorithms dominate this space: Kruskal's and Prim's. Both find the exact same optimal solution (the MST), but they approach it from different directions. Let's understand both.
What a Spanning Tree Is
Given a connected, undirected, weighted graph with V vertices and E edges:
- A spanning tree is a subgraph that includes all V vertices, is connected, and has exactly V - 1 edges (no cycles).
- A minimum spanning tree (MST) is the spanning tree with the smallest total edge weight.
- Every spanning tree of a graph with V vertices has exactly V - 1 edges
- The MST is unique if all edge weights are distinct
- If some edge weights are equal, there may be multiple valid MSTs (all with the same total cost)
- Removing any edge from the MST splits the tree into two components
- Adding any non-MST edge to the MST creates exactly one cycle
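The equal-weights case is easy to verify by brute force on a tiny graph. This sketch (the triangle graph and the helper name `is_spanning_tree` are illustrative) enumerates every spanning tree of a three-vertex graph with two equal-weight edges and finds two distinct trees achieving the minimum cost:

```python
from itertools import combinations

def is_spanning_tree(n, tree_edges):
    """n - 1 edges with no cycle implies a spanning tree."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for _, u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # cycle
        parent[ru] = rv
    return len(tree_edges) == n - 1

# Triangle: 0-1 weighs 1, while 0-2 and 1-2 both weigh 2
edges = [(1, 0, 1), (2, 0, 2), (2, 1, 2)]
trees = [t for t in combinations(edges, 2) if is_spanning_tree(3, list(t))]
best = min(sum(w for w, _, _ in t) for t in trees)
msts = [t for t in trees if sum(w for w, _, _ in t) == best]
print(best, len(msts))  # prints: 3 2 (two distinct MSTs, same total weight 3)
```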
The Cut Property
Both Kruskal's and Prim's rely on the cut property: for any cut (partition of vertices into two groups), the minimum-weight edge crossing the cut must be in the MST.
This is the theoretical foundation. If you can prove an edge is the cheapest way to connect two components, it must be in the MST. Both algorithms exploit this fact — they just find those cheapest crossing edges in different ways.
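The cut property can be checked directly by enumerating all spanning trees of a small graph. A brute-force sketch (the four-vertex graph and the helper name `connects_all` are illustrative): pick a cut, find its lightest crossing edge, and confirm that edge appears in every MST.

```python
from itertools import combinations

def connects_all(n, tree_edges):
    """True if the given edges form a spanning tree on n vertices."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    joined = 0
    for _, u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            joined += 1
    return joined == n - 1

# 4-vertex graph, edges as (weight, u, v)
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]

# All spanning trees, then the minimum-weight ones
trees = [t for t in combinations(edges, 3) if connects_all(4, list(t))]
best = min(sum(w for w, _, _ in t) for t in trees)
msts = [t for t in trees if sum(w for w, _, _ in t) == best]

# Cut: S = {0, 1} versus the rest; an edge crosses if exactly one endpoint is in S
crossing = [e for e in edges if (e[1] in (0, 1)) != (e[2] in (0, 1))]
lightest = min(crossing)
print(all(lightest in t for t in msts))  # True: the lightest crossing edge is in every MST
```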
Kruskal's Algorithm: Sort Edges, Use Union-Find
Kruskal's strategy is simple: sort all edges by weight, then greedily add the cheapest edge that doesn't create a cycle. Repeat until you have V - 1 edges.
The "doesn't create a cycle" check is where Union-Find shines. Two endpoints are in different components? Add the edge and merge the components. Same component? Skip the edge (it would create a cycle).
Step by Step
Graph:

A --4-- B
|       |
2       3
|       |
C --1-- D --5-- E
|               |
6               7
|               |
F ------8------ G
- Sort edges by weight:
C-D: 1, A-C: 2, B-D: 3, A-B: 4, D-E: 5, C-F: 6, E-G: 7, F-G: 8
- Process edges:
C-D: 1 → add (C and D in different sets)
A-C: 2 → add (A and {C,D} in different sets)
B-D: 3 → add (B and {A,C,D} in different sets)
A-B: 4 → SKIP (A and B already connected)
D-E: 5 → add (E and {A,B,C,D} in different sets)
C-F: 6 → add (F and {A,B,C,D,E} in different sets)
E-G: 7 → add (G and {A,B,C,D,E,F} in different sets)
F-G: 8 → SKIP (F and G already connected)
MST edges: C-D(1), A-C(2), B-D(3), D-E(5), C-F(6), E-G(7)
Total weight: 24
Python Implementation
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in same set
        # union by rank
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

def kruskal(num_vertices, edges):
    """
    Find MST using Kruskal's algorithm.
    edges: list of (weight, u, v)
    Returns: list of MST edges and total weight
    """
    edges.sort()  # sort by weight
    uf = UnionFind(num_vertices)
    mst_edges = []
    total_weight = 0
    for weight, u, v in edges:
        if uf.union(u, v):
            mst_edges.append((u, v, weight))
            total_weight += weight
            if len(mst_edges) == num_vertices - 1:
                break  # MST complete
    return mst_edges, total_weight

# Example
edges = [
    (4, 0, 1),  # A-B
    (2, 0, 2),  # A-C
    (3, 1, 3),  # B-D
    (1, 2, 3),  # C-D
    (5, 3, 4),  # D-E
    (6, 2, 5),  # C-F
    (7, 4, 6),  # E-G
    (8, 5, 6),  # F-G
]

mst, weight = kruskal(7, edges)
print(f"MST edges: {mst}")
print(f"Total weight: {weight}")
# MST edges: [(2, 3, 1), (0, 2, 2), (1, 3, 3), (3, 4, 5), (2, 5, 6), (4, 6, 7)]
# Total weight: 24
Time Complexity
- Sorting edges: O(E log E)
- Processing edges with Union-Find: O(E * alpha(V)), where alpha is the inverse Ackermann function (effectively O(1))
- Total: O(E log E), which is equivalent to O(E log V) since E <= V^2
Prim's Algorithm: Grow the Tree Greedily
Prim's takes the opposite approach: start from any vertex and grow the MST one vertex at a time. At each step, add the cheapest edge that connects a vertex in the tree to a vertex outside the tree.
This is essentially Dijkstra's algorithm, except that instead of tracking the shortest path distance to each vertex, you track the cheapest single edge connecting each vertex to the current tree.
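The relationship to Dijkstra comes down to one line: Dijkstra pushes the cumulative distance dist[u] + w onto the heap, while Prim pushes just the edge weight w. A minimal side-by-side sketch (the function name `scan` and the toggle parameter are illustrative, not a standard API):

```python
import heapq

def scan(adj, start, prim):
    """Dijkstra and Prim share a skeleton; only the pushed priority differs.
    adj[u] = [(weight, v), ...]. Returns per-vertex values:
    Dijkstra mode: shortest distance; Prim mode: cheapest attaching edge."""
    n = len(adj)
    best = [float("inf")] * n
    best[start] = 0
    done = [False] * n
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if done[u]:
            continue
        done[u] = True
        for w, v in adj[u]:
            # The only difference between the two algorithms:
            cand = w if prim else d + w
            if not done[v] and cand < best[v]:
                best[v] = cand
                heapq.heappush(heap, (cand, v))
    return best

# Tiny path graph: 0 --1-- 1 --1-- 2
adj = [[(1, 1)], [(1, 0), (1, 2)], [(1, 1)]]
print(scan(adj, 0, prim=False))  # [0, 1, 2]: cumulative distances
print(scan(adj, 0, prim=True))   # [0, 1, 1]: single attaching-edge weights
```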
Step by Step
Starting from vertex A:
Step 1: Tree = {A}
Edges to consider: A-B(4), A-C(2)
Cheapest: A-C(2) → add C
Tree = {A, C}
Step 2: Edges to consider: A-B(4), C-D(1), C-F(6)
Cheapest: C-D(1) → add D
Tree = {A, C, D}
Step 3: Edges to consider: A-B(4), D-B(3), D-E(5), C-F(6)
Cheapest: D-B(3) → add B
Tree = {A, B, C, D}
Step 4: Edges to consider: D-E(5), C-F(6)
Cheapest: D-E(5) → add E
Tree = {A, B, C, D, E}
Step 5: Edges to consider: C-F(6), E-G(7)
Cheapest: C-F(6) → add F
Tree = {A, B, C, D, E, F}
Step 6: Edges to consider: E-G(7), F-G(8)
Cheapest: E-G(7) → add G
Tree = {A, B, C, D, E, F, G}
MST complete. Same result: total weight 24.
Python Implementation
import heapq

def prim(num_vertices, adj):
    """
    Find MST using Prim's algorithm with a min-heap.
    adj: adjacency list — adj[u] = [(weight, v), ...]
    Returns: list of MST edges and total weight
    """
    visited = [False] * num_vertices
    mst_edges = []
    total_weight = 0
    # Start from vertex 0
    # Heap entries: (weight, current_vertex, parent_vertex)
    heap = [(0, 0, -1)]
    while heap and len(mst_edges) < num_vertices - 1:
        weight, u, parent = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        if parent != -1:
            mst_edges.append((parent, u, weight))
            total_weight += weight
        for edge_weight, v in adj[u]:
            if not visited[v]:
                heapq.heappush(heap, (edge_weight, v, u))
    return mst_edges, total_weight

# Build adjacency list
adj = [[] for _ in range(7)]
edges_list = [
    (0, 1, 4), (0, 2, 2), (1, 3, 3), (2, 3, 1),
    (3, 4, 5), (2, 5, 6), (4, 6, 7), (5, 6, 8),
]
for u, v, w in edges_list:
    adj[u].append((w, v))
    adj[v].append((w, u))

mst, weight = prim(7, adj)
print(f"MST edges: {mst}")
print(f"Total weight: {weight}")
# MST edges: [(0, 2, 2), (2, 3, 1), (3, 1, 3), (3, 4, 5), (2, 5, 6), (4, 6, 7)]
# Total weight: 24
Time Complexity
With a binary heap (Python's heapq):
- Each vertex is popped once: O(V log V)
- Each edge is pushed at most twice: O(E log V)
- Total: O((V + E) log V)
With a Fibonacci heap (theoretical improvement):
- Total: O(E + V log V)
In practice, the binary heap version is fast enough for most problems.
Kruskal's vs Prim's: When to Use Which
| | Kruskal's | Prim's |
|---|---|---|
| Strategy | Sort edges globally | Grow tree locally |
| Data structure | Union-Find | Priority queue (min-heap) |
| Time complexity | O(E log E) | O((V + E) log V) |
| Best for | Sparse graphs (E close to V) | Dense graphs (E close to V^2) |
| Needs edge list? | Yes | No (adjacency list) |
| Easy to implement? | Yes (with Union-Find) | Yes (with heap) |
Use Kruskal's when:
- The graph is sparse (few edges relative to vertices)
- You already have an edge list
- You want to process edges in a distributed/parallel setting

Use Prim's when:
- The graph is dense (many edges)
- You have an adjacency list or adjacency matrix
- You want to grow the tree from a specific starting vertex
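For dense graphs stored as an adjacency matrix, Prim's can drop the heap entirely and run in O(V^2), which is often faster in practice when E is close to V^2. A sketch (the function name `prim_dense` and the None-for-no-edge matrix convention are assumptions), run here on the same 7-vertex example graph:

```python
def prim_dense(matrix):
    """O(V^2) Prim's for a connected graph.
    matrix[u][v] is the edge weight, or None if there is no edge."""
    n = len(matrix)
    INF = float("inf")
    best = [INF] * n       # cheapest edge attaching each vertex to the tree
    parent = [-1] * n
    in_tree = [False] * n
    best[0] = 0
    total = 0
    mst_edges = []
    for _ in range(n):
        # Linear scan replaces the heap: O(V) per iteration
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        total += best[u]
        if parent[u] != -1:
            mst_edges.append((parent[u], u, best[u]))
        for v in range(n):
            w = matrix[u][v]
            if w is not None and not in_tree[v] and w < best[v]:
                best[v] = w
                parent[v] = u
    return mst_edges, total

# Same example graph as above, as an adjacency matrix
N = None
matrix = [
    [N, 4, 2, N, N, N, N],
    [4, N, N, 3, N, N, N],
    [2, N, N, 1, N, 6, N],
    [N, 3, 1, N, 5, N, N],
    [N, N, N, 5, N, N, 7],
    [N, N, 6, N, N, N, 8],
    [N, N, N, N, 7, 8, N],
]
mst, total = prim_dense(matrix)
print(total)  # 24, matching the heap-based version
```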
Real-World Applications
Network design. Connecting offices with leased lines, laying fiber optic cable, designing water pipe networks. The MST minimizes total connection cost while ensuring every location is reachable.

Clustering. Single-linkage clustering is equivalent to building an MST and removing the most expensive edges. This creates clusters of closely related data points.

Approximation algorithms. The MST is used to approximate the Traveling Salesman Problem. An MST-based tour is at most 2x the optimal TSP tour for metric spaces. Christofides' algorithm (using MST + minimum matching) gives a 1.5x approximation.

Image segmentation. Pixel-level graph where edge weights represent color difference. The MST helps identify boundaries between regions.

Maze generation. Randomized Kruskal's or randomized Prim's on a grid graph generates random mazes — the MST creates a perfect maze with exactly one path between any two cells.

Advanced: Second-Best MST
A common follow-up problem: find the MST with the second-smallest total weight. This can be solved in O(E log V) by:
- Find the MST
- For each non-MST edge (u, v), adding it creates a cycle. Find the maximum-weight edge on the path from u to v in the MST (using LCA with max-weight tracking)
- The second-best MST replaces that max edge with the non-MST edge. Try all non-MST edges and take the minimum increase.
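The steps above can be sketched in a simpler O(V * E) form, replacing the LCA machinery with a plain path walk on the tree (the function name `second_best_mst` and its helpers are illustrative):

```python
def second_best_mst(n, edges):
    """Return (MST weight, second-best spanning tree weight).
    edges: list of (weight, u, v). O(V * E): each path-maximum query is a DFS."""
    # Kruskal with a tiny union-find to get the MST first
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst, total = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))
            total += w
    # Tree adjacency for path walks
    adj = [[] for _ in range(n)]
    for w, u, v in mst:
        adj[u].append((v, w))
        adj[v].append((u, w))
    def max_edge_on_path(src, dst):
        # DFS from src to dst, tracking the heaviest edge weight seen
        stack = [(src, -1, 0)]
        while stack:
            node, prev, mx = stack.pop()
            if node == dst:
                return mx
            for nxt, w in adj[node]:
                if nxt != prev:
                    stack.append((nxt, node, max(mx, w)))
    best = float("inf")
    for w, u, v in edges:
        if (w, u, v) not in mst:
            # Swapping this edge in evicts the heaviest MST edge on the u-v path
            best = min(best, total + w - max_edge_on_path(u, v))
    return total, best

edges = [(4, 0, 1), (2, 0, 2), (3, 1, 3), (1, 2, 3),
         (5, 3, 4), (6, 2, 5), (7, 4, 6), (8, 5, 6)]
print(second_best_mst(7, edges))  # (24, 25)
```

On the article's example graph this gives a second-best weight of 25: swap A-B(4) in and evict B-D(3). The O(E log V) version only speeds up the path-maximum queries with LCA preprocessing; the exchange argument is identical.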
Common Mistakes
Forgetting the graph must be connected. Both algorithms assume a connected graph. If the graph is disconnected, there's no spanning tree — you'd get a minimum spanning forest (one tree per connected component). Check that your MST has exactly V - 1 edges.

Off-by-one in vertex numbering. If vertices are 1-indexed but your Union-Find or adjacency list is 0-indexed, you'll get wrong results or index errors. Be consistent.

Not handling duplicate edge weights. When edges have equal weights, both algorithms still work correctly, but the MST may not be unique. Don't assume there's only one correct answer.

Using an adjacency matrix with Kruskal's. Kruskal's needs an edge list. If you only have an adjacency matrix, extract the edges first. For dense graphs where you'd already have a matrix, Prim's might be more natural.

Not breaking out of Kruskal's early. Once you have V - 1 edges in the MST, stop processing. Every remaining edge would create a cycle. This optimization matters when E is much larger than V.

What's Next
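The connectivity pitfall is cheap to guard against: run Kruskal's anyway and check the edge count at the end. A sketch with a compact inline union-find (the function name `min_spanning_forest` is illustrative):

```python
def min_spanning_forest(n, edges):
    """Kruskal's without assuming connectivity.
    Returns (forest edges, total weight, True if a full spanning tree)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    forest, total = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v, w))
            total += w
    return forest, total, len(forest) == n - 1

# Two separate components, {0, 1, 2} and {3, 4}; no spanning tree exists
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 3, 4)]
forest, total, connected = min_spanning_forest(5, edges)
print(connected)    # False
print(len(forest))  # 3: a forest, not a tree (5 vertices would need 4 edges)
```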
Minimum spanning trees are a cornerstone of graph theory with connections to many other algorithms and data structures. Once you're comfortable with Kruskal's and Prim's, explore:
- Boruvka's algorithm — the oldest MST algorithm (1926), useful for parallel MST construction
- Second-best MST — builds on MST + LCA
- Minimum Steiner Tree — MST over a subset of vertices (NP-hard in general)
- Dynamic MST — maintaining the MST as edges are added or removed
For more algorithm tutorials and data structure guides, check out CodeUp.