Topological Sort: Ordering Things That Depend on Each Other
How topological sorting works for directed acyclic graphs, two different algorithms to implement it, cycle detection, and the real-world problems it solves.
Topological sorting answers a deceptively simple question: given a bunch of tasks where some tasks must come before others, what's a valid order to do them all?
Course prerequisites. Build system dependencies. Compilation order. Spreadsheet cell evaluation. Task scheduling. They're all the same problem underneath: you have a directed acyclic graph (DAG), and you need a linear ordering of its vertices such that for every directed edge u -> v, u comes before v in the ordering.
If that sounds trivial — "just do the ones with no prerequisites first" — congratulations, you've already intuited Kahn's algorithm.
Why "Acyclic" Matters
Topological sort only works on DAGs — directed acyclic graphs. If there's a cycle (A depends on B, B depends on C, C depends on A), no valid ordering exists. You can't satisfy circular dependencies.
This means topological sort doubles as a cycle detector. If you try to topologically sort a graph and can't include all vertices, there's a cycle.
Kahn's Algorithm (BFS-Based)
The intuition: find all nodes with no incoming edges (in-degree 0). They have no prerequisites, so they can go first. Process them, remove their outgoing edges, and repeat. New nodes with in-degree 0 appear as you remove edges.
from collections import deque, defaultdict
def topological_sort_kahn(n, edges):
# build adjacency list and in-degree count
graph = defaultdict(list)
in_degree = [0] * n
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
# start with all nodes that have no prerequisites
queue = deque()
for i in range(n):
if in_degree[i] == 0:
queue.append(i)
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
if len(order) == n:
return order # valid topological order
else:
return [] # cycle detected — couldn't process all nodes
That's the whole algorithm. Maintain in-degree counts. Process nodes with zero in-degree. Decrement neighbors' in-degrees. Repeat.
Time complexity: O(V + E). You visit every vertex once and process every edge once. Space: O(V + E) for the graph and the queue.
Why Cycle Detection Works
If there's a cycle, every node in the cycle always has at least one incoming edge from another node in the cycle. No node in the cycle ever reaches in-degree 0, so none of them get added to the queue. The result has fewer than n elements, and you know there's a cycle.
DFS-Based Topological Sort
The other approach: run DFS, and add each node to the result after you've finished exploring all its descendants. Then reverse the result.
def topological_sort_dfs(n, edges):
graph = defaultdict(list)
for u, v in edges:
graph[u].append(v)
WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * n
order = []
has_cycle = False
def dfs(node):
nonlocal has_cycle
if has_cycle:
return
color[node] = GRAY # currently being explored
for neighbor in graph[node]:
if color[neighbor] == GRAY:
has_cycle = True # back edge = cycle
return
if color[neighbor] == WHITE:
dfs(neighbor)
color[node] = BLACK # done exploring
order.append(node)
for i in range(n):
if color[i] == WHITE:
dfs(i)
if has_cycle:
return []
order.reverse()
return order
The three-color scheme (WHITE = unvisited, GRAY = in progress, BLACK = done) is how you detect cycles during DFS. If you encounter a GRAY node while exploring, you've found a back edge — a cycle.
Why Reversing Works
DFS adds a node to the result after all its descendants are processed. So if u -> v, v gets added before u (because v's subtree finishes first). Reversing puts u before v, which is what we want.
This is sometimes called "reverse post-order" traversal.
Kahn's vs. DFS — Which to Use?
Both are O(V + E). The differences are practical:
Kahn's (BFS) is better when:- You need to detect cycles explicitly (the "not all nodes processed" check is very clean)
- You want to process nodes level by level (useful for finding the longest path or parallel scheduling)
- You want a non-recursive solution (no stack overflow risk on deep graphs)
- You need to find all possible topological orderings (easier to reason about with BFS)
- You're already doing DFS for other reasons (e.g., finding strongly connected components)
- You need the reverse post-order for other algorithms (like SCC decomposition)
- The implementation needs to be short (the recursive version is compact)
Classic Problems
Course Schedule
Given n courses and prerequisite pairs, determine if you can finish all courses. This is literally "does a valid topological ordering exist?" — which is the same as "is the graph acyclic?"
def can_finish(num_courses, prerequisites):
graph = defaultdict(list)
in_degree = [0] * num_courses
for course, prereq in prerequisites:
graph[prereq].append(course)
in_degree[course] += 1
queue = deque(i for i in range(num_courses) if in_degree[i] == 0)
count = 0
while queue:
node = queue.popleft()
count += 1
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return count == num_courses
Course Schedule II
Same thing, but return the actual ordering. Just collect the nodes as you process them (which Kahn's already does naturally).
Alien Dictionary
Given a sorted list of words in an alien language, deduce the character ordering. Each adjacent pair of words gives you one edge: find the first differing character, and the character in the earlier word comes before the one in the later word. Build the graph, topologically sort.
def alien_order(words):
# build graph from adjacent word pairs
graph = defaultdict(set)
in_degree = {c: 0 for word in words for c in word}
for i in range(len(words) - 1):
w1, w2 = words[i], words[i + 1]
min_len = min(len(w1), len(w2))
# edge case: "abc" before "ab" is invalid
if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]:
return ""
for j in range(min_len):
if w1[j] != w2[j]:
if w2[j] not in graph[w1[j]]:
graph[w1[j]].add(w2[j])
in_degree[w2[j]] += 1
break
# Kahn's algorithm
queue = deque(c for c in in_degree if in_degree[c] == 0)
result = []
while queue:
c = queue.popleft()
result.append(c)
for neighbor in graph[c]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
if len(result) == len(in_degree):
return "".join(result)
return "" # cycle
This is a great example of how topological sort shows up in non-obvious places. The hard part isn't the topological sort — it's recognizing that the problem is a topological sort problem.
Longest Path in a DAG
In a DAG, you can find the longest path using topological sort + dynamic programming. Process nodes in topological order, and for each node, update the distances of its neighbors.
def longest_path(n, edges):
graph = defaultdict(list)
in_degree = [0] * n
for u, v, w in edges: # weighted edges
graph[u].append((v, w))
in_degree[v] += 1
# Kahn's to get topological order
queue = deque(i for i in range(n) if in_degree[i] == 0)
topo_order = []
while queue:
node = queue.popleft()
topo_order.append(node)
for neighbor, _ in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# DP on topological order
dist = [0] * n
for node in topo_order:
for neighbor, weight in graph[node]:
dist[neighbor] = max(dist[neighbor], dist[node] + weight)
return max(dist)
This runs in O(V + E) — you can't do better. Note that longest path in a general graph is NP-hard, but the acyclic property makes it tractable.
Multiple Valid Orderings
A DAG usually has many valid topological orderings, not just one. For example, if A -> C and B -> C, both [A, B, C] and [B, A, C] are valid. Kahn's algorithm produces one valid ordering based on the order nodes reach in-degree 0.
If you need a specific ordering (like lexicographically smallest), replace the queue with a min-heap:
import heapq
def topological_sort_lex(n, edges):
graph = defaultdict(list)
in_degree = [0] * n
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
heap = [i for i in range(n) if in_degree[i] == 0]
heapq.heapify(heap)
order = []
while heap:
node = heapq.heappop(heap) # smallest available node
order.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
heapq.heappush(heap, neighbor)
return order if len(order) == n else []
This bumps the complexity to O((V + E) log V) because of heap operations, but it guarantees lexicographic ordering.
Common Mistakes
Building the graph in the wrong direction. If "A must come before B" means edge A -> B, make sure you're not adding B -> A. Double-check which direction your edges go for the specific problem's semantics. Forgetting to handle disconnected components. A DAG can have multiple disconnected components. Make sure your initial queue/DFS starting set includes all nodes with in-degree 0, not just one of them. Not detecting cycles. If the problem says "prerequisites might be contradictory," you need to check for cycles. Always verify that your topological order includes all n nodes. Using topological sort on undirected graphs. Topological sort is for directed graphs only. An undirected graph with any edges has "cycles" (in the directed sense — each undirected edge is two directed edges), so topological sort doesn't apply.The Mental Model
Think of topological sort as peeling layers off an onion. Each layer is the set of nodes with no remaining prerequisites. Peel off the first layer, update the counts, peel the next layer. When you're done, either you've peeled the whole onion (valid ordering) or there's a hard core that won't peel (cycle).
Practice the Kahn's algorithm template on CodeUp — once you can write it from memory, the hard part of topological sort problems becomes just building the graph correctly. The sorting itself is mechanical.