March 26, 20269 min read

Topological Sort: Ordering Things That Depend on Each Other

How topological sorting works for directed acyclic graphs, two different algorithms to implement it, cycle detection, and the real-world problems it solves.

topological-sort graphs algorithms dag interviews

Topological sorting answers a deceptively simple question: given a bunch of tasks where some tasks must come before others, what's a valid order to do them all?

Course prerequisites. Build system dependencies. Compilation order. Spreadsheet cell evaluation. Task scheduling. They're all the same problem underneath: you have a directed acyclic graph (DAG), and you need a linear ordering of its vertices such that for every directed edge u -> v, u comes before v in the ordering.

If that sounds trivial — "just do the ones with no prerequisites first" — congratulations, you've already intuited Kahn's algorithm.

Why "Acyclic" Matters

Topological sort only works on DAGs — directed acyclic graphs. If there's a cycle (A depends on B, B depends on C, C depends on A), no valid ordering exists. You can't satisfy circular dependencies.

This means topological sort doubles as a cycle detector. If you try to topologically sort a graph and can't include all vertices, there's a cycle.

Kahn's Algorithm (BFS-Based)

The intuition: find all nodes with no incoming edges (in-degree 0). They have no prerequisites, so they can go first. Process them, remove their outgoing edges, and repeat. New nodes with in-degree 0 appear as you remove edges.

from collections import deque, defaultdict

def topological_sort_kahn(n, edges):
    # build adjacency list and in-degree count
    graph = defaultdict(list)
    in_degree = [0] * n

for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

# start with all nodes that have no prerequisites
    queue = deque()
    for i in range(n):
        if in_degree[i] == 0:
            queue.append(i)

order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

if len(order) == n:
        return order  # valid topological order
    else:
        return []  # cycle detected — couldn't process all nodes

That's the whole algorithm. Maintain in-degree counts. Process nodes with zero in-degree. Decrement neighbors' in-degrees. Repeat.

Time complexity: O(V + E). You visit every vertex once and process every edge once. Space: O(V + E) for the graph and the queue.

Why Cycle Detection Works

If there's a cycle, every node in the cycle always has at least one incoming edge from another node in the cycle. No node in the cycle ever reaches in-degree 0, so none of them get added to the queue. The result has fewer than n elements, and you know there's a cycle.

DFS-Based Topological Sort

The other approach: run DFS, and add each node to the result after you've finished exploring all its descendants. Then reverse the result.

def topological_sort_dfs(n, edges):
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)

WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n
    order = []
    has_cycle = False

def dfs(node):
        nonlocal has_cycle
        if has_cycle:
            return
        color[node] = GRAY  # currently being explored

for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                has_cycle = True  # back edge = cycle
                return
            if color[neighbor] == WHITE:
                dfs(neighbor)

color[node] = BLACK  # done exploring
        order.append(node)

for i in range(n):
        if color[i] == WHITE:
            dfs(i)

if has_cycle:
        return []

order.reverse()
    return order

The three-color scheme (WHITE = unvisited, GRAY = in progress, BLACK = done) is how you detect cycles during DFS. If you encounter a GRAY node while exploring, you've found a back edge — a cycle.

Why Reversing Works

DFS adds a node to the result after all its descendants are processed. So if u -> v, v gets added before u (because v's subtree finishes first). Reversing puts u before v, which is what we want.

This is sometimes called "reverse post-order" traversal.

Kahn's vs. DFS — Which to Use?

Both are O(V + E). The differences are practical:

Kahn's (BFS) is better when:

You need to detect cycles explicitly (the "not all nodes processed" check is very clean)
You want to process nodes level by level (useful for finding the longest path or parallel scheduling)
You want a non-recursive solution (no stack overflow risk on deep graphs)
You need to find all possible topological orderings (easier to reason about with BFS)

DFS is better when:

You're already doing DFS for other reasons (e.g., finding strongly connected components)
You need the reverse post-order for other algorithms (like SCC decomposition)
The implementation needs to be short (the recursive version is compact)

In interviews, I'd recommend Kahn's. It's more intuitive, the cycle detection is obvious, and the BFS pattern is familiar to most people.

Classic Problems

Course Schedule

Given n courses and prerequisite pairs, determine if you can finish all courses. This is literally "does a valid topological ordering exist?" — which is the same as "is the graph acyclic?"

def can_finish(num_courses, prerequisites):
    graph = defaultdict(list)
    in_degree = [0] * num_courses

for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1

queue = deque(i for i in range(num_courses) if in_degree[i] == 0)
    count = 0

while queue:
        node = queue.popleft()
        count += 1
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

return count == num_courses

Course Schedule II

Same thing, but return the actual ordering. Just collect the nodes as you process them (which Kahn's already does naturally).

Alien Dictionary

Given a sorted list of words in an alien language, deduce the character ordering. Each adjacent pair of words gives you one edge: find the first differing character, and the character in the earlier word comes before the one in the later word. Build the graph, topologically sort.

def alien_order(words):
    # build graph from adjacent word pairs
    graph = defaultdict(set)
    in_degree = {c: 0 for word in words for c in word}

for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        min_len = min(len(w1), len(w2))

# edge case: "abc" before "ab" is invalid
        if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]:
            return ""

for j in range(min_len):
            if w1[j] != w2[j]:
                if w2[j] not in graph[w1[j]]:
                    graph[w1[j]].add(w2[j])
                    in_degree[w2[j]] += 1
                break

# Kahn's algorithm
    queue = deque(c for c in in_degree if in_degree[c] == 0)
    result = []

while queue:
        c = queue.popleft()
        result.append(c)
        for neighbor in graph[c]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

if len(result) == len(in_degree):
        return "".join(result)
    return ""  # cycle

This is a great example of how topological sort shows up in non-obvious places. The hard part isn't the topological sort — it's recognizing that the problem is a topological sort problem.

Longest Path in a DAG

In a DAG, you can find the longest path using topological sort + dynamic programming. Process nodes in topological order, and for each node, update the distances of its neighbors.

def longest_path(n, edges):
    graph = defaultdict(list)
    in_degree = [0] * n

for u, v, w in edges:  # weighted edges
        graph[u].append((v, w))
        in_degree[v] += 1

# Kahn's to get topological order
    queue = deque(i for i in range(n) if in_degree[i] == 0)
    topo_order = []
    while queue:
        node = queue.popleft()
        topo_order.append(node)
        for neighbor, _ in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

# DP on topological order
    dist = [0] * n
    for node in topo_order:
        for neighbor, weight in graph[node]:
            dist[neighbor] = max(dist[neighbor], dist[node] + weight)

return max(dist)

This runs in O(V + E) — you can't do better. Note that longest path in a general graph is NP-hard, but the acyclic property makes it tractable.

Multiple Valid Orderings

A DAG usually has many valid topological orderings, not just one. For example, if A -> C and B -> C, both [A, B, C] and [B, A, C] are valid. Kahn's algorithm produces one valid ordering based on the order nodes reach in-degree 0.

If you need a specific ordering (like lexicographically smallest), replace the queue with a min-heap:

import heapq

def topological_sort_lex(n, edges):
    graph = defaultdict(list)
    in_degree = [0] * n

for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

heap = [i for i in range(n) if in_degree[i] == 0]
    heapq.heapify(heap)
    order = []

while heap:
        node = heapq.heappop(heap)  # smallest available node
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                heapq.heappush(heap, neighbor)

return order if len(order) == n else []

This bumps the complexity to O((V + E) log V) because of heap operations, but it guarantees lexicographic ordering.

Common Mistakes

Building the graph in the wrong direction. If "A must come before B" means edge A -> B, make sure you're not adding B -> A. Double-check which direction your edges go for the specific problem's semantics. Forgetting to handle disconnected components. A DAG can have multiple disconnected components. Make sure your initial queue/DFS starting set includes all nodes with in-degree 0, not just one of them. Not detecting cycles. If the problem says "prerequisites might be contradictory," you need to check for cycles. Always verify that your topological order includes all n nodes. Using topological sort on undirected graphs. Topological sort is for directed graphs only. An undirected graph with any edges has "cycles" (in the directed sense — each undirected edge is two directed edges), so topological sort doesn't apply.

The Mental Model

Think of topological sort as peeling layers off an onion. Each layer is the set of nodes with no remaining prerequisites. Peel off the first layer, update the counts, peel the next layer. When you're done, either you've peeled the whole onion (valid ordering) or there's a hard core that won't peel (cycle).

Practice the Kahn's algorithm template on CodeUp — once you can write it from memory, the hard part of topological sort problems becomes just building the graph correctly. The sorting itself is mechanical.