Topological Sort

An algorithm that arranges nodes in a directed graph into a linear sequence where every node appears after all of its dependencies.


What is it?

A topological sort takes a directed acyclic graph (DAG) --- a set of nodes-and-edges where edges have direction and no cycles exist --- and produces a linear ordering where every node comes after everything it depends on.1

The problem it solves is universal: given a set of tasks with dependencies, in what order should you do them so that you never start a task before its prerequisites are finished? This is the same question whether you are compiling software, scheduling university courses, planning a construction project, or building a learning path through a knowledge graph.2

The key constraint is that the graph must be acyclic --- there can be no circular dependencies. If A depends on B and B depends on A, no valid ordering exists. Detecting and breaking such cycles is an essential part of the process.3

Topological sort is not one specific algorithm but a class of solutions. The two most common are Kahn’s algorithm (which works by repeatedly removing nodes with no remaining dependencies) and DFS-based post-order (which works by exploring paths to their end and then recording nodes in reverse). Both produce a valid ordering, though the specific order may differ when multiple valid orderings exist.1

In plain terms

A topological sort is like figuring out the right order to get dressed in the morning. You cannot put on shoes before trousers, and you cannot put on trousers before underwear. Given all those “must come before” rules, a topological sort produces a sequence that respects every one of them.


At a glance


How does it work?

1. The dependency graph

The input is a directed acyclic graph (DAG). Each node is a task, concept, or item. Each directed edge means “this must come before that.” The edge from Underwear to Trousers means “Underwear must be done before Trousers.”1

The graph must be acyclic --- if you can follow edges from a node and eventually return to the same node, that is a cycle, and no topological ordering is possible. For example: if Course A requires Course B, and Course B requires Course A, no student can ever begin either course.

Think of it like...

A recipe where step 3 says “add the mixture from step 2” and step 2 says “chop the ingredients from step 1.” Each step points forward. If step 3 said “go back to step 1,” you would be stuck in an infinite loop --- that is a cycle.

2. Kahn’s algorithm (remove the ready items)

Kahn’s algorithm is the most intuitive approach:2

  1. Find all nodes with no incoming edges (no dependencies). These are “ready” --- nothing needs to happen before them
  2. Remove one of these nodes from the graph and add it to the output list
  3. Remove all edges leaving that node (its dependents now have one fewer dependency to wait for)
  4. Repeat: check for new nodes with no incoming edges, remove them, and continue
  5. When the graph is empty, the output list is a valid topological order

If the graph is not empty but no nodes have zero incoming edges, a cycle exists and the sort is impossible.

3. DFS-based approach (explore to the end, then record)

The alternative approach uses depth-first search (DFS):3

  1. Pick any unvisited node and follow its edges as deep as possible
  2. When you reach a node with no unvisited dependents, add it to the front of the output list (or record it in reverse)
  3. Backtrack and continue exploring unvisited paths
  4. Repeat until all nodes are visited

The result is the same: a valid topological order. DFS-based sorting is often preferred in implementations because it naturally integrates with cycle detection --- if the DFS encounters a node it is currently processing (not just one it has finished), a cycle exists.3

Think of it like...

Imagine untangling a chain of paperclips. You grab one end and pull it out as far as it goes. The last clip in the chain (the one with nothing after it) gets placed first in your sorted line. Then you backtrack and handle the remaining branches.

4. Cycle detection

A cycle means a circular dependency: A needs B, B needs C, C needs A. No valid ordering exists because there is no starting point.1

Both algorithms detect cycles as a side effect:

  • Kahn’s: if the graph is not empty at the end but no node has zero incoming edges, a cycle remains
  • DFS: if you encounter a node that is currently in the recursion stack (grey node), you have found a back edge --- a cycle

When a cycle is detected, the system must either report an error or break the cycle by removing one of the offending edges.

Key distinction

A tree (like a taxonomy) is always acyclic --- you can always topologically sort it. A general directed graph might contain cycles, which must be detected and resolved before sorting is possible.


Why do we use it?

Key reasons

1. Dependency resolution. Any system where tasks have prerequisites needs topological sort. Package managers (npm, pip, apt) use it to install dependencies in the right order. Build systems (Make, Gradle) use it to compile source files before the files that depend on them.2

2. Learning path generation. In a knowledge graph with prerequisite edges, topological sort produces a valid learning sequence --- ensuring a student encounters every concept only after mastering its prerequisites.4

3. Scheduling. Project management tools use topological sort to determine the critical path --- the longest chain of dependent tasks that determines the minimum project duration.1

4. Cycle detection. Running a topological sort is the standard way to check whether a directed graph contains cycles. If the sort fails, cycles exist --- and the algorithm can identify where they are.


When do we use it?

  • When installing software packages that depend on other packages
  • When compiling code where source files import other source files
  • When generating a learning path from a graph of concepts with prerequisites
  • When scheduling tasks in a project where some tasks must finish before others can start
  • When validating a dependency graph to ensure it contains no circular references

Rule of thumb

If your problem involves “do things in the right order, given a set of must-come-before rules,” you are looking at a topological sort problem.


How can I think about it?

The getting-dressed analogy

Getting dressed in the morning is a topological sort.

  • Each garment is a node
  • Each must-come-before rule is a directed edge (underwear before trousers, trousers before belt)
  • Some garments are independent --- socks and shirt have no dependency between them, so either can come first
  • The sorted order is a valid sequence for getting dressed
  • A cycle would be like saying “you must wear your jacket before your shirt, and your shirt before your jacket” --- impossible, so you would have to break one of the rules
  • Multiple valid orderings exist: you could put on socks before or after your shirt, as long as both come before shoes

The construction project

Building a house is a topological sort of construction tasks.

  • Foundation must come before walls
  • Walls must come before roof
  • Roof must come before interior painting
  • Electrical wiring must come before drywall, which must come before painting
  • Plumbing can happen in parallel with electrical (no dependency between them)
  • The project manager runs a topological sort to produce a valid construction schedule
  • If someone accidentally creates a circular dependency (“painting must finish before we pour the foundation”), the sort fails and flags the error
  • The critical path (longest chain of dependencies) determines the minimum project duration

Concepts to explore next

ConceptWhat it coversStatus
taxonomiesHierarchical classification --- always a DAG, always sortablecomplete
knowledge-graphsThe broader graph structure that topological sort operates oncomplete
llm-pipelinesHow topological sort orders the stages of an LLM processing pipelinestub

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.


Check your understanding


Where this concept fits

Position in the knowledge graph

graph TD
    KG[Knowledge Graphs] --> NE[Nodes and Edges]
    KG --> TAX[Taxonomies]
    KG --> TS[Topological Sort]
    TS -.->|related| TAX
    TS -.->|related| LP[LLM Pipelines]
    style TS fill:#4a9ede,color:#fff

Related concepts:

  • taxonomies --- a taxonomy is always a DAG, so it can always be topologically sorted; the sorted order represents a path from general to specific
  • knowledge-graphs --- topological sort is one of the key algorithms for traversing and making use of a knowledge graph’s dependency structure
  • llm-pipelines --- multi-step AI pipelines often have stage dependencies that are resolved by topological sorting

Sources


Further reading

Resources

Footnotes

  1. GeeksforGeeks. (2024). Topological Sorting. GeeksforGeeks. 2 3 4 5

  2. Harsanyi, T. (2024). Topological Sort: Ordering Tasks with Dependencies. The Coder Cafe. 2 3

  3. Codeintuition. (2026). Topological Sort Explained from First Principles. Codeintuition. 2 3

  4. Kumar, A. (2025). Topological Sorting Explained: A Step-by-Step Guide for Dependency Resolution. Medium.