Cognitive Load Theory

The theory that learning is constrained by the limited capacity of working memory, and that instruction should be designed to manage — not just reduce — the mental effort required to learn.

What is it?

In 1988, the Australian educational psychologist John Sweller proposed a simple but powerful idea: human working memory can hold only about four to five items at a time, and this bottleneck determines how effectively we learn.¹ If instruction overwhelms working memory, learning fails, and often the cause is not that the content is too hard but that the delivery is poorly designed. Cognitive Load Theory (CLT) is the framework built on this insight.

The theory distinguishes three types of cognitive load, and this distinction is what makes it so useful. Intrinsic load comes from the inherent complexity of the material itself — some things are just harder to learn than others because they involve many interacting elements.² Extraneous load comes from poor instructional design — confusing layouts, irrelevant information, or formats that force the learner to split attention between multiple sources.³ Germane load is the productive effort the learner invests in building and automating schemas — the mental frameworks that organise knowledge for long-term storage.²

The critical insight is that not all cognitive effort is equal. Extraneous load wastes working memory on things that don’t help learning. Germane load uses working memory for things that do. The practical goal of CLT is therefore not to minimise total effort, but to eliminate extraneous load, manage intrinsic load through careful sequencing, and maximise germane load so that working memory is spent on schema construction.¹

CLT has generated a large body of empirical research identifying specific “effects”: predictable ways that instruction can help or hinder learning. The split-attention effect, the redundancy effect, and the worked-example effect are among the best established, and each offers concrete guidance for designing better learning materials.³

In plain terms

Your brain has a small workbench (working memory) where you assemble new understanding. If you clutter that workbench with unnecessary tools, confusing instructions, or too many parts at once, you can’t build anything. Cognitive Load Theory tells you how to keep the workbench clear so the learner can focus on the actual building.

At a glance

The three types of cognitive load
graph TD
    WM[Working Memory] -->|limited capacity| TL[Total Load]
    TL --> IL[Intrinsic Load]
    TL --> EL[Extraneous Load]
    TL --> GL[Germane Load]
    IL -->|manage| SEQ[Sequencing]
    EL -->|eliminate| ID[Better Design]
    GL -->|maximise| SB[Schema Building]
    style WM fill:#4a9ede,color:#fff
Key: Working memory has a fixed capacity shared across three types of load. Intrinsic load is managed through sequencing, extraneous load is eliminated through better design, and germane load is maximised to support schema building.

How does it work?

Working memory — the bottleneck

Working memory is the cognitive system that holds and manipulates information you are currently thinking about. Research consistently shows it can handle roughly four items at a time (sometimes expressed as 4 plus or minus 1), and information fades from working memory within about 20 seconds unless actively rehearsed.¹ Long-term memory, by contrast, has effectively unlimited capacity — but information must pass through working memory to get there.

This asymmetry is the foundation of CLT. Learning is the process of building schemas in long-term memory, but the only route to long-term memory runs through a very narrow corridor. If that corridor is jammed, nothing gets through.²

Think of it like...

Working memory is like a small café table. You can only have a few dishes in front of you at a time. Long-term memory is the kitchen — it can store unlimited recipes. But every new dish must be assembled at that small table before it can be filed away in the kitchen.

Intrinsic load — the complexity you cannot remove

Intrinsic load is determined by the nature of the material and the learner’s prior knowledge. It depends on element interactivity — how many pieces of information must be processed simultaneously to understand the concept.²

Learning vocabulary in a foreign language has low element interactivity: each word can be learned independently. Learning grammar has high element interactivity: you must hold subject, verb, tense, agreement, and word order in mind at the same time. You cannot reduce intrinsic load by simplifying the instruction — the complexity is in the content itself.

However, you can manage intrinsic load by sequencing: teach low-interactivity elements first, then gradually introduce the interactions between them. This is why prerequisites matter — they reduce the intrinsic load of later material by converting some elements from “things to hold in working memory” to “things already stored in schemas.”¹

Example

Consider teaching someone how a web server processes a request. The full picture involves DNS resolution, TCP handshaking, HTTP parsing, routing, middleware, database queries, and response formatting — high element interactivity. A sequenced approach teaches each component in isolation first, then assembles the full picture once each piece is already in long-term memory as a schema.
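The sequencing idea can be sketched as a toy model in Python. This is an illustration under stated assumptions, not a measurement method from the CLT literature: it treats each component as exactly one element, counts load as the number of elements in a lesson that are not yet stored as schemas, and assumes every lesson is fully learned before the next begins. The names `step_load`, `run_lessons`, and `INTERACTIONS` are hypothetical.

```python
# Toy model (assumption): load of a lesson = number of its elements
# that the learner has not yet stored as schemas.
WORKING_MEMORY_CAPACITY = 4  # the "roughly four items" figure from the text

COMPONENTS = [
    "DNS resolution",
    "TCP handshaking",
    "HTTP parsing",
    "routing",
    "middleware",
    "database queries",
    "response formatting",
]

# Hypothetical extra element: how the components fit together.
INTERACTIONS = "how the components interact"


def step_load(elements, schemas):
    """Count the elements in this lesson that are not already schemas."""
    return sum(1 for element in elements if element not in schemas)


def run_lessons(lessons):
    """Return the load of each lesson in order.

    Assumes each lesson is fully learned, so its elements become schemas.
    """
    schemas = set()
    loads = []
    for elements in lessons:
        loads.append(step_load(elements, schemas))
        schemas.update(elements)
    return loads


# Everything at once: all seven components are new to the learner.
all_at_once = run_lessons([COMPONENTS])

# Sequenced: one component per lesson, then the full picture.
sequenced = run_lessons([[c] for c in COMPONENTS] + [COMPONENTS + [INTERACTIONS]])

print(all_at_once)  # [7]
print(sequenced)    # [1, 1, 1, 1, 1, 1, 1, 1]
print(max(sequenced) <= WORKING_MEMORY_CAPACITY)  # True
```

In this sketch the all-at-once lesson asks for seven new elements against a capacity of four, while the sequenced path never asks for more than one. Real material rarely decomposes this cleanly, so the numbers only show the direction of the effect.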

Extraneous load — the waste you must eliminate

Extraneous load is cognitive effort that does not contribute to learning. It comes from how material is presented, not from the material itself. Sweller and his colleagues identified several effects that demonstrate extraneous load in action:³

The split-attention effect occurs when learners must mentally integrate information from two or more separated sources that only make sense together. For example, a diagram on one page and its explanatory text on another forces the learner to hold part of the diagram in working memory while searching for the relevant text. Physically integrating the text into the diagram eliminates this unnecessary load.³

The redundancy effect occurs when the same information is presented in multiple forms simultaneously. If a diagram is self-explanatory and the accompanying text simply describes what the diagram already shows, the redundant text actually hinders learning — the learner wastes working memory processing the same information twice and reconciling the two representations.³

Think of it like...

Imagine assembling furniture with instructions that show a clear diagram on page 5 and repeat the same steps in dense text on page 12. You waste time flipping back and forth, checking if the text adds anything new. It doesn’t — but verifying that costs you mental effort you could have spent on actually understanding the assembly.

Germane load — the effort that builds understanding

Germane load is the cognitive effort devoted to constructing and automating schemas. It is the “good” kind of effort — the work of organising, connecting, and integrating new information with existing knowledge in long-term memory.²

When a learner encounters a worked example and studies the solution steps, they are investing germane effort: building a schema for how that type of problem is solved. When they later encounter a similar problem, the schema allows them to recognise the pattern and solve it with less working memory demand.⁴

The worked-example effect demonstrates this directly. Novice learners who study worked examples learn more effectively than those who attempt to solve equivalent problems on their own. Problem-solving imposes heavy extraneous load on novices (searching for strategies, trying dead ends), leaving little capacity for schema building. Worked examples redirect that capacity toward germane processing.⁴

Concept to explore

See schema-theory for how schemas are built, modified, and sometimes fail — the cognitive structures that germane load is designed to construct.

The balance — a zero-sum game

The three types of load are additive: intrinsic + extraneous + germane = total load, and total load must not exceed working memory capacity. This means that reducing extraneous load frees up capacity for germane load. It also means that even well-designed instruction can fail if intrinsic load is too high — the material must be sequenced so that each step falls within working memory limits.¹
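The additive relationship can be written as a small budget calculation. The Python sketch below is illustrative only: CLT does not define numeric units for load, so the capacity of 8 and the per-lesson values are invented for the example, and `Lesson` and `room_for_germane` are hypothetical names.

```python
from dataclasses import dataclass

CAPACITY = 8  # working memory capacity in arbitrary, made-up units


@dataclass(frozen=True)
class Lesson:
    name: str
    intrinsic: int   # load inherent to the content
    extraneous: int  # load added by the presentation


def room_for_germane(lesson: Lesson, capacity: int = CAPACITY) -> int:
    """Capacity left for schema building; a negative value means overload."""
    return capacity - (lesson.intrinsic + lesson.extraneous)


# Same content, two presentations, plus one lesson that is too complex.
cluttered = Lesson("cluttered layout", intrinsic=5, extraneous=3)
redesigned = Lesson("integrated layout", intrinsic=5, extraneous=1)
too_complex = Lesson("unsequenced content", intrinsic=9, extraneous=0)

for lesson in (cluttered, redesigned, too_complex):
    room = room_for_germane(lesson)
    status = "overloaded" if room < 0 else f"{room} units free for schema building"
    print(f"{lesson.name}: {status}")
```

With these invented numbers, cutting extraneous load from 3 to 1 frees 2 units for germane processing, while the third lesson overloads working memory even with no extraneous load at all.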

This is why the same instructional approach can work brilliantly for one learner and fail for another. A learner with strong prior knowledge has already automated many elements into schemas, reducing their intrinsic load. A novice faces the full intrinsic load of every element. The same instruction puts very different demands on their working memories — a phenomenon formalised as the expertise reversal effect.⁵

Key distinction

Intrinsic load is about the content. Extraneous load is about the delivery. Germane load is about the learning. You cannot change intrinsic load directly, but you can manage it. You should always minimise extraneous load. And the point of minimising extraneous load is to make room for germane load.

Why do we use it?

Key reasons

1. Designing instruction that works with the brain, not against it. CLT explains why some lesson designs succeed and others fail: they either respect or ignore the limits of working memory. Following CLT principles leads to measurably better learning outcomes.¹

2. Diagnosing why learners are struggling. When learners fail, the instinct is often to simplify the content. CLT reveals that the problem may not be content complexity (intrinsic load) but poor presentation (extraneous load). Fixing the delivery can unlock learning without dumbing down the material.³

3. Adapting instruction to expertise level. CLT explains why instruction must change as learners progress. What helps a novice (worked examples, integrated formats) can hinder an expert, for whom the same support becomes redundant information and unnecessary scaffolding. This is the expertise reversal effect.⁵

When do we use it?

When designing learning materials — courses, tutorials, documentation — and deciding how to sequence and format content
When a learner is struggling despite clear content — the problem may be extraneous load, not difficulty
When deciding whether to use diagrams, text, or both — CLT’s split-attention and redundancy effects give clear guidance
When building progressive learning paths with appropriate granularity — sequencing to manage intrinsic load
When adapting materials for different skill levels — novices and experts need fundamentally different approaches

Rule of thumb

If a learner is overwhelmed, don’t immediately simplify the content — first check whether the presentation is creating unnecessary cognitive load. Remove the noise before reducing the signal.

How can I think about it?

The kitchen counter analogy

Imagine you are cooking a complex meal. Your kitchen counter (working memory) is small — you can only have four or five items on it at a time. The recipe’s inherent complexity (intrinsic load) determines how many ingredients and tools you need at each step. A messy kitchen with tools scattered in the wrong places (extraneous load) wastes your limited counter space on searching and reorganising. The actual cooking — combining ingredients, tasting, adjusting (germane load) — is where the meal gets made.

A good recipe manages this: it tells you to prep ingredients before you start, puts related steps together, and doesn’t ask you to juggle ten things at once. A bad recipe has you flipping between pages, hunting for measurements buried in paragraphs of prose, and starting three sub-recipes simultaneously.

Kitchen counter = working memory (limited space)

Recipe complexity = intrinsic load (inherent to the dish)

Messy kitchen / badly written recipe = extraneous load (fixable)

Actual cooking and tasting = germane load (productive effort)

Pantry and refrigerator = long-term memory (large capacity storage)

The signal-to-noise analogy

Think of a radio broadcast. The signal is the information you want to hear. Noise is static, interference, and crosstalk from other stations. The radio’s receiver (working memory) can only process a limited bandwidth. If the noise is too loud, you can’t hear the signal — even if the broadcast itself is perfectly clear.

Cognitive Load Theory says: don’t blame the listener for not understanding a noisy broadcast. Reduce the noise (extraneous load), tune the frequency carefully (manage intrinsic load), and make sure the signal is clear enough to be worth listening to (germane load).

Signal = germane load (effort spent on meaningful learning)

Noise = extraneous load (poor design, irrelevant information)

Broadcast complexity = intrinsic load (inherent difficulty)

Receiver bandwidth = working memory capacity

Recorded archive = long-term memory

Concepts to explore next

Concept	What it covers	Status
schema-theory	How mental frameworks are built and modified — the target of germane load	complete
novice-expert-spectrum	How expertise changes which instructional strategies work	stub
knowledge-granularity	How to decompose knowledge into learnable units that manage intrinsic load	stub
constructivism	The learning theory that frames knowledge as actively constructed	complete

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.

Check your understanding

Test yourself

Explain the difference between intrinsic, extraneous, and germane cognitive load. Why is the distinction important for instructional design?

Describe the split-attention effect and the redundancy effect. Give an original example of each that does not appear in this card.

Distinguish between reducing cognitive load and managing cognitive load. Why is the goal not simply to make everything as easy as possible?

Interpret this scenario: a software tutorial presents a code example on one screen and its line-by-line explanation in a separate pop-up window. Using CLT, predict the effect on learners and suggest a redesign.

Connect cognitive load theory to schema-theory. How does schema formation reduce the intrinsic load of complex tasks over time?

Where this concept fits

Position in the knowledge graph
graph TD
    COG[Cognitivism] --> CLT[Cognitive Load Theory]
    COG --> ST[Schema Theory]
    CLT --> NES[Novice-Expert Spectrum]
    CLT --> KG[Knowledge Granularity]
    style CLT fill:#4a9ede,color:#fff
Related concepts:

schema-theory — schemas are the long-term memory structures that germane load builds; CLT explains why schema construction is constrained by working memory

novice-expert-spectrum — CLT predicts the expertise reversal effect, where instruction that helps novices hinders experts

knowledge-granularity — decomposing knowledge into appropriately sized units is a direct application of managing intrinsic load
