Hallucination

When a language model generates text that sounds confident and plausible but is factually wrong, fabricated, or unsupported by any source.


What is it?

Language models do not retrieve facts from a database. They predict the next most likely word based on statistical patterns learned during training.1 Most of the time, this produces useful, accurate text. But sometimes the model generates information that is entirely fabricated — a citation that does not exist, a statistic that was never published, a historical event that never happened — delivered with the same confident tone as its accurate output.

This phenomenon is called hallucination, and it is one of the most important limitations of modern AI. The term is borrowed from psychology, where it refers to perceiving something that is not there.2 In the AI context, the model “perceives” patterns in its training data and generates plausible continuations, even when no factual basis exists for what it produces.

Hallucination is not a bug that can be patched — it is a structural property of how language models work.1 Because these models are fundamentally prediction engines, not knowledge retrieval systems, there will always be cases where the most statistically likely next word is not the factually correct one. The rate can be reduced dramatically through architectural mitigations (retrieval-augmented generation, knowledge graphs, structured output, human review), but it cannot be eliminated entirely.

This matters because hallucination is the root problem that motivates half the architecture of modern agentic systems. Retrieval-augmented generation exists to ground model outputs in real documents. Knowledge graphs exist to provide verified facts. Guardrails exist to catch fabricated content. Human-in-the-loop design exists because automated systems cannot always be trusted to be accurate. Understanding hallucination is understanding why these architectural patterns are necessary.3

In plain terms

A language model is like a very well-read student who has read millions of books but never took notes and has no way to look anything up during the exam. When they know the answer, they are brilliant. When they do not, they do not say “I don’t know” — they construct a plausible-sounding answer from fragments of things they half-remember. The answer sounds right, reads well, and might even be right by coincidence. But it was assembled from pattern-matching, not from verified knowledge.


How does it work?

1. Why models hallucinate

Language models learn by processing vast amounts of text and building a statistical map of which words tend to follow which other words, in which contexts.1 When you ask a question, the model does not “look up” the answer — it generates the sequence of words that is most probable given the question and its training patterns.

This works remarkably well most of the time. But it breaks down in predictable ways:

  • Knowledge gaps: The model was not trained on the relevant information, or the information was rare in the training data. It fills the gap with a plausible-sounding completion.2
  • Outdated information: The model’s training data has a cutoff date. It cannot know about events, publications, or changes that occurred after training.
  • Conflicting sources: The training data contains contradictory claims. The model may pick the statistically dominant one, which is not always the correct one.
  • Distributional mismatch: The question requires a type of reasoning or knowledge that differs from what the model encountered during training.4
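The mechanism behind these failure modes can be made concrete with a toy next-word predictor. This is a deliberately tiny stand-in for a real model (the three-sentence corpus and the bigram context are invented for illustration); the point is that generation is frequency lookup, with no notion of truth:

```python
# Toy illustration (not a real language model): generation picks the
# statistically most frequent continuation seen in "training", with
# no notion of whether that continuation is factually correct.
from collections import Counter

corpus = [
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of spain is madrid",
]

# Count which word follows each two-word context in the corpus.
continuations: dict[tuple[str, str], Counter] = {}
for sentence in corpus:
    words = sentence.split()
    for a, b, c in zip(words, words[1:], words[2:]):
        continuations.setdefault((a, b), Counter())[c] += 1

def predict(context: tuple[str, str]) -> str:
    """Return the most frequent continuation seen during training."""
    return continuations[context].most_common(1)[0][0]

print(predict(("france", "is")))  # paris -- correct, because it was seen
# For a country absent from the corpus, this toy model has no entry at
# all. A real LLM never hits a hard miss like that: it interpolates a
# plausible-sounding answer from nearby patterns -- a hallucination.
```

A real model smooths over every gap instead of failing loudly, which is exactly why knowledge gaps in the training data surface as confident fabrications rather than errors.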

Think of it like...

Imagine a translator who has read millions of documents in two languages but has never studied grammar rules or used a dictionary. Most of the time, their translations are excellent — pure pattern recognition. But occasionally they translate a word into something that sounds right in context but means something completely different. They cannot check their own work because they have no reference material, only internalised patterns.

2. Parametric vs non-parametric knowledge

This distinction is central to understanding hallucination. Parametric knowledge is what the model “knows” from training — information encoded in its billions of parameters (weights). Non-parametric knowledge is external information the model can access at inference time through retrieval systems like RAG.3

Parametric knowledge is fixed, compressed, and lossy. The model cannot tell you where it learned a fact, cannot verify it, and cannot update it without retraining. Non-parametric knowledge is dynamic, verifiable, and citable — the model retrieves a document, reads it, and bases its answer on that source.

Hallucination overwhelmingly comes from reliance on parametric knowledge. When a model generates an answer purely from its parameters, there is no external check. When it retrieves and cites a source document, the answer is grounded and verifiable.5

Key distinction

Parametric knowledge = what the model memorised during training (compressed, unverifiable, may be wrong). Non-parametric knowledge = what the model retrieves from external sources at query time (current, verifiable, citable). The shift from parametric to non-parametric knowledge is the core strategy behind RAG and knowledge graph integration.
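The two paths can be sketched in a few lines of code. Everything here is illustrative: `llm` is a hypothetical stub standing in for any text-generation call, and `search` stands in for any retrieval backend (a vector store, a keyword index, a knowledge graph query):

```python
def llm(prompt: str) -> str:
    """Stand-in for a real language-model call (hypothetical stub)."""
    return "<model output for: " + prompt.splitlines()[0] + ">"

def answer_parametric(question: str) -> str:
    # Parametric path: the answer comes solely from the model's weights.
    # No source, no citation, no way to verify where it came from.
    return llm(question)

def answer_grounded(question: str, search) -> str:
    # Non-parametric path: retrieve trusted documents first, then
    # instruct the model to answer only from them.
    docs = search(question)  # e.g. a vector-store or keyword lookup
    context = "\n\n".join(d["text"] for d in docs)
    prompt = (
        "Answer using only the sources below. If they do not contain "
        "the answer, say so.\n\nSources:\n" + context
        + "\n\nQuestion: " + question
    )
    return llm(prompt)  # the answer is now checkable against `docs`

docs = [{"text": "Paris has been the capital of France since 987."}]
print(answer_grounded("What is the capital of France?", lambda q: docs))
```

The structural difference is what matters: in the grounded path there is an artifact (`docs`) that a human or a downstream check can compare the answer against; in the parametric path there is nothing to compare against at all.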

3. Why it matters

Hallucination is not just an academic concern. It creates real-world consequences:

  • Trust: Users who encounter a confident but wrong answer lose trust in the entire system. One fabricated citation can undermine a tool’s credibility.2
  • Safety: In medical, legal, or financial contexts, a hallucinated fact can lead to harmful decisions. A fabricated drug interaction or a non-existent legal precedent is not just wrong — it is dangerous.
  • Liability: As AI-generated content enters professional workflows, the question of who is responsible for hallucinated content becomes a legal and ethical issue.4
  • Compounding errors: In agentic systems, a hallucinated fact in one step can propagate through subsequent steps, with each step building on the fabricated foundation. An agent that hallucinates a data point, uses it to make a decision, and acts on that decision can cause cascading failures.3

4. Architectural mitigations

Hallucination cannot be eliminated, but its rate and impact can be dramatically reduced through architectural design:3

Retrieval-Augmented Generation (RAG): Instead of relying on parametric knowledge, the system retrieves relevant documents from a trusted corpus and includes them in the prompt. The model bases its answer on the retrieved text, reducing reliance on compressed, lossy memory.5

Knowledge graphs: Structured databases of verified facts and relationships. When combined with RAG (a pattern called GraphRAG), knowledge graphs provide not just relevant text but verified, structured data. One study of ontology-grounded knowledge graphs in clinical question answering reported hallucination rates as low as 1.7%, compared with significantly higher rates for ungrounded models.6

Structured output: Constraining the model’s output format (JSON schemas, predefined fields, enumerated options) reduces the space for fabrication. A model that must select from a list of verified options cannot hallucinate a non-existent option.
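A minimal sketch of that constraint, using an invented routing schema (the field and department names are assumptions for illustration): the validator rejects any value outside the verified set, so a fabricated option cannot slip through.

```python
# Sketch: constrain model output to a closed, verified vocabulary so a
# fabricated value cannot pass validation. The routing schema and the
# department names are invented for illustration.
ALLOWED_DEPARTMENTS = {"billing", "shipping", "returns", "technical"}

def validate_routing(model_output: dict) -> dict:
    """Accept the model's routing decision only if it uses a verified
    option; otherwise reject it (and, typically, re-prompt the model)."""
    dept = model_output.get("department")
    if dept not in ALLOWED_DEPARTMENTS:
        raise ValueError(f"rejected fabricated department: {dept!r}")
    return model_output

validate_routing({"department": "billing"})  # passes validation
# validate_routing({"department": "complaints"})  # would raise ValueError
```

In production this role is usually played by a JSON-schema or enum validator on the model's structured output; the principle is the same, shrinking the output space until fabrication has nowhere to land.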

Human-in-the-loop review: For high-stakes applications, human verification remains the most reliable check. The system generates a draft; a human verifies facts, citations, and claims before the output is used.
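One way to sketch that gate in code. The scalar risk score and the threshold value are illustrative assumptions, not a standard API; the point is the routing decision, not the scoring:

```python
# Sketch of a human-in-the-loop gate: drafts whose estimated risk
# crosses a threshold are held for human review instead of being
# published automatically. Threshold and risk score are assumptions.
from dataclasses import dataclass, field

@dataclass
class ReviewGate:
    threshold: float = 0.3
    pending: list = field(default_factory=list)

    def route(self, draft: str, risk: float) -> str:
        if risk >= self.threshold:
            self.pending.append(draft)   # a human verifies facts first
            return "queued_for_review"
        return "published"               # low stakes: ship directly

gate = ReviewGate()
print(gate.route("Slogan brainstorm", risk=0.1))         # published
print(gate.route("Drug interaction summary", risk=0.9))  # queued_for_review
```

The design choice to note: the gate does not try to detect hallucination itself (which is hard); it routes by stakes, so the expensive human check is spent where a fabrication would do the most damage.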

Concept to explore

See rag for how retrieval-augmented generation works as the primary architectural defence against hallucination.


Why do we use it?

Key reasons to understand hallucination

1. It drives architectural decisions. Understanding hallucination explains why systems like RAG, knowledge graphs, and guardrails exist. You cannot design reliable AI without understanding the failure mode they are designed to prevent.3

2. It calibrates trust. Knowing that language models hallucinate changes how you evaluate their output. You verify claims, check citations, and design workflows that include validation steps — not because the model is bad, but because statistical prediction is not the same as knowledge retrieval.1

3. It informs risk assessment. The consequences of hallucination vary by domain. In creative writing, a hallucinated detail is harmless. In medical diagnosis, it could be lethal. Understanding hallucination lets you match the level of mitigation to the level of risk.


When do we use it?

  • When evaluating whether an AI system is reliable enough for a given use case
  • When designing an agentic system that must produce factually accurate output
  • When choosing between architectures (pure LLM vs RAG vs knowledge-grounded systems)
  • When building trust with users by being transparent about AI limitations
  • When auditing AI-generated content for accuracy before publication or action

Rule of thumb

If the cost of a wrong answer is low (brainstorming, creative writing, drafting), hallucination is a manageable nuisance. If the cost of a wrong answer is high (medical advice, legal analysis, financial decisions, published facts), hallucination is a failure mode you must design against architecturally, not just hope to avoid.


How can I think about it?

The confident bluffer

Imagine a dinner party guest who has read widely but remembers imprecisely. When the conversation turns to a topic they half-know, they do not pause and say “I’m not sure.” Instead, they confidently state a fact, name a source, and move on. Most of the time they are roughly right — their wide reading gives them good instincts. But sometimes they are dead wrong, and their confidence makes the error more damaging, because everyone at the table believes them.

  • The dinner party = the user interaction
  • The guest’s wide reading = the model’s training data
  • Their imprecise memory = compressed parametric knowledge
  • Their confident delivery = the model’s inability to express genuine uncertainty
  • A sceptical listener who fact-checks claims = a RAG system or human reviewer

The solution is not to stop inviting the guest (the model is still enormously useful). It is to have a fact-checker at the table.

The autocomplete that went too far

Think about your phone’s autocomplete. It predicts the next word based on what you have typed and what it has seen in millions of other messages. Most predictions are useful. But sometimes it suggests something absurd — it has no understanding of what you mean, only what statistically comes next.

Now scale that up by a billion. A language model is autocomplete operating at the level of entire paragraphs and documents. It predicts the most likely next word, thousands of times in sequence, and the result is usually coherent and accurate. But occasionally the statistical prediction diverges from factual reality, and the model has no mechanism to notice — because it was never checking facts in the first place. It was always just predicting the next word.

  • Autocomplete = the model’s generation mechanism
  • Your message history = the training data
  • A bizarre autocomplete suggestion = a hallucination
  • You reviewing before hitting send = human-in-the-loop validation

Concepts to explore next

  • rag — Retrieving external knowledge to ground model outputs (stub)
  • guardrails — Constraints that prevent agents from producing harmful output (stub)
  • human-in-the-loop — When and how to involve humans in agent decision-making (complete)
  • knowledge-graphs — Structured databases of verified facts and relationships (stub)

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.



Where this concept fits

Position in the knowledge graph

graph TD
    AIML[AI and Machine Learning] --> HAL[Hallucination]
    AIML --> KE[Knowledge Engineering]
    AIML --> AS[Agentic Systems]
    HAL -.->|motivates| RAG[RAG]
    HAL -.->|motivates| GR[Guardrails]
    HAL -.->|motivates| HITL[Human-in-the-Loop]
    HAL -.->|motivates| KG[Knowledge Graphs]
    style HAL fill:#4a9ede,color:#fff

Related concepts:

  • rag — the primary architectural pattern for grounding model outputs in retrieved documents, directly motivated by hallucination
  • guardrails — constraints and validation layers that catch hallucinated content before it reaches users
  • human-in-the-loop — human review as the most reliable (and most expensive) check against hallucination
  • knowledge-graphs — structured knowledge that provides verified, relationship-aware facts to reduce reliance on parametric memory



Footnotes

  1. Kalai, A. T., Nachum, O., Vempala, S. S., & Zhang, E. (2025). Why Language Models Hallucinate. OpenAI.

  2. Frontiers in AI. (2025). Survey and Analysis of Hallucinations in Large Language Models: Attribution to Prompting Strategies or Model Behavior. Frontiers in Artificial Intelligence.

  3. Semantic Scholar. (2025). Mitigating Hallucination in Large Language Models: An Application-Oriented Survey on RAG, Reasoning, and Agentic Systems. arXiv.

  4. Gupta, A. (2026). How to Reduce LLM Hallucinations with Better Grounding and Decoding. Elgorythm.

  5. Puccetti, G., Rogers, A., Alzetta, C., Dell'Orletta, F., & Esuli, A. (2025). Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective. ScienceDirect.

  6. PubMed. (2026). Ontology-Grounded Knowledge Graphs for Mitigating Hallucinations in Large Language Models for Clinical Question Answering. Journal of Biomedical Informatics.