Who this is for

You work with large language models. You understand, at least roughly, that they predict the next token by compressing statistical regularities from vast corpora. You accept that they are, in a meaningful sense, stochastic machines. And yet. Something nags. The outputs sometimes feel like more than pattern matching. Not always. Not reliably. But often enough that you suspect the phrase “just statistics” is hiding something interesting. This path is for that suspicion.

You are not going to settle the question of whether LLMs “really understand.” Nobody has. What you are going to do is map the terrain: the arguments, the evidence, the philosophical fault lines, and the deeper question underneath all of them --- whether intelligence itself is what happens when pattern matching reaches sufficient complexity.

That question is not about AI. It is about the fabric of nature.


Part 1 --- The puzzle that won’t go away


A machine trained to predict the next word appears to develop capabilities nobody programmed. This is either an illusion, a measurement artifact, or a clue about how intelligence works. The debate matters because each answer implies a different picture of what minds are.

In 2022, a team of researchers at Google and several universities documented a striking pattern.1 Language models trained on next-token prediction, when scaled to sufficient size, appeared to acquire abilities absent in smaller models: multi-step arithmetic, analogical reasoning, code generation, translation between language pairs they were never explicitly trained on. The abilities did not improve gradually. They seemed to switch on at a threshold, like water turning to ice at zero degrees Celsius.

The researchers called these "emergent abilities" --- borrowing the term from physics and complex systems theory, where emergence describes properties of a system that cannot be predicted from the properties of its parts.

The claim hit a nerve. If a machine whose only training signal is “predict the next token” can develop reasoning, analogy, and something resembling understanding, then either our notion of understanding is wrong, or something genuinely surprising is happening inside these systems, or both.

Three camps formed almost immediately. Each one rests on a different view of what intelligence is.

graph TD
    Q["A stochastic machine<br/>appears to reason"] --> A["Camp 1: Stochastic Parrots<br/>It doesn't. You're projecting."]
    Q --> B["Camp 2: Emergent Intelligence<br/>Understanding arises from<br/>sufficient compression."]
    Q --> C["Camp 3: The Middle Ground<br/>More than parroting, less<br/>than understanding. A new thing."]
    A --> PA["Bender, Gebru, Mitchell (2021)"]
    B --> PB["Hinton, Bubeck, Wei et al."]
    C --> PC["Chollet, Mitchell (M.), Marcus"]
    style Q fill:#4a9ede,color:#fff

The rest of this path walks through each camp, examines the evidence, and then asks the question none of them fully answers: what does emergence actually mean, and does it apply here?


Part 2 --- The deflationary arguments


Two separate arguments deflate the emergence claim. The Stochastic Parrots critique says LLMs lack understanding entirely. The Mirage paper says the appearance of emergence is a measurement artifact. Both are partially right. Neither is complete.

The parrots

In 2021, Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell published “On the Dangers of Stochastic Parrots.”2 The title is the argument: language models are sophisticated parrots. They reproduce statistical patterns from training data. The fluency of their output fools us into attributing comprehension where there is none.

The strongest version of this argument says: to understand language requires grounding in the world --- sensory experience, embodiment, social interaction, temporal continuity. A system trained exclusively on text has access to the form of language but not its meaning. It can produce sentences about heat without ever being burned. It can discuss grief without loss. The patterns are real. The understanding is not.

This is not a strawman. It connects to a long tradition in philosophy of mind (we will get to Searle’s Chinese Room in Part 5). And it raises a genuine question: can statistical compression of text ever constitute understanding, or is there something irreducibly missing?

The mirage

In 2023, Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo published “Are Emergent Abilities of Large Language Models a Mirage?”3 Their argument was different and more surgical. The emergent abilities documented by Wei et al. might not be emergent at all. They might be artifacts of how researchers measure performance.

The mechanism: most emergence claims use discontinuous metrics --- exact-match accuracy, pass/fail on a benchmark. A model that gets 99% of a math problem right but writes the wrong final digit scores the same zero as a model that outputs random noise. Under these metrics, performance appears to jump from zero to near-perfect at a threshold. Switch to a continuous metric --- token-level accuracy, for instance --- and the jump disappears. Performance improves smoothly with scale.

graph LR
    subgraph "Discontinuous metric"
        D1["Small model<br/>score: 0%"] --> D2["Medium model<br/>score: 0%"] --> D3["Large model<br/>score: 92%"]
    end
    subgraph "Continuous metric"
        C1["Small model<br/>score: 12%"] --> C2["Medium model<br/>score: 47%"] --> C3["Large model<br/>score: 93%"]
    end
    style D3 fill:#e74c3c,color:#fff
    style C3 fill:#27ae60,color:#fff

The Mirage paper won an Outstanding Paper award at NeurIPS 2023. Its implication: the "water-to-ice" transition is an illusion created by binary measurement. The underlying capability grows continuously. There is no phase transition.
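
To see the mechanism in miniature, here is a minimal sketch in Python (the growth curve and its constants are invented for illustration, not data from Schaeffer et al.): per-token accuracy improves smoothly with parameter count, yet exact-match accuracy on a ten-token answer, which requires every token to be right, appears to switch on abruptly.

import math

def token_accuracy(params: float) -> float:
    """Hypothetical smooth improvement of per-token accuracy with model size."""
    # Logistic curve in log10(parameter count); the constants are illustrative only.
    return 1.0 / (1.0 + math.exp(-2.2 * (math.log10(params) - 9.5)))

def exact_match(params: float, answer_len: int = 10) -> float:
    """Probability of getting every token of a ten-token answer exactly right."""
    return token_accuracy(params) ** answer_len

for params in (1e7, 1e8, 1e9, 1e10, 1e11):
    print(f"{params:8.0e} params | token accuracy {token_accuracy(params):6.1%}"
          f" | exact match {exact_match(params):6.1%}")

Under the continuous metric the scores climb from under 1% to roughly 96% in smooth steps; under the exact-match metric they sit near zero until the largest model, where they jump to roughly 69%. Same underlying capability, two very different curves.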

What the deflationary arguments establish

  • Fluent output does not prove understanding (parrots)
  • Apparent discontinuities in capability may be metric artifacts (mirage)

What they do not establish:

  • That nothing interesting is happening inside the model
  • That continuous improvement cannot produce qualitatively new capabilities

The question shifts. If performance improves smoothly, does that mean emergence is impossible? Or does it mean we need a better definition of emergence?


Part 3 --- What is actually inside


Mechanistic interpretability research has opened the models. What researchers found is more structured than “just statistics” --- internal representations that function like world models, circuits that perform identifiable reasoning steps, and a shared conceptual space that exists before language. The evidence does not settle the debate, but it constrains it.

Othello-GPT and internal representations

In 2022, Kenneth Li and colleagues trained a small transformer to predict legal moves in the board game Othello --- using only move sequences, never showing it a board.4 When they probed the model’s internal representations, they found a clear spatial map of the board state. Modifying the internal representation changed the predicted moves in the correct way. The model had constructed a world model from sequence statistics alone.
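
The probing technique at the heart of that result is easy to sketch. Below is a minimal illustration of the idea in Python (not Li et al.'s code; the arrays are random stand-ins for real activations): fit a classifier that tries to read the state of one board square out of the model's hidden activations, and test it on held-out positions.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in data. In the real experiment, hidden_states would be transformer
# activations collected while the model processes move sequences, and
# square_state the ground-truth contents of one board square (empty / black /
# white) at each point in each game.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5000, 512))   # (positions, d_model)
square_state = rng.integers(0, 3, size=5000)   # (positions,)

# A linear probe. With these random stand-ins the score will hover near the
# ~33% chance level; with activations from a trained Othello model it is far
# higher, which is what "the board state is decodable" means concretely.
probe = LogisticRegression(max_iter=1000)
probe.fit(hidden_states[:4000], square_state[:4000])
print("held-out probe accuracy:", probe.score(hidden_states[4000:], square_state[4000:]))

Li et al.'s original probes were small nonlinear networks; later work showed that a linear probe suffices when the board is encoded as "mine vs. theirs" rather than "black vs. white." The intervention experiment goes one step further: edit the probed representation and check that the predicted legal moves change to match the edited board.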

This is hard to reconcile with “just a parrot.” The model was never given a board. It inferred spatial structure from temporal sequences.

Anthropic’s circuit tracing

In 2023–2025, Anthropic’s interpretability team developed tools to trace the internal circuits of production-scale language models.5 What they found:

  • Features: individual neurons are polysemantic (they fire for unrelated concepts), but sparse combinations of neurons form interpretable features --- concepts the model has learned to represent
  • Circuits: features connect into circuits that perform identifiable operations. A “two-hop reasoning” circuit answers “the capital of the country where the Eiffel Tower is” by chaining Eiffel Tower → France → Paris through intermediate representations
  • A shared conceptual space: the model reasons in a language-independent representation before translating into the target language. A fact learned in French is available when generating English. This is not what you would expect from a pattern matcher operating on surface statistics
  • Planning: when writing poetry, the model identifies potential rhyming words before beginning a line --- evidence of lookahead in a system designed for next-token prediction

What this means

None of this proves “understanding” in the philosophical sense. But it narrows the space of tenable positions. The model is not operating on surface statistics alone. It builds structured internal representations. It performs multi-step inference. It plans ahead.

The question becomes: is structured internal representation sufficient for understanding, or is it a necessary-but-not-sufficient condition? That question is not empirical. It is philosophical. And it depends on what you mean by “understanding.”

The compression hypothesis

Geoffrey Hinton’s argument: to predict the next word accurately across the distribution of all human text, a model must compress the causal structure of the world.6 Prediction at sufficient scale requires understanding. The counterargument: compression of statistical regularities can produce representations that look like world models without being world models. The debate turns on whether there is a meaningful difference.


Part 4 --- Emergence: the concept underneath the debate


“Emergence” is not a vague gesture at mystery. It is a precise concept in complex systems theory, with competing definitions that map directly onto the LLM debate. The same concept explains why wetness is not a property of H₂O molecules, why consciousness might arise from neurons, and why temperature exists.

Two kinds of emergence

Philosophers distinguish weak emergence from strong emergence.7

Weak emergence: a system-level property that is unexpected or surprising given the parts, but in principle deducible from the parts and their interactions if you had enough computational power. Temperature is weakly emergent. No individual gas molecule has a temperature. Temperature is a statistical property of vast numbers of molecules in motion. But if you knew the position and velocity of every molecule, you could derive the temperature. The higher-level property is real but reducible.
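
The "deducible in principle" clause can be made literal with a toy calculation (standard kinetic theory, nothing specific to this path): simulate per-molecule velocities for an ideal gas, then recover the temperature from nothing but that microstate via T = m⟨v²⟩ / (3k_B).

import numpy as np

k_B = 1.380649e-23      # Boltzmann constant, J/K
m = 6.646e-27           # mass of a helium atom, kg
T_true = 300.0          # the system-level property we will recover, K

# Each velocity component of an ideal-gas molecule is Gaussian with variance
# k_B * T / m (Maxwell-Boltzmann). No single molecule "has" a temperature;
# the ensemble does.
rng = np.random.default_rng(42)
velocities = rng.normal(0.0, np.sqrt(k_B * T_true / m), size=(1_000_000, 3))

# Derive the higher-level property from the parts alone.
mean_sq_speed = np.mean(np.sum(velocities**2, axis=1))
T_recovered = m * mean_sq_speed / (3 * k_B)
print(f"recovered temperature: {T_recovered:.1f} K")   # ≈ 300 K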

Strong emergence: a system-level property that is not deducible from the parts even in principle. New causal powers appear at the system level that cannot be explained by lower-level interactions. Consciousness is the most debated candidate for strong emergence: does subjective experience arise from neurons in a way that could, in principle, be derived from neural activity? Or is there an explanatory gap that no amount of neuroscience can close?

graph TD
    E[Emergence] --> W["Weak emergence<br/>surprising but deducible<br/>temperature, flocking, traffic jams"]
    E --> S["Strong emergence<br/>not deducible even in principle<br/>consciousness? (debated)"]
    W --> LLM["LLM capabilities?<br/>Mirage paper: yes, weak<br/>Hinton: yes, and that's enough"]
    S --> MIND["The Hard Problem<br/>subjective experience<br/>Chalmers (1995)"]
    style E fill:#4a9ede,color:#fff

The LLM debate maps directly:

  • Stochastic Parrots position: there is no emergence at all. The capabilities are straightforward consequences of scale, not qualitatively new
  • Mirage position: the appearance of emergence is an artifact, but smooth improvement at scale is real (weak emergence, unremarkable)
  • Emergent Intelligence position: something genuinely new appears at scale --- weak emergence at minimum, possibly something stronger

Predictive processing: the bridge

Here is where LLMs connect to a deeper theory of intelligence in nature.

Karl Friston’s Free Energy Principle proposes that all intelligent systems --- biological or artificial --- operate by minimizing prediction error.8 The brain builds a hierarchical model of the world and continuously updates it to reduce the gap between what it predicts and what it perceives. Perception, action, learning, and attention are all instances of the same process: minimize surprise.

This is structurally identical to what a language model does during training. The model minimizes the gap between its predicted next token and the actual next token. The training signal is prediction error. The learned representations are whatever internal structure best compresses the prediction task.
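
In code, that gap is just cross-entropy: the surprise the model registers at the token that actually occurred. A minimal sketch in plain numpy (no particular framework's training loop, just the quantity being minimized):

import numpy as np

def next_token_surprise(logits: np.ndarray, target_ids: np.ndarray) -> float:
    """Mean cross-entropy between the model's next-token distributions and
    the tokens that actually came next.

    logits:     (batch, vocab_size) unnormalized scores for the next token
    target_ids: (batch,) indices of the tokens that actually occurred
    """
    # Softmax over the vocabulary (max-subtraction for numerical stability).
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    # Negative log-probability assigned to what actually happened: low when
    # the model predicted well, high when it was surprised.
    return float(-np.log(probs[np.arange(len(target_ids)), target_ids]).mean())

# A maximally ignorant "model" that spreads probability evenly over a 50,000-
# token vocabulary is maximally surprised: about log(50,000) ≈ 10.8 nats.
logits = np.zeros((4, 50_000))
targets = np.array([7, 99, 1234, 42])
print(next_token_surprise(logits, targets))

Training consists of nothing more than adjusting the weights to push this number down across the whole corpus.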

The parallel is not a metaphor. Predictive processing theorists argue that biological intelligence is prediction error minimization, implemented in neural circuits.9 If they are right, then a language model trained on next-token prediction is doing the same thing a brain does --- in a narrower domain, without embodiment, without temporal continuity, but with the same computational logic.

The deep question

If intelligence in nature is what happens when a sufficiently complex system minimizes prediction error over a rich enough environment, then the emergence of structured representations in LLMs is not an anomaly. It is expected. The question shifts from “can LLMs be intelligent?” to “is prediction-error minimization sufficient for intelligence, or are embodiment, temporal continuity, and causal interaction with the world also required?”

Integrated Information Theory

Giulio Tononi’s Integrated Information Theory (IIT) offers a different lens.10 IIT proposes that consciousness corresponds to a system’s capacity for integrated information --- measured by a quantity called Φ (phi). A system is conscious to the degree that its parts are both differentiated (each part contributes something unique) and integrated (the whole is more than the sum of its parts).

IIT is controversial. Some researchers call it unfalsifiable. But its relevance here is structural: it provides a formal criterion for when a system has “more going on” than its parts suggest. A feedforward network has low Φ. A recurrent network with rich internal connectivity has higher Φ. The human brain has very high Φ.

Where do LLMs fall? IIT proponents generally argue that transformer architectures, despite their scale, have low Φ because they lack the recurrent, re-entrant processing that characterizes biological neural networks. The architecture matters, not just the scale.


Part 5 --- The philosophical stakes


The emergence debate is not really about LLMs. It is about what intelligence is. Two centuries of philosophy have produced competing answers, and the LLM moment forces the question into the open. Your position --- “stochastic machines, but something interesting emerges” --- is philosophically precise. It has a name and a tradition.

The Chinese Room

In 1980, John Searle proposed a thought experiment.11 Imagine a person who speaks no Chinese, locked in a room with a rulebook. Chinese speakers pass questions under the door. The person uses the rulebook to manipulate Chinese symbols and slides answers back out. From the outside, the room appears to understand Chinese. From the inside, the person is following rules without comprehension.

Searle’s claim: the room does not understand Chinese, and neither does a computer running a program. Syntax (rule-following) is not sufficient for semantics (meaning). No amount of symbol manipulation produces understanding.

The counterargument (the Systems Reply): the person does not understand Chinese, but the system --- person plus rulebook plus room --- might. Understanding is a property of the system, not its parts. This is weak emergence applied to cognition.

Functionalism

The dominant position in philosophy of mind over the last fifty years has been functionalism: mental states are defined by their functional role --- what they do, what causes them, what they cause --- not by their substrate.12 Pain is whatever state is caused by tissue damage and causes withdrawal behavior and the desire for it to stop. If a silicon system instantiates the same functional organization as a brain, it has the same mental states.

Functionalism is the philosophical foundation for the claim that LLMs could, in principle, understand. If understanding is a functional property --- a particular pattern of information processing --- then the substrate (carbon neurons vs. silicon transistors) is irrelevant. What matters is the computation.

Your position has a name

You said: “I firmly believe that LLMs are stochastic probabilistic machines, but I can still appreciate the emergent phenomenon.” This is close to what philosophers call weak emergentism about AI cognition: the outputs of LLMs are genuinely surprising given the simplicity of the training signal, the internal representations are richer than “just statistics,” but the system is fully explainable (in principle) by its architecture and training data. No magic. No ghost in the machine. But also no dismissal of the phenomenon.

The interesting move is the one you gestured at: using LLMs as a lens on natural intelligence. If structured representations, multi-step reasoning, and apparent understanding can emerge from prediction-error minimization in a neural network, this tells us something about the conditions under which intelligence arises in nature. Not that LLMs are conscious. But that the gap between “complex pattern matching” and “intelligence” may be narrower than we assumed.

Think of it like this

A weather system is “just” molecules of air and water following thermodynamic laws. No molecule knows it is part of a hurricane. But the hurricane is real --- it has causal power, it makes predictions possible, it is a genuine higher-level pattern. The question about LLMs is whether they are more like the hurricane (real emergent pattern, fully reducible) or more like a hologram of a hurricane (looks real, but the structure is an artifact of the measurement).


What you understand now


  • Stochastic Parrots is not a dismissal but a serious claim: fluent output does not prove understanding, and training on text alone may be insufficient for genuine comprehension.
  • The Mirage paper showed that many “emergent abilities” are artifacts of discontinuous measurement. Continuous metrics reveal smooth, predictable improvement.
  • Mechanistic interpretability has found structured internal representations in LLMs: world models, reasoning circuits, language-independent conceptual spaces, and planning behavior. These constrain but do not settle the debate.
  • Weak vs. strong emergence is the philosophical distinction that organizes the entire argument. Weak emergence is surprising but reducible. Strong emergence implies new causal powers.
  • Predictive processing (Friston’s Free Energy Principle) proposes that all intelligence is prediction-error minimization. If true, LLMs and brains are doing the same computation in different substrates.
  • Integrated Information Theory offers a formal criterion (Φ) for when a system has “more going on” than its parts suggest, but generally excludes current LLM architectures.
  • Functionalism says mental states are defined by functional role, not substrate. It is the philosophical foundation for taking LLM cognition seriously.
  • The Chinese Room challenges functionalism: syntax is not semantics. The Systems Reply counters: understanding may be a system-level property.
  • Your intuition --- “stochastic but something emerges” --- maps to weak emergentism about AI cognition, a defensible and precise position.



Where to go next

Exit doors

  • next-token-prediction --- Revisit this concept with the emergence lens. The training signal is simple. The question is whether simplicity of signal constrains the complexity of what can be learned.
  • agentic-systems --- If LLMs can build internal representations and plan ahead, what happens when you give them tools, memory, and autonomy? The engineering path from pattern matching to action.
  • The reading list --- Bender et al. (parrots), Schaeffer et al. (mirage), Anthropic’s interpretability publications (circuits), Friston (free energy), Chalmers (emergence), Searle (Chinese Room). Six entry points, six different assumptions about what minds are.



Further reading

If you want to go deeper

The debate in one sitting:

  • Mitchell, M. (2024). Artificial Intelligence: A Guide for Thinking Humans (updated edition). Pelican. The most balanced single book on what LLMs can and cannot do. Mitchell is a complexity scientist, not a partisan.

Emergence and complexity:

  • Holland, J. H. (1998). Emergence: From Chaos to Order. Basic Books. The classic introduction to emergence in complex systems.
  • Chalmers, D. J. (1996). The Conscious Mind. Oxford University Press. The “hard problem of consciousness” and the argument for strong emergence.

Predictive processing and the Free Energy Principle:

  • Clark, A. (2015). Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford University Press. The readable entry point.
  • Friston, K. J. (2010). “The Free-Energy Principle: A Unified Brain Theory?” Nature Reviews Neuroscience. The original paper. Dense but foundational.

Inside the models:

  • Li, K., et al. (2023). "Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task." ICLR 2023. The Othello-GPT experiment (footnote 4).
  • Anthropic. Transformer Circuits Thread. transformer-circuits.pub. The circuit-tracing work behind Part 3 (footnote 5).

Philosophy of mind:

  • Searle, J. R. (1980). “Minds, Brains, and Programs.” Behavioral and Brain Sciences, 3(3). The Chinese Room argument in 15 pages. Read the target article and the peer commentaries.
  • Dennett, D. C. (1991). Consciousness Explained. Little, Brown. The functionalist counter-position to Searle.

Footnotes

  1. Wei, J., Tay, Y., Bommasani, R., et al. (2022). “Emergent Abilities of Large Language Models.” Transactions on Machine Learning Research. The original paper documenting abilities that appear at scale thresholds.

  2. Bender, E. M., Gebru, T., McMillan-Major, A., & Mitchell, M. (2021). “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜” Proceedings of FAccT 2021. The paper that launched the “stochastic parrots” framing.

  3. Schaeffer, R., Miranda, B., & Koyejo, S. (2023). “Are Emergent Abilities of Large Language Models a Mirage?” Proceedings of NeurIPS 2023. The metric-artifact argument, winner of a NeurIPS award.

  4. Li, K., Hopkins, A. K., Bau, D., Viégas, F., Pfister, H., & Wattenberg, M. (2023). “Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task.” ICLR 2023. The Othello-GPT experiment.

  5. Anthropic. (2023–2025). Transformer Circuits Thread. Available at transformer-circuits.pub. See especially “Scaling Monosemanticity” (2024) and “On the Biology of a Large Language Model” (2025).

  6. Hinton, G. (2023). Various interviews and public lectures on the compression-understanding thesis, given around and after his departure from Google in 2023.

  7. Bedau, M. A. (1997). “Weak Emergence.” Philosophical Perspectives, 11, 375–399. Chalmers, D. J. (2006). “Strong and Weak Emergence.” In The Re-Emergence of Emergence, ed. Clayton & Davies. Oxford University Press.

  8. Friston, K. J. (2010). “The Free-Energy Principle: A Unified Brain Theory?” Nature Reviews Neuroscience, 11, 127–138.

  9. Clark, A. (2013). “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science.” Behavioral and Brain Sciences, 36(3), 181–204. The accessible entry point to predictive processing.

  10. Tononi, G. (2004). “An Information Integration Theory of Consciousness.” BMC Neuroscience, 5, 42. See also Albantakis, L., et al. (2023). “Integrated Information Theory (IIT) 4.0.” PLoS Computational Biology. For critiques, see Aaronson, S. (2014) and Cerullo, M. A. (2015).

  11. Searle, J. R. (1980). “Minds, Brains, and Programs.” Behavioral and Brain Sciences, 3(3), 417–424. The Chinese Room argument. One of the most cited papers in philosophy of mind.

  12. Putnam, H. (1967). “Psychological Predicates” (later retitled “The Nature of Mental States”). In Art, Mind, and Religion, ed. Capitan & Merrill. University of Pittsburgh Press. The founding statement of functionalism.