Embeddings

Numerical representations of text, images, or audio as lists of numbers (vectors) in high-dimensional space, where similar meanings are placed close together and different meanings are placed far apart.


What is it?

Computers do not understand words the way humans do. To a computer, the word “dog” is just a sequence of characters — d, o, g — with no inherent meaning. It has no idea that “dog” is closer in meaning to “puppy” than to “democracy.” Embeddings solve this problem by translating meaning into numbers.1

An embedding is a vector — a list of numbers — that represents a piece of content (a word, a sentence, a paragraph, an image) in a way that captures its meaning. The key property is geometric: items with similar meanings end up as vectors that are close together in space, while items with different meanings end up far apart.2 The word “king” and the word “queen” would have vectors that are near each other, because they share many semantic properties (royalty, authority, leadership). The word “bicycle” would be far away from both.

These vectors are produced by embedding models — neural networks trained on massive amounts of text (or images, or audio) to learn which concepts are related and how. The model does not follow hand-written rules about meaning; it learns patterns from data. After training, you pass any text into the model, and it returns a vector — typically a list of hundreds or thousands of numbers — that encodes the meaning of that text.3

The parent concept, machine-readable-formats, covers how data is structured for machines to process. Embeddings are a specific kind of machine-readable format: instead of encoding data as key-value pairs (JSON) or rows and columns (CSV), they encode meaning as coordinates in a mathematical space. This makes them uniquely powerful for tasks where you need to compare meanings rather than match exact strings.

In plain terms

Embeddings are like GPS coordinates for meaning. Just as GPS turns a physical location (“the Eiffel Tower”) into numbers (48.8584, 2.2945) that a computer can work with, an embedding turns a concept (“royal female leader”) into a list of numbers that captures what it means. Nearby coordinates mean nearby meanings.



How does it work?

1. Vectors — lists of numbers that encode meaning

A vector is simply an ordered list of numbers. A two-dimensional vector might look like [0.2, 0.8]. A real embedding vector from a modern model has hundreds or thousands of dimensions — for example, OpenAI’s text-embedding-3-small produces vectors with 1,536 numbers.3

Each number in the vector represents some learned aspect of meaning. Unlike a JSON key where you know exactly what "population": 140000 means, individual embedding dimensions do not have human-readable labels. Dimension 47 might partially encode “formality,” dimension 312 might partially encode “scientific domain” — but these are patterns the model learned, not categories a human defined. The meaning emerges from all the numbers taken together.2

Think of it like...

A colour code. The hex colour #FF6B35 means nothing if you look at each character individually, but together they specify an exact shade of orange. Similarly, each number in an embedding means little on its own, but together they specify an exact shade of meaning.
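A minimal Python sketch of the idea. The three-dimensional vectors and their values here are invented for illustration; real models return hundreds or thousands of dimensions per item:

```python
# Toy 3-dimensional "embeddings" -- invented values for illustration only.
# A real model such as text-embedding-3-small returns 1,536 numbers per input.
embeddings = {
    "king":    [0.9, 0.8, 0.1],   # strong on the invented "royalty"/"authority" axes
    "queen":   [0.9, 0.7, 0.2],   # close to "king": shared semantic properties
    "bicycle": [0.1, 0.0, 0.9],   # far from both: unrelated meaning
}

# Each embedding is just an ordered list of floats.
for word, vector in embeddings.items():
    print(word, len(vector), vector)
```

No single number is meaningful on its own; only the whole list together locates the word in the space.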

2. The geometry of meaning — close means similar

The power of embeddings comes from a simple geometric principle: distance equals difference in meaning.1

If you plot embedding vectors in space (imagining we could see hundreds of dimensions), you would find clusters. Words about cooking would cluster together. Words about finance would form their own cluster. Words about medicine would form another. And within each cluster, more closely related concepts would sit closer together — “sautéing” near “frying,” both near “cooking,” all far from “mortgage.”4

This clustering is not programmed by hand. It emerges from training. The embedding model reads billions of sentences and learns that “sautéing” and “frying” appear in similar contexts (near words like “pan,” “oil,” “heat”), so it places their vectors close together.
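The distance-equals-difference principle can be demonstrated in plain Python with invented vectors (real embeddings would come from a trained model):

```python
import math

def distance(a, b):
    """Straight-line (Euclidean) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented 4-dimensional vectors for illustration.
vectors = {
    "sauteing": [0.9, 0.8, 0.1, 0.0],
    "frying":   [0.8, 0.9, 0.2, 0.1],
    "mortgage": [0.0, 0.1, 0.9, 0.8],
}

# Within the cooking cluster the distance is small; to the finance
# term it is large.
print(distance(vectors["sauteing"], vectors["frying"]))   # small
print(distance(vectors["sauteing"], vectors["mortgage"])) # large
```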

3. How embedding models are created

An embedding model is a neural network trained on large amounts of text. The training process works roughly like this:3

  1. The model reads billions of sentences from books, websites, and articles
  2. It learns to predict which words appear near each other (context prediction)
  3. Words that frequently appear in similar contexts get similar vectors
  4. After training, the model can produce a vector for any new text it has never seen before

Modern embedding models (like those from OpenAI, Cohere, or open-source models on Hugging Face) go beyond individual words. They embed entire sentences or paragraphs, capturing the meaning of the full passage rather than just individual terms.3

Think of it like...

Learning a language by immersion. If you hear the word “gatto” every time someone points at a cat, pets a cat, or feeds a cat, you learn that “gatto” means cat — without anyone giving you a dictionary. Embedding models learn meaning the same way: by observing which words appear in which contexts, millions of times over.
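The context-learning idea can be sketched as a toy co-occurrence model in Python. This is a deliberate simplification: real models use neural context prediction rather than raw counts, and train on billions of sentences rather than three, but the principle is the same:

```python
from collections import defaultdict

# A tiny corpus -- real models train on billions of sentences.
corpus = [
    "heat oil in the pan before frying",
    "heat oil in the pan before sauteing",
    "the bank approved the mortgage loan",
]

# Steps 1-2: count which words appear near each other. "Near" is
# simplified here to "in the same sentence".
cooccurrence = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for word in words:
        for context in words:
            if word != context:
                cooccurrence[word][context] += 1

# Step 3: words that share contexts end up with similar count vectors.
vocab = sorted({w for s in corpus for w in s.split()})

def vector(word):
    return [cooccurrence[word][c] for c in vocab]

# "frying" and "sauteing" share the contexts heat/oil/pan/before;
# "mortgage" shares almost nothing with them.
print(vector("frying"))
print(vector("sauteing"))
print(vector("mortgage"))
```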

4. Semantic search vs keyword search

Traditional keyword search matches exact words. If you search for “how do I fix a broken window,” it looks for documents containing those exact terms. A document titled “Glass Repair Guide” might not match at all, because it does not contain the words “fix,” “broken,” or “window.”4

Embedding-based search (semantic search) works differently. It converts both the query and every document into vectors, then finds the documents whose vectors are closest to the query vector. “How do I fix a broken window” and “Glass Repair Guide” would have nearby vectors because they are about the same thing — even though they share no words in common.1

This is the foundation of rag (Retrieval-Augmented Generation), the pattern where an LLM retrieves relevant documents from a knowledge base before generating a response. The retrieval step uses embeddings to find documents that are semantically relevant to the user’s question, not just keyword matches.4
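A toy sketch of the retrieval step in Python. The vectors are invented stand-ins for a real embedding model's output; a production system would obtain them by calling an embedding API, then rank documents exactly as below:

```python
import math

def cosine_similarity(a, b):
    """Similarity of direction: 1.0 = identical, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented vectors standing in for a real embedding model's output.
doc_vectors = {
    "Glass Repair Guide":       [0.9, 0.8, 0.1],
    "Mortgage Rates Explained": [0.1, 0.0, 0.9],
}
# Pretend embedding of the query "how do I fix a broken window".
query_vector = [0.8, 0.9, 0.2]

# Semantic search: rank documents by vector similarity to the query.
ranked = sorted(doc_vectors,
                key=lambda d: cosine_similarity(query_vector, doc_vectors[d]),
                reverse=True)
print(ranked[0])  # the repair guide wins despite sharing no words with the query
```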

5. Dimensions and distance measures

Two common ways to measure how close two vectors are:2

  • Cosine similarity measures the angle between two vectors. A score of 1.0 means identical direction (identical meaning), 0.0 means orthogonal (no directional relationship), and -1.0 means opposite. This is the most common measure for text embeddings because it ignores vector length and focuses purely on direction.
  • Euclidean distance measures the straight-line distance between two points. Smaller distance means more similar. This is more intuitive geometrically but can be affected by vector magnitude.

In practice, most embedding-based search systems use cosine similarity: it is insensitive to vector length, and on normalised vectors it reduces to a single dot product, making it fast to compute.2
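Both measures take only a few lines of plain Python. The two-dimensional vectors below are invented to highlight the difference: they point the same way but have different lengths, so cosine similarity calls them identical while Euclidean distance does not:

```python
import math

def cosine_similarity(a, b):
    """1.0 = same direction; 0.0 = orthogonal; -1.0 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Same direction, different magnitude.
a = [1.0, 2.0]
b = [2.0, 4.0]

print(cosine_similarity(a, b))   # ~1.0: direction identical, length ignored
print(euclidean_distance(a, b))  # > 0: the magnitude difference still counts
```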

Key distinction

Embeddings represent meaning as geometry. Cosine similarity measures that geometry. Together, they let you answer the question “how similar are these two pieces of text?” with a number — no keyword matching required.


Why do we use it?

Key reasons

1. Semantic understanding. Embeddings let machines compare meanings, not just strings. “Automobile” and “car” are recognised as near-identical even though they share no characters. This is foundational for search, recommendation, and classification.1

2. Language-agnostic matching. Multilingual embedding models place “dog,” “chien,” and “Hund” near each other in vector space. You can search in English and find relevant documents written in French or German.3

3. Efficiency at scale. Comparing two vectors is a simple mathematical operation that takes microseconds. This makes it possible to search millions of documents in real time — something that would be impossible if every comparison required an LLM call.2

4. Foundation for RAG. Retrieval-Augmented Generation — the dominant pattern for grounding LLM responses in real data — depends entirely on embeddings to find the right documents to feed to the model.4


When do we use it?

  • When building semantic search that finds results by meaning, not just keywords
  • When implementing RAG to ground LLM responses in relevant documents from a knowledge base
  • When you need to classify or cluster text (group similar support tickets, detect duplicate questions)
  • When building recommendation systems (find articles similar to ones a user liked)
  • When you need to compare text across languages without translation

Rule of thumb

If the task requires understanding what text means rather than what words it contains, embeddings are almost certainly part of the solution.


How can I think about it?

The library with invisible shelving

Imagine a library where books are not shelved alphabetically or by genre, but by meaning. Books about cooking sit next to books about nutrition, which sit next to books about food science, which sit next to books about chemistry. A book about Italian cooking would be on the same shelf as a book about making pasta from scratch, even though their titles share no words.

  • Each book’s position = its embedding vector (coordinates in the library)
  • Nearby books = semantically similar content
  • Finding a book = computing the vector for your query and walking to that spot in the library
  • The shelving system = the embedding model that decided where to place each book
  • No card catalogue needed = no keyword index, because proximity is the index

This library would be useless for humans (you cannot see 1,536 dimensions), but it is exactly how a computer navigates a knowledge base using embeddings.

The colour wheel of language

Think of the colour wheel. Red, orange, and yellow are neighbours — they blend smoothly into each other. Red and green are on opposite sides — maximally different. You do not need to describe a colour in words to know how similar it is to another colour; you just check their positions on the wheel.

  • Each colour = a word or sentence
  • Position on the wheel = its embedding vector
  • Nearby colours blend = similar meanings cluster together
  • The wheel has many dimensions = a real embedding space has hundreds of axes, not just hue and saturation, capturing nuances like formality, domain, sentiment, and topic simultaneously
  • Mixing colours = vector arithmetic (king - man + woman = queen is like mixing hues to get a new colour)5

Embeddings extend the colour wheel idea to language: every piece of text gets a position, and you navigate meaning by moving through the space.
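The vector-arithmetic idea can be made concrete with invented three-dimensional vectors. The axes here are hand-picked (roughly "royalty", "male", "female") purely for illustration; real spaces have hundreds of learned axes, and the famous analogy holds only approximately in practice:

```python
# Invented 3-dimensional vectors; axes roughly [royalty, male, female].
king  = [0.9, 0.8, 0.1]
man   = [0.1, 0.8, 0.1]
woman = [0.1, 0.1, 0.8]
queen = [0.9, 0.1, 0.8]

# king - man + woman: strip the "male" component, add the "female" one.
result = [k - m + w for k, m, w in zip(king, man, woman)]
print(result)  # lands near the vector for "queen"
```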


Concepts to explore next

| Concept | What it covers | Status |
| --- | --- | --- |
| rag | Retrieving relevant documents via embeddings before generating a response | stub |
| vector-databases | Specialised databases optimised for storing and querying embedding vectors at scale | stub |
| knowledge-graphs | Structured representations of relationships between concepts, complementary to embeddings | stub |

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.



Where this concept fits

Position in the knowledge graph

graph TD
    KE[Knowledge Engineering] --> MRF[Machine-Readable Formats]
    MRF --> JSON[JSON]
    MRF --> EMB[Embeddings]
    MRF --> SDvP[Structured Data vs Prose]
    style EMB fill:#4a9ede,color:#fff

Related concepts:

  • rag — embeddings are the retrieval mechanism that powers RAG, finding semantically relevant documents for an LLM to use as context
  • vector-databases — specialised storage systems built to index, store, and query embedding vectors at scale
  • knowledge-graphs — represent relationships explicitly as nodes and edges, complementing the implicit relationship encoding of embeddings
  • json — while JSON encodes data as key-value pairs for deterministic parsing, embeddings encode meaning as vectors for similarity comparison

Footnotes

  1. Raghavan, P. (2026). What Are AI Embeddings? A Plain-English Guide. Awesome Agents.

  2. PractiqAI. (2025). Embeddings in Plain English. PractiqAI.

  3. Singh, T. (2026). What Are Vector Embeddings? A Complete Guide. mem0.

  4. Stack Overflow. (2023). An Intuitive Introduction to Text Embeddings. Stack Overflow Blog.

  5. Allen, C. and Hospedales, T. (2019). King - man + woman = queen: the hidden algebraic structure of words. University of Edinburgh School of Informatics.