Vector Databases

Databases designed to store, index, and query high-dimensional vectors (embeddings) so you can search by meaning rather than by exact match.


What is it?

A traditional database answers the question: “does this row match these exact criteria?” A vector database answers a fundamentally different question: “which records are most similar to this?”1 This distinction matters because a growing class of data — documents, images, audio, user behaviour — cannot be searched effectively by exact match. The right query is not “find this,” but “find what is close to this.”

Vector databases store embeddings — lists of numbers that encode the meaning of content — alongside optional metadata. When you search a vector database, you provide a query vector (the embedding of your question or input), and the database returns the stored vectors that are closest to it in high-dimensional space. “Close” means “similar in meaning.” A search for “how do I recover my account” would return results about “password reset” even though the two phrases share no words, because their embeddings sit near each other in vector space.1

The parent concept, databases, covers how data is stored and queried for traditional use cases — exact lookups, filtering, aggregation. Vector databases extend this idea to a domain where exact lookup is the wrong tool. Where a relational database uses a B-tree index to find rows matching a condition, a vector database uses specialised index structures to find vectors that are geometrically close to a query vector.2 The core trade-off is precision: relational databases give you exact answers; vector databases give you approximate answers ranked by similarity, which is exactly what you want when dealing with meaning.

In plain terms

A vector database is like a librarian who understands what you mean, not just what you say. If you ask a regular librarian for “books about fixing broken glass,” they search the card catalogue for those exact words. A vector-database librarian understands you probably want the book titled “Window Repair Guide” even though it does not contain the word “broken” — because it is about the same thing.


How does it work?

1. Storing vectors with metadata

Every record in a vector database has three parts: the vector itself (the embedding), a unique identifier, and optional metadata (structured fields like author, date, category, or source URL).1

For example, if you embed a collection of support articles, each record might look like this:

Part      Example
ID        article-0042
Vector    [0.12, -0.33, 0.91, ... ] (1,536 dimensions)
Metadata  { "category": "billing", "language": "en", "updated": "2025-11-01" }

The metadata is important because it enables filtered search — combining semantic similarity with traditional attribute matching.
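A minimal sketch of this record layout in plain Python, assuming a tiny in-memory store (the `Record` class, the example IDs, and the `billing_only` filter are illustrative, not any particular database's API):

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """One vector-database record: embedding + unique ID + optional metadata."""
    id: str
    vector: list[float]
    metadata: dict = field(default_factory=dict)

store = [
    Record("article-0042", [0.12, -0.33, 0.91], {"category": "billing", "language": "en"}),
    Record("article-0107", [0.05, 0.48, -0.22], {"category": "shipping", "language": "en"}),
]

# Metadata enables filtered search: restrict candidates by attributes
# before (or after) any similarity scoring is applied.
billing_only = [r for r in store if r.metadata.get("category") == "billing"]
print([r.id for r in billing_only])
```

Real systems store thousands of dimensions per vector and push this filtering into the index itself, but the three-part record shape is the same.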

Think of it like...

A music streaming service. Each song has an audio fingerprint (the vector) that captures its “mood,” plus metadata tags like genre, tempo, and release year. When you say “play something like this,” the system matches on the fingerprint. When you add “but only jazz from the 2020s,” it uses the metadata to filter.


2. Similarity search vs exact match

Relational databases find rows that match a condition exactly: WHERE country = 'CH'. Vector databases find records that are nearest to a query vector in high-dimensional space.2 The distance between two vectors is measured using metrics like:

  • Cosine similarity — measures the angle between two vectors (the most common for text embeddings)
  • Euclidean distance — measures straight-line distance between two points
  • Dot product — fast and effective when vectors are normalised

The result is a ranked list of records ordered by similarity, not a binary match/no-match answer.
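The three metrics above can be written in a few lines of plain Python; the vectors here are made up for illustration. Note that for unit-length vectors, dot product and cosine similarity coincide:

```python
import math

def cosine_similarity(a, b):
    # Angle-based: ignores vector length, compares direction only
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean_distance(a, b):
    # Straight-line distance between two points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    # Fast; equivalent to cosine when both vectors are normalised
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 0.0]
v = [0.8, 0.6]  # also unit length

print(cosine_similarity(q, v))   # 0.8
print(dot_product(q, v))         # 0.8 — same, because |q| = |v| = 1
print(euclidean_distance(q, v))
```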

Key distinction

Relational databases answer “does this exist?” with certainty. Vector databases answer “what is most similar?” with a ranked score. Neither is better — they solve different problems.


3. Approximate nearest neighbour (ANN) search

Finding the exact nearest neighbours to a query vector requires comparing the query against every stored vector — this is computationally impractical at scale. If you have 10 million vectors with 1,536 dimensions each, an exhaustive search means billions of floating-point operations per query.3

Approximate nearest neighbour (ANN) algorithms solve this by trading a tiny amount of accuracy for massive gains in speed. Instead of comparing against every vector, they use clever indexing structures that narrow the search to a small subset of candidates. The result is typically 95-99% as accurate as an exhaustive search, at a fraction of the computation cost.3

The word “approximate” often worries newcomers, but in practice the trade-off is favourable: the results are nearly identical to a brute-force search, and the queries that would take seconds now take milliseconds.
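To see why exhaustive search does not scale, here is brute-force exact search over a small random corpus (the corpus and dimensions are invented for illustration). Every query touches every vector, so cost grows linearly with corpus size times dimensions — exactly what ANN indexes avoid:

```python
import heapq
import math
import random

random.seed(0)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# 10,000 random 64-dimensional vectors standing in for a small corpus
corpus = [[random.gauss(0, 1) for _ in range(64)] for _ in range(10_000)]
query = [random.gauss(0, 1) for _ in range(64)]

# Exhaustive (exact) search: one full pass, n * d multiply-adds per query.
# At 10M vectors x 1,536 dims, this is what becomes billions of operations.
top3 = heapq.nlargest(3, range(len(corpus)), key=lambda i: cosine(query, corpus[i]))
print(top3)
```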

Think of it like...

Looking up a word in a dictionary. You do not read every page from cover to cover. You jump to the right section based on the first letter, then narrow down. You might occasionally land one page off, but you find what you need almost instantly. ANN algorithms work the same way — they use structure to skip most of the data and zoom in on the relevant region.


4. Common index types — HNSW and IVF

Two indexing strategies dominate production vector databases:3

HNSW (Hierarchical Navigable Small World) builds a multi-layered graph where each vector is a node connected to its nearest neighbours. The top layers are sparse, allowing fast long-range jumps across the dataset. The bottom layers are dense, enabling precise local search. At query time, the algorithm starts at the top, hops through the graph toward the target region, then drills down for accuracy. HNSW delivers excellent recall and fast queries but uses significant memory because the entire graph must be held in RAM.3

IVF (Inverted File Index) divides the vector space into clusters (using a technique like k-means). Each cluster groups similar vectors together. At query time, the algorithm identifies the nearest clusters and searches only within those clusters, skipping the rest. IVF uses less memory than HNSW and works well when combined with compression techniques, but it requires a training step to build the clusters.3

Index type  Strengths                            Trade-offs
HNSW        Fast queries, high recall            Memory-intensive
IVF         Lower memory, good with compression  Requires cluster training, slightly lower recall
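The IVF idea can be sketched in a few dozen lines of plain Python. Everything here is a toy: 2-D points instead of high-dimensional embeddings, and hard-coded centroids standing in for the k-means training step. The key behaviour — searching only the `nprobe` nearest clusters instead of the whole dataset — is the real mechanism:

```python
import math
import random

random.seed(1)

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy dataset: 150 two-dimensional points in three obvious groups
points = ([(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)] +
          [(random.gauss(5, 0.3), random.gauss(0, 0.3)) for _ in range(50)] +
          [(random.gauss(0, 0.3), random.gauss(5, 0.3)) for _ in range(50)])

# "Training" step: hard-coded centroids a k-means run would roughly find
centroids = [(0, 0), (5, 0), (0, 5)]

# Build the inverted file: centroid index -> vectors assigned to that cluster
buckets = {i: [] for i in range(len(centroids))}
for p in points:
    nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
    buckets[nearest].append(p)

def ivf_search(q, nprobe=1):
    """Search only the nprobe nearest clusters, skipping the rest."""
    probe = sorted(range(len(centroids)), key=lambda i: dist(q, centroids[i]))[:nprobe]
    candidates = [p for i in probe for p in buckets[i]]
    return min(candidates, key=lambda p: dist(q, p)), len(candidates)

best, scanned = ivf_search((4.8, 0.2))
print(best, scanned)  # scanned roughly 50 of 150 points
```

Raising `nprobe` scans more clusters — higher recall, more work — which is the accuracy/speed dial IVF exposes in production systems.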

5. Hybrid search — combining vectors with metadata

Pure vector search returns the most semantically similar results globally. But in practice, you often want to combine meaning-based search with traditional filters: “find similar documents, but only from this user, created after this date.”3

Hybrid search combines dense vector similarity with structured attribute filtering. Some systems also combine dense vectors with sparse retrieval methods like BM25 (traditional keyword matching) to get the best of both worlds — semantic understanding and keyword precision.3

For example: a query for “GPT-5 release date” might semantically drift toward general AI topics. Adding keyword matching ensures the specific phrase is weighted, while semantic search captures conceptually related results.
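A minimal sketch of that blending, assuming hard-coded dense (semantic) scores where a real system would query a vector index, and a crude term-overlap function standing in for BM25:

```python
def keyword_score(query, doc):
    """Sparse signal: fraction of query terms appearing verbatim (a stand-in for BM25)."""
    terms = query.lower().split()
    return sum(t in doc.lower() for t in terms) / len(terms)

# Dense similarity scores would come from the vector index; invented here
docs = {
    "GPT-5 release date announced": 0.71,
    "A history of language models": 0.78,    # semantically close, zero keyword overlap
    "Release notes for our billing API": 0.40,
}

def hybrid_score(query, doc, dense, alpha=0.5):
    # Weighted blend: alpha controls semantic vs keyword emphasis
    return alpha * dense + (1 - alpha) * keyword_score(query, doc)

query = "GPT-5 release date"
ranked = sorted(docs, key=lambda d: hybrid_score(query, d, docs[d]), reverse=True)
print(ranked[0])
```

Pure dense ranking would put the generic history document first (0.78); the keyword signal pulls the exact-phrase match to the top, which is the point of hybrid search.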

Concept to explore

See rag for how hybrid search fits into retrieval-augmented generation, the dominant pattern for grounding LLM responses in real data.


6. The role in RAG architectures

Vector databases are the retrieval backbone of Retrieval-Augmented Generation. In a RAG pipeline:4

  1. Documents are split into chunks and embedded into vectors
  2. The vectors are stored in a vector database alongside the source text
  3. When a user asks a question, the question is embedded into a query vector
  4. The vector database returns the most relevant document chunks
  5. Those chunks are fed to an LLM as context for generating the answer

Without vector databases, RAG cannot find semantically relevant documents at scale. They are what make it possible to ground LLM responses in real, specific data rather than relying solely on the model’s training knowledge.
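The five steps above can be sketched end to end in plain Python. The `embed` function here is a deliberately fake character-hashing stand-in for a real embedding model, and no LLM is called — the sketch stops at assembling the prompt, which is where the vector database's job ends:

```python
import math

def embed(text, dim=8):
    """Toy stand-in for a real embedding model: hashes characters into a unit vector."""
    v = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        v[(i + ord(ch)) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

# Steps 1-2: chunk documents, store (vector, source text) pairs
chunks = ["Reset your password from the login page.",
          "Invoices are emailed on the first of each month."]
index = [(embed(c), c) for c in chunks]

# Steps 3-4: embed the question, return the closest chunks by dot product
def retrieve(question, k=1):
    q = embed(question)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[0])))
    return [text for _, text in scored[:k]]

# Step 5: retrieved chunks become LLM context (prompt assembly only, no model call)
context = "\n".join(retrieve("How do I recover my account?"))
prompt = f"Answer using only this context:\n{context}\n\nQ: How do I recover my account?"
print(prompt)
```

With a real embedding model, the account-recovery question would land near the password chunk by meaning; the pipeline shape is identical either way.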


Why do we use it?

Key reasons

1. Meaning-based retrieval. Vector databases let you search by what content means, not what words it contains. “Automobile repair” finds documents about “car maintenance” because the meanings are close, even though the words differ.1

2. Scale for AI workloads. Modern AI applications — chatbots, recommendation engines, content search — need to compare a query against millions of vectors in real time. ANN indexing makes this possible with sub-100ms latency at production scale.3

3. Foundation for RAG. Retrieval-Augmented Generation depends on vector databases to find relevant context for LLM responses. Without them, RAG systems would need to fall back on keyword search, missing semantically relevant content.4

4. Multimodal flexibility. Vector databases store any kind of embedding — text, images, audio, code. The same database can power text search, image similarity, and cross-modal retrieval (searching images with text queries).2


When do we use it?

  • When building semantic search that finds results by meaning, not keywords
  • When implementing RAG to ground LLM responses in a knowledge base
  • When building recommendation systems (find products, articles, or content similar to what a user liked)
  • When you need to deduplicate or cluster unstructured content (similar support tickets, near-duplicate documents)
  • When building multimodal search (searching images by text description, or audio by mood)

Rule of thumb

If your users would describe what they want in natural language rather than typing exact keywords, a vector database is likely part of the solution.


How can I think about it?

The neighbourhood map

A vector database is like a city where every building is placed based on what it does, not its street address.

  • Every document gets coordinates on this map (its vector) based on its meaning
  • Restaurants cluster together in one district; hospitals cluster in another
  • Searching means dropping a pin on the map (“I need something like this”) and finding the nearest buildings
  • Approximate search means you check the buildings in your neighbourhood, not every building in the city — you might miss one on the edge, but you find the best matches quickly
  • Metadata filtering is like searching only within a specific district (“restaurants, but only Italian ones open after 8 PM”)
  • A relational database, by contrast, is like a phone book — perfect if you know the exact name, useless if you only know roughly what you are looking for

The wine sommelier

A vector database is like a sommelier who organises a wine cellar by flavour profile rather than alphabetically.

  • Each wine’s position in the cellar is determined by its taste characteristics — body, acidity, sweetness, tannins — encoded as a vector
  • Nearby wines taste similar: a Barolo and a Nebbiolo sit on adjacent shelves
  • Asking for a wine “like this one” means the sommelier walks to that shelf and pulls the neighbours — no need to describe the grape, region, or year
  • ANN search means the sommelier checks the most promising sections of the cellar first, rather than tasting every bottle
  • Hybrid search is when you add constraints: “like this one, but under 30 francs and from France” — combining flavour similarity with metadata
  • A regular database would be like organising the cellar alphabetically by label — useful if you know the name, but unhelpful for discovering wines you would enjoy

Concepts to explore next

Concept           What it covers                                                                    Status
rag               Using vector retrieval to ground LLM responses in real documents                  stub
knowledge-graphs  Structured representations of relationships, complementary to vector similarity   stub
embeddings        The numerical meaning representations that vector databases store and query       complete

Some cards don't exist yet

A broken link is a placeholder for future learning, not an error.


Where this concept fits

Position in the knowledge graph

graph TD
    CSM[Client-Server Model] --> DB[Databases]
    DB --> RDB[Relational Databases]
    DB --> DDB[Document Databases]
    DB --> VDB[Vector Databases]
    DB --> SQL[SQL]
    DB --> SCH[Schemas]
    MRF[Machine-Readable Formats] --> EMB[Embeddings]
    EMB -.->|prerequisite| VDB
    style VDB fill:#4a9ede,color:#fff

Related concepts:

  • rag — vector databases provide the retrieval layer that powers RAG, finding semantically relevant documents to feed to an LLM
  • knowledge-graphs — represent explicit relationships between entities, complementing the implicit similarity relationships captured by vector embeddings
  • machine-readable-formats — vector databases store embeddings, a specific type of machine-readable format optimised for meaning rather than structure

Footnotes

  1. Abstract Algorithms. (2026). A Beginner’s Guide to Vector Database Principles. Abstract Algorithms.

  2. Instaclustr. (2025). Vector Database vs. Relational Database: 7 Key Differences. Instaclustr.

  3. Priya C, B. (2026). Vector Databases Explained in 3 Levels of Difficulty. Machine Learning Mastery.

  4. Raizada, S. (2026). Vector DB vs Traditional Databases: Embeddings Guide. Blockchain Council.