Vector Databases
Databases designed to store, index, and query high-dimensional vectors (embeddings) so you can search by meaning rather than by exact match.
What is it?
A traditional database answers the question: “does this row match these exact criteria?” A vector database answers a fundamentally different question: “which records are most similar to this?”[^1] This distinction matters because a growing class of data — documents, images, audio, user behaviour — cannot be searched effectively by exact match. The right query is not “find this,” but “find what is close to this.”
Vector databases store embeddings — lists of numbers that encode the meaning of content — alongside optional metadata. When you search a vector database, you provide a query vector (the embedding of your question or input), and the database returns the stored vectors that are closest to it in high-dimensional space. “Close” means “similar in meaning.” A search for “how do I recover my account” would return results about “password reset” even though the two phrases share no words, because their embeddings sit near each other in vector space.[^1]
The parent concept, databases, covers how data is stored and queried for traditional use cases — exact lookups, filtering, aggregation. Vector databases extend this idea to a domain where exact lookup is the wrong tool. Where a relational database uses a B-tree index to find rows matching a condition, a vector database uses specialised index structures to find vectors that are geometrically close to a query vector.[^2] The core trade-off is precision: relational databases give you exact answers; vector databases give you approximate answers ranked by similarity, which is exactly what you want when dealing with meaning.
In plain terms
A vector database is like a librarian who understands what you mean, not just what you say. If you ask a regular librarian for “books about fixing broken glass,” they search the card catalogue for those exact words. A vector-database librarian understands you probably want the book titled “Window Repair Guide” even though it does not contain the word “broken” — because it is about the same thing.
At a glance
How vector search works
```mermaid
graph LR
    Q[Query Text] --> EM[Embedding Model]
    EM --> QV[Query Vector]
    QV --> IDX{ANN Index}
    IDX --> R1[Result 1 - similarity 0.94]
    IDX --> R2[Result 2 - similarity 0.89]
    IDX --> R3[Result 3 - similarity 0.85]
    D[Documents] --> EM2[Embedding Model]
    EM2 --> SV[Stored Vectors + Metadata]
    SV --> IDX
```

Key: Documents are converted to vectors by an embedding model and stored in the database. At query time, the query is also converted to a vector, and the ANN (approximate nearest neighbour) index finds the stored vectors closest to it. Results are ranked by similarity score.
How does it work?
1. Storing vectors with metadata
Every record in a vector database has three parts: the vector itself (the embedding), a unique identifier, and optional metadata (structured fields like author, date, category, or source URL).[^1]
For example, if you embed a collection of support articles, each record might look like this:
| Part | Example |
|---|---|
| ID | article-0042 |
| Vector | [0.12, -0.33, 0.91, ... ] (1,536 dimensions) |
| Metadata | { "category": "billing", "language": "en", "updated": "2025-11-01" } |
The metadata is important because it enables filtered search — combining semantic similarity with traditional attribute matching.
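Concretely, such a record can be sketched as a plain Python dictionary — the field names follow the table above, but every real client library defines its own schema — together with a minimal metadata filter:

```python
# Sketch of a vector-database record as a plain dict; field names mirror
# the table above and are illustrative, not any particular product's API.
record = {
    "id": "article-0042",
    "vector": [0.12, -0.33, 0.91],  # truncated; real embeddings have e.g. 1,536 dimensions
    "metadata": {
        "category": "billing",
        "language": "en",
        "updated": "2025-11-01",
    },
}

def matches_filter(rec, filters):
    # Filtered search: keep only candidates whose metadata matches,
    # then rank the survivors by vector similarity.
    return all(rec["metadata"].get(k) == v for k, v in filters.items())

print(matches_filter(record, {"category": "billing"}))  # True
print(matches_filter(record, {"language": "fr"}))       # False
```

Real databases apply such filters inside the index rather than scanning records one by one, but the semantics — attribute match first or alongside similarity ranking — are the same.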
Think of it like...
A music streaming service. Each song has an audio fingerprint (the vector) that captures its “mood,” plus metadata tags like genre, tempo, and release year. When you say “play something like this,” the system matches on the fingerprint. When you add “but only jazz from the 2020s,” it uses the metadata to filter.
2. Similarity search vs exact match
Relational databases find rows that match a condition exactly: `WHERE country = 'CH'`. Vector databases find records that are nearest to a query vector in high-dimensional space.[^2] The distance between two vectors is measured using metrics like:
- Cosine similarity — measures the angle between two vectors (the most common for text embeddings)
- Euclidean distance — measures straight-line distance between two points
- Dot product — fast and effective when vectors are normalised
The result is a ranked list of records ordered by similarity, not a binary match/no-match answer.
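All three metrics can be computed directly. A minimal sketch in plain Python, using made-up three-dimensional vectors in place of real embeddings:

```python
import math

def dot(a, b):
    # Dot product: fast, and equivalent to cosine when vectors are normalised.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Angle-based: 1.0 means same direction, 0 means unrelated (orthogonal).
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    # Straight-line distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions).
password_reset = [0.9, 0.1, 0.0]
account_recovery = [0.8, 0.2, 0.1]
pizza_recipe = [0.0, 0.1, 0.9]

print(cosine_similarity(password_reset, account_recovery))  # high, ~0.98
print(cosine_similarity(password_reset, pizza_recipe))      # low, ~0.01
```

The similar phrases score near 1.0 and the unrelated one near 0 — exactly the ranked-by-similarity behaviour described above.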
Key distinction
Relational databases answer “does this exist?” with certainty. Vector databases answer “what is most similar?” with a ranked score. Neither is better — they solve different problems.
3. Approximate nearest neighbour (ANN) search
Finding the exact nearest neighbours to a query vector requires comparing the query against every stored vector — this is computationally impractical at scale. If you have 10 million vectors with 1,536 dimensions each, an exhaustive search means billions of floating-point operations per query.[^3]
Approximate nearest neighbour (ANN) algorithms solve this by trading a tiny amount of accuracy for massive gains in speed. Instead of comparing against every vector, they use clever indexing structures that narrow the search to a small subset of candidates. The result is typically 95-99% as accurate as an exhaustive search, at a fraction of the computation cost.[^3]
The word “approximate” often worries newcomers, but in practice the trade-off is favourable: the results are nearly identical to a brute-force search, and the queries that would take seconds now take milliseconds.
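The brute-force baseline that ANN indexes avoid is easy to state in code — a sketch of exhaustive top-k search over a small synthetic corpus:

```python
import heapq
import math
import random

random.seed(0)

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

def exhaustive_search(query, vectors, k=3):
    # Compares the query against EVERY stored vector: O(n * d) work per query.
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    return heapq.nlargest(k, scored)  # top-k by similarity, highest first

# A small synthetic corpus; the cost scales linearly from here.
dim, n = 64, 2_000
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
query = [random.gauss(0, 1) for _ in range(dim)]

top3 = exhaustive_search(query, vectors)
# 2,000 vectors x 64 dims is trivial; 10,000,000 x 1,536 is roughly
# 15 billion multiply-adds per query, which is why ANN indexes exist.
```

ANN structures replace the full scan in `exhaustive_search` with a guided walk over a small fraction of the corpus, at the cost of occasionally missing a true neighbour.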
Think of it like...
Looking up a word in a dictionary. You do not read every page from cover to cover. You jump to the right section based on the first letter, then narrow down. You might occasionally land one page off, but you find what you need almost instantly. ANN algorithms work the same way — they use structure to skip most of the data and zoom in on the relevant region.
4. Common index types — HNSW and IVF
Two indexing strategies dominate production vector databases:[^3]
HNSW (Hierarchical Navigable Small World) builds a multi-layered graph where each vector is a node connected to its nearest neighbours. The top layers are sparse, allowing fast long-range jumps across the dataset. The bottom layers are dense, enabling precise local search. At query time, the algorithm starts at the top, hops through the graph toward the target region, then drills down for accuracy. HNSW delivers excellent recall and fast queries but uses significant memory because the entire graph must be held in RAM.[^3]
IVF (Inverted File Index) divides the vector space into clusters (using a technique like k-means). Each cluster groups similar vectors together. At query time, the algorithm identifies the nearest clusters and searches only within those clusters, skipping the rest. IVF uses less memory than HNSW and works well when combined with compression techniques, but it requires a training step to build the clusters.[^3]
| Index type | Strengths | Trade-offs |
|---|---|---|
| HNSW | Fast queries, high recall | Memory-intensive |
| IVF | Lower memory, good with compression | Requires cluster training, slightly lower recall |
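To make the IVF idea concrete, here is a toy, pure-Python sketch: a small k-means "training" step builds the clusters, and each query probes only the few nearest ones. Production systems (Faiss, and the engines inside managed databases) do this in optimised native code — nothing below is a real API.

```python
import random

random.seed(1)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=5):
    # The "training" step IVF requires: pick centroids, refine with Lloyd iterations.
    centroids = random.sample(points, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[min(range(k), key=lambda j: sq_dist(p, centroids[j]))].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster emptied out
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def build_ivf(vectors, k):
    # Assign every vector to the inverted list of its nearest centroid.
    centroids = kmeans(vectors, k)
    lists = {j: [] for j in range(k)}
    for i, v in enumerate(vectors):
        lists[min(range(k), key=lambda j: sq_dist(v, centroids[j]))].append(i)
    return centroids, lists

def ivf_search(query, vectors, centroids, lists, nprobe=2, topk=3):
    # Probe only the nprobe nearest clusters instead of scanning the whole corpus.
    nearest = sorted(range(len(centroids)), key=lambda j: sq_dist(query, centroids[j]))[:nprobe]
    candidates = [i for j in nearest for i in lists[j]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:topk]

dim, n = 8, 500
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
centroids, lists = build_ivf(vectors, k=10)
results = ivf_search(vectors[42], vectors, centroids, lists)  # query with a stored vector
```

Raising `nprobe` is the recall/speed dial: more clusters probed means fewer missed neighbours but more distance computations per query.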
Example: choosing an index strategy
Consider a RAG system for a knowledge base with 500,000 documents:
- HNSW would be the default choice if the server has enough RAM (roughly 3-6 GB for 500K vectors at 1,536 dimensions). Queries would return in single-digit milliseconds with 97%+ recall.
- IVF would be preferred if memory is constrained or the dataset is expected to grow to tens of millions of vectors, especially when combined with product quantisation (PQ) to compress the vectors.
Most managed vector databases (Pinecone, Weaviate) handle this decision for you. If you are self-hosting with a library like Faiss, you choose and tune the index explicitly.
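The RAM figure above can be sanity-checked with back-of-the-envelope arithmetic — raw float32 storage only, before the HNSW graph links add their overhead:

```python
# 500K vectors at 1,536 dimensions, 4 bytes per float32 component.
n, dim, bytes_per_float = 500_000, 1536, 4
raw_gb = n * dim * bytes_per_float / 1024**3
print(f"{raw_gb:.2f} GB of raw vectors")  # ~2.86 GB before index overhead
```

The graph structure and bookkeeping push the working set from ~2.9 GB toward the 3-6 GB quoted above.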
5. Hybrid search — combining vectors with metadata
Pure vector search returns the most semantically similar results globally. But in practice, you often want to combine meaning-based search with traditional filters: “find similar documents, but only from this user, created after this date.”[^3]
Hybrid search combines dense vector similarity with structured attribute filtering. Some systems also combine dense vectors with sparse retrieval methods like BM25 (traditional keyword matching) to get the best of both worlds — semantic understanding and keyword precision.[^3]
For example: a query for “GPT-5 release date” might semantically drift toward general AI topics. Adding keyword matching ensures the specific phrase is weighted, while semantic search captures conceptually related results.
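That blending can be sketched in a few lines. The `vec_sim` numbers below are made up for illustration, and the keyword scorer is a crude stand-in for a real sparse retriever like BM25:

```python
def keyword_score(query, doc_text):
    # Crude stand-in for BM25: the fraction of query terms
    # that appear verbatim in the document.
    q_terms = set(query.lower().split())
    d_terms = set(doc_text.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def hybrid_score(vec_sim, kw, alpha=0.7):
    # Weighted blend of dense (semantic) and sparse (keyword) signals.
    return alpha * vec_sim + (1 - alpha) * kw

# Made-up similarity scores: the essay is semantically "closer" overall,
# but the news item contains the exact phrase the user typed.
docs = [
    {"text": "GPT-5 release date announced", "vec_sim": 0.80},
    {"text": "A history of artificial intelligence", "vec_sim": 0.85},
]
query = "GPT-5 release date"
ranked = sorted(
    docs,
    key=lambda d: hybrid_score(d["vec_sim"], keyword_score(query, d["text"])),
    reverse=True,
)
# The exact-phrase document now ranks first despite its lower vector score.
```

Production systems tune the blend (here, `alpha`) or use rank-fusion methods instead of a linear mix, but the principle is the same.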
Concept to explore
See rag for how hybrid search fits into retrieval-augmented generation, the dominant pattern for grounding LLM responses in real data.
6. The role in RAG architectures
Vector databases are the retrieval backbone of Retrieval-Augmented Generation. In a RAG pipeline:[^4]
- Documents are split into chunks and embedded into vectors
- The vectors are stored in a vector database alongside the source text
- When a user asks a question, the question is embedded into a query vector
- The vector database returns the most relevant document chunks
- Those chunks are fed to an LLM as context for generating the answer
Without vector databases, RAG cannot find semantically relevant documents at scale. They are what make it possible to ground LLM responses in real, specific data rather than relying solely on the model’s training knowledge.
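The five steps above can be strung together in a toy end-to-end sketch. A term-frequency vector stands in for a real embedding model, and the final LLM call is reduced to building the prompt — everything here is illustrative, not any real client API:

```python
import math
from collections import Counter

def embed(text, vocab):
    # Toy embedding: term-frequency over a fixed vocabulary.
    # A real pipeline would call a learned embedding model here.
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

# 1-2. Chunk the documents, embed them, store vectors alongside the text.
chunks = [
    "reset your password from the account login page",
    "billing cycles run monthly and renew automatically",
]
vocab = sorted({w for c in chunks for w in c.lower().split()})
store = [{"text": c, "vector": embed(c, vocab)} for c in chunks]

# 3-4. Embed the question and retrieve the most similar chunk.
question = "how do I reset my password"
query_vector = embed(question, vocab)
best = max(store, key=lambda r: cosine(query_vector, r["vector"]))

# 5. Feed the retrieved chunk to the LLM as grounding context.
prompt = f"Answer using only this context:\n{best['text']}\n\nQuestion: {question}"
```

The retrieval step picks the password chunk because its vector overlaps with the question's, even though a production system would use learned embeddings and an ANN index rather than this literal scan.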
Why do we use it?
Key reasons
1. Meaning-based retrieval. Vector databases let you search by what content means, not what words it contains. “Automobile repair” finds documents about “car maintenance” because the meanings are close, even though the words differ.[^1]
2. Scale for AI workloads. Modern AI applications — chatbots, recommendation engines, content search — need to compare a query against millions of vectors in real time. ANN indexing makes this possible with sub-100ms latency at production scale.[^3]
3. Foundation for RAG. Retrieval-Augmented Generation depends on vector databases to find relevant context for LLM responses. Without them, RAG systems would need to fall back on keyword search, missing semantically relevant content.[^4]
4. Multimodal flexibility. Vector databases store any kind of embedding — text, images, audio, code. The same database can power text search, image similarity, and cross-modal retrieval (searching images with text queries).[^2]
When do we use it?
- When building semantic search that finds results by meaning, not keywords
- When implementing RAG to ground LLM responses in a knowledge base
- When building recommendation systems (find products, articles, or content similar to what a user liked)
- When you need to deduplicate or cluster unstructured content (similar support tickets, near-duplicate documents)
- When building multimodal search (searching images by text description, or audio by mood)
Rule of thumb
If your users would describe what they want in natural language rather than typing exact keywords, a vector database is likely part of the solution.
How can I think about it?
The neighbourhood map
A vector database is like a city where every building is placed based on what it does, not its street address.
- Every document gets coordinates on this map (its vector) based on its meaning
- Restaurants cluster together in one district; hospitals cluster in another
- Searching means dropping a pin on the map (“I need something like this”) and finding the nearest buildings
- Approximate search means you check the buildings in your neighbourhood, not every building in the city — you might miss one on the edge, but you find the best matches quickly
- Metadata filtering is like searching only within a specific district (“restaurants, but only Italian ones open after 8 PM”)
- A relational database, by contrast, is like a phone book — perfect if you know the exact name, useless if you only know roughly what you are looking for
The wine sommelier
A vector database is like a sommelier who organises a wine cellar by flavour profile rather than alphabetically.
- Each wine’s position in the cellar is determined by its taste characteristics — body, acidity, sweetness, tannins — encoded as a vector
- Nearby wines taste similar: a Barolo and a Nebbiolo sit on adjacent shelves
- Asking for a wine “like this one” means the sommelier walks to that shelf and pulls the neighbours — no need to describe the grape, region, or year
- ANN search means the sommelier checks the most promising sections of the cellar first, rather than tasting every bottle
- Hybrid search is when you add constraints: “like this one, but under 30 francs and from France” — combining flavour similarity with metadata
- A regular database would be like organising the cellar alphabetically by label — useful if you know the name, but unhelpful for discovering wines you would enjoy
Concepts to explore next
| Concept | What it covers | Status |
|---|---|---|
| rag | Using vector retrieval to ground LLM responses in real documents | stub |
| knowledge-graphs | Structured representations of relationships, complementary to vector similarity | stub |
| embeddings | The numerical meaning representations that vector databases store and query | complete |
Some cards don't exist yet
A broken link is a placeholder for future learning, not an error.
Check your understanding
Test yourself
- Explain the fundamental difference between how a relational database and a vector database answer a query. Why does this difference matter for AI applications?
- Name the three parts of a record in a vector database and describe the role of each.
- Distinguish between HNSW and IVF indexing. What are the key trade-offs, and when might you choose one over the other?
- Interpret this scenario: a semantic search for “how to fix a broken window” returns a document titled “Glass Pane Replacement Guide” with a similarity score of 0.92, but misses a document titled “Window Settings in Operating Systems.” Why does this happen, and why is it desirable?
- Connect vector databases to RAG. What specific role does the vector database play in a RAG pipeline, and what would happen if you replaced it with a traditional keyword search?
Where this concept fits
Position in the knowledge graph
```mermaid
graph TD
    CSM[Client-Server Model] --> DB[Databases]
    DB --> RDB[Relational Databases]
    DB --> DDB[Document Databases]
    DB --> VDB[Vector Databases]
    DB --> SQL[SQL]
    DB --> SCH[Schemas]
    MRF[Machine-Readable Formats] --> EMB[Embeddings]
    EMB -.->|prerequisite| VDB
    style VDB fill:#4a9ede,color:#fff
```

Related concepts:
- rag — vector databases provide the retrieval layer that powers RAG, finding semantically relevant documents to feed to an LLM
- knowledge-graphs — represent explicit relationships between entities, complementing the implicit similarity relationships captured by vector embeddings
- machine-readable-formats — vector databases store embeddings, a specific type of machine-readable format optimised for meaning rather than structure
Sources
Further reading
Resources
- A Beginner’s Guide to Vector Database Principles (Abstract Algorithms) — Clear introduction covering similarity search, index types, and hybrid retrieval with practical examples
- Vector Databases Explained in 3 Levels of Difficulty (Machine Learning Mastery) — Progressive explanation from basic similarity search through ANN indexing algorithms and production architecture
- Vector Database vs Traditional Database (Redis) — Concise comparison showing when to use vector databases versus relational databases
- Vector DB vs Traditional Databases: Embeddings Guide (Blockchain Council) — Detailed guide covering embeddings, HNSW, IVF, and hybrid search patterns
- Vector Databases Compared: pgvector vs Pinecone vs Weaviate (BackendBytes) — Practical comparison of popular vector database options for production RAG systems
Footnotes
[^1]: Abstract Algorithms. (2026). A Beginner’s Guide to Vector Database Principles. Abstract Algorithms.
[^2]: Instaclustr. (2025). Vector Database vs. Relational Database: 7 Key Differences. Instaclustr.
[^3]: Priya C, B. (2026). Vector Databases Explained in 3 Levels of Difficulty. Machine Learning Mastery.
[^4]: Raizada, S. (2026). Vector DB vs Traditional Databases: Embeddings Guide. Blockchain Council.