RAG & Retrieval · Intermediate

Also known as: dense passage retrieval, DPR

Dense Retrieval

A retrieval method that represents both queries and documents as dense vectors and finds matches via nearest-neighbor search.

What Is Dense Retrieval?

Dense retrieval is a family of information retrieval techniques where both queries and documents are encoded as dense, continuous vectors (embeddings) by a neural network. Retrieval is performed by finding the document vectors nearest to the query vector in the embedding space.

The term "dense" contrasts with "sparse" retrieval (like BM25), where documents are represented as high-dimensional but mostly-zero term-frequency vectors.
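The contrast can be made concrete with a toy example (the vocabulary is made up, and the dense values are illustrative placeholders, not output from a real model):

```python
from collections import Counter

vocab = ["refund", "window", "days", "billing", "support"]
doc = "you have 30 days to request a refund"

# Sparse: one dimension per vocabulary term, mostly zeros.
# In practice |vocab| is large (tens of thousands of dimensions or more).
counts = Counter(doc.split())
sparse_vec = [counts[term] for term in vocab]   # [1, 0, 1, 0, 0]

# Dense: a low-dimensional real-valued vector produced by a neural encoder.
# (Illustrative values; a real embedding would have ~768 dimensions.)
dense_vec = [0.23, -0.71, 0.44, 0.05]
```

The sparse vector only lights up dimensions for terms that literally appear in the text; the dense vector has no per-term dimensions at all, which is what lets it capture similarity between different words.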

Origins: Dense Passage Retrieval (DPR)

Dense retrieval was popularized by the Dense Passage Retrieval (DPR) paper from Facebook AI (2020). DPR used two separate BERT encoders — one for queries, one for passages — trained on question-answer pairs so that relevant passages would land close to their questions in vector space.

This approach dramatically outperformed BM25 on open-domain question answering benchmarks.

How Dense Retrieval Works

Query: "what is the refund window?"
          ↓ Query Encoder (bi-encoder)
Query Vector: [0.23, -0.71, 0.44, ...]   (768 dims)

Indexed Passages:
  "You have 30 days to request a refund." → [0.25, -0.69, 0.41, ...]
  "Contact support for billing issues."   → [-0.12, 0.33, -0.55, ...]
          ↓ ANN Search
Top result: "You have 30 days to request a refund."
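The lookup in the diagram above can be reproduced with a brute-force similarity scan. A minimal sketch in plain Python, using 3-dimensional toy vectors as stand-ins for real 768-dimensional embeddings (production systems use ANN libraries rather than an exhaustive scan):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors mirroring the diagram (real embeddings are ~768-dim)
query_vec = [0.23, -0.71, 0.44]
passages = {
    "You have 30 days to request a refund.": [0.25, -0.69, 0.41],
    "Contact support for billing issues.":   [-0.12, 0.33, -0.55],
}

# Retrieval = pick the passage whose vector is nearest the query vector
best = max(passages, key=lambda p: cosine(query_vec, passages[p]))
```

The refund passage wins because its vector points in nearly the same direction as the query vector, while the billing passage's vector points the opposite way.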

Bi-Encoder Architecture

Dense retrieval uses a bi-encoder (also called a dual encoder):

  • The query encoder and document encoder share weights or are trained jointly
  • Encoding is done independently, so document vectors can be pre-computed offline
  • At query time, only the query needs to be encoded (fast)
  • Similarity is computed as dot product or cosine between the two vectors

This offline pre-computation is what makes dense retrieval practical at scale.
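The offline/online split described above can be sketched as follows. Note that `encode` here is a deterministic hash-based stand-in for a real bi-encoder: it produces vectors of the right shape but no semantic similarity, so only the caching pattern is illustrated:

```python
import hashlib

DIM = 8  # toy dimensionality; real bi-encoders typically emit 384-1024 dims

def encode(text: str) -> list[float]:
    # Stand-in for a neural bi-encoder (assumption: NOT semantically meaningful)
    digest = hashlib.sha256(text.lower().encode()).digest()
    return [b / 255.0 for b in digest[:DIM]]

# Offline: pre-compute and store document vectors once
corpus = [
    "You have 30 days to request a refund.",
    "Contact support for billing issues.",
]
index = [(doc, encode(doc)) for doc in corpus]

# Online: only the query is encoded at request time
def search(query: str, k: int = 1) -> list[str]:
    q = encode(query)
    scored = sorted(index, key=lambda dv: -sum(a * b for a, b in zip(q, dv[1])))
    return [doc for doc, _ in scored[:k]]
```

Because documents are encoded independently of any query, the expensive encoding pass runs once at indexing time, and each incoming query pays only for its own single forward pass.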

Training Dense Retrievers

A dense retriever needs training (or fine-tuning) to be effective. Common training signals:

  • Positive pairs — (question, relevant passage) from QA datasets
  • Hard negatives — passages that look relevant but are not (improves discriminability)
  • In-batch negatives — other questions' passages in the same training batch serve as negatives (computationally efficient)
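The in-batch negatives signal reduces to a softmax cross-entropy over a batch similarity matrix: each query's own passage is the positive, and every other passage in the batch serves as a negative. A dependency-free sketch that computes only the loss value (real training uses a deep-learning framework and backpropagation):

```python
import math

def in_batch_negative_loss(q_vecs, p_vecs):
    """InfoNCE-style loss: row i's positive is passage i; all other
    passages in the batch act as negatives for query i."""
    n = len(q_vecs)
    total = 0.0
    for i in range(n):
        # Dot-product similarity of query i against every passage in the batch
        sims = [sum(a * b for a, b in zip(q_vecs[i], p)) for p in p_vecs]
        log_denom = math.log(sum(math.exp(s) for s in sims))
        total += -(sims[i] - log_denom)  # negative log-softmax of the positive
    return total / n
```

When each query is aligned with its own passage, the loss is low; when positives and negatives are confused, it rises, which is exactly the gradient signal that pulls relevant pairs together in embedding space.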

Pre-trained models like sentence-transformers/all-mpnet-base-v2 or OpenAI's embedding models can be used without task-specific fine-tuning for general-purpose retrieval.

Dense vs Sparse Retrieval

                         Dense                      Sparse
Representation           Continuous float vector    Sparse term-frequency vector
Index type               HNSW / IVF                 Inverted index
Handles synonyms         Yes                        No
Handles exact terms      Poorly                     Yes (exact match)
Requires training data   Yes (for best results)     No
Query latency            ~5–20 ms (ANN)             ~1–5 ms

Dense Retrieval Limitations

  • Weak exact-term recall: embeddings solve vocabulary mismatch, but product codes, version numbers, and rare proper nouns may not embed distinctively and can fail to retrieve even when present verbatim
  • Depends on encoder quality: retrieval accuracy degrades significantly with a weak or out-of-domain embedding model
  • Updates require re-embedding: changed documents must be re-encoded and re-indexed, and unlike BM25 there are no term statistics that update incrementally

For these reasons, dense retrieval is almost always combined with sparse retrieval in production (see hybrid search).
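One common way to combine the two is reciprocal rank fusion (RRF), which merges the ranked lists from the dense and sparse retrievers without needing to calibrate their raw scores. A minimal sketch (the document IDs are made up, and k=60 is the constant conventionally used with RRF):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists (e.g. one dense, one BM25) by summing
    1 / (k + rank) contributions per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

dense_hits  = ["doc_refund", "doc_billing", "doc_shipping"]
sparse_hits = ["doc_terms", "doc_refund", "doc_billing"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

A document ranked well by both retrievers accumulates contributions from both lists and rises to the top, even if neither retriever ranked it first by a large margin.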

Dense Retrieval with KnowledgeSDK

KnowledgeSDK uses dense retrieval as one component of its hybrid search pipeline. When you call POST /v1/extract, each chunk is encoded and stored in a Typesense vector field. The POST /v1/search endpoint automatically performs dense retrieval using the query embedding alongside BM25 scoring.

Related Terms

RAG & Retrieval · Beginner
Embedding
A dense numerical vector representation of text, images, or other data that captures semantic meaning in a high-dimensional space.
RAG & Retrieval · Beginner
Semantic Search
A search approach that finds results based on meaning and intent rather than exact keyword matching.
RAG & Retrieval · Advanced
Approximate Nearest Neighbor
A class of algorithms that find vectors approximately closest to a query vector, trading perfect accuracy for massive speed gains.
RAG & Retrieval · Intermediate
Sparse Retrieval
A retrieval method that represents documents as sparse term-frequency vectors, enabling fast keyword-based matching.
