RAG & Retrieval · Intermediate

Also known as: keyword retrieval, sparse vector search

Sparse Retrieval

A retrieval method that represents documents as sparse term-frequency vectors, enabling fast keyword-based matching.

What Is Sparse Retrieval?

Sparse retrieval is a class of information retrieval methods that represent documents as high-dimensional vectors where most values are zero. Each dimension corresponds to a term in the vocabulary, and the non-zero values represent that term's importance in the document.

The "sparse" name reflects the vector structure: a vocabulary of 100,000 terms produces 100,000-dimensional vectors, but any given document contains only a few hundred distinct terms — so 99%+ of values are zero.
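As a concrete illustration, a sparse vector is rarely stored as an actual array of mostly zeros; a dictionary mapping term indices to counts captures the same information. The four-term vocabulary below is a toy example invented for this sketch:

```python
# Toy vocabulary: maps each term to its dimension index.
vocab = {"refund": 0, "cancel": 1, "policy": 2, "shipping": 3}

doc = "refund refund policy"

# Sparse representation: {dimension: weight}; all missing dimensions are zero.
sparse_vec = {}
for term in doc.split():
    idx = vocab[term]
    sparse_vec[idx] = sparse_vec.get(idx, 0) + 1

print(sparse_vec)  # → {0: 2, 2: 1}
```

With a real 100,000-term vocabulary the dictionary still holds only the few hundred terms the document actually contains, which is what makes the representation cheap to store and fast to intersect.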

The Inverted Index

Sparse retrieval is implemented via an inverted index: a mapping from each term to the list of documents containing it (the posting list), along with term frequency information.

"refund"  → [(doc_3, tf=2), (doc_7, tf=1), (doc_12, tf=4)]
"cancel"  → [(doc_1, tf=1), (doc_3, tf=3), (doc_9, tf=2)]
"policy"  → [(doc_3, tf=1), (doc_5, tf=2)]

At query time, the query terms are looked up in the index, and their posting lists are intersected or merged to produce a ranked result set. This operation is extremely fast — milliseconds even for billions of documents.
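The index structure and the query-time merge can be sketched in a few lines of Python. The documents are toy data, and the score here is a plain summed term frequency rather than a production ranking function like BM25:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a posting list of (doc_id, term_frequency)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return index

def search(index, query):
    """Merge the query terms' posting lists, scoring docs by summed tf."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id, tf in index.get(term, []):
            scores[doc_id] += tf
    return sorted(scores.items(), key=lambda kv: -kv[1])

docs = {
    "doc_1": "cancel order",
    "doc_3": "refund refund cancel cancel cancel policy",
    "doc_7": "refund request",
}
index = build_inverted_index(docs)
print(search(index, "refund policy"))  # → [('doc_3', 3), ('doc_7', 1)]
```

Note that only documents containing at least one query term are ever touched — the speed of sparse retrieval comes from never scanning the rest of the corpus.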

TF-IDF: The Foundation

Term Frequency–Inverse Document Frequency (TF-IDF) is the classic sparse scoring function:

TF-IDF(term, doc) = TF(term, doc) × IDF(term)

TF = frequency of term in doc / total terms in doc
IDF = log(total docs / docs containing term)

A term that appears often in a document but rarely in the corpus gets a high score — capturing the idea of a "discriminative" term.
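The formulas above translate directly into code. The three-document corpus and whitespace tokenization below are toy assumptions for illustration:

```python
import math

def tf_idf(term, doc_tokens, corpus):
    """TF-IDF as defined above: (tf / doc length) x log(N / df)."""
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for d in corpus if term in d)          # document frequency
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

corpus = [
    "the refund policy".split(),
    "the shipping policy".split(),
    "the refund was issued".split(),
]
doc = corpus[0]

print(round(tf_idf("refund", doc, corpus), 3))  # → 0.135 (in 2 of 3 docs)
print(round(tf_idf("the", doc, corpus), 3))     # → 0.0   (in every doc, IDF = 0)
```

The second result shows why IDF matters: "the" appears in every document, so log(3/3) = 0 and the term contributes nothing — exactly the behavior you want for stopwords.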

BM25: The Modern Standard

BM25 (Best Match 25) is the dominant sparse retrieval algorithm today. It improves on TF-IDF by adding:

  • Term frequency saturation — repeated occurrences have diminishing returns
  • Document length normalization — scores are adjusted by document length relative to the corpus average, so long, verbose documents don't dominate purely by containing more terms

BM25 is the default ranking function in Elasticsearch, OpenSearch, and Solr.
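A sketch of the per-term BM25 score, using the common smoothed-IDF variant and the conventional default parameters k1 = 1.2 and b = 0.75 (exact details vary slightly between implementations):

```python
import math

def bm25_score(tf, doc_len, avg_doc_len, n_docs, df, k1=1.2, b=0.75):
    """BM25 contribution of one term: saturating tf, length-normalized, IDF-weighted."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)   # smoothed IDF
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)         # length normalization
    return idf * tf * (k1 + 1) / (tf + norm)

# Term frequency saturation: ten occurrences score far less than 10x one occurrence.
s1  = bm25_score(tf=1,  doc_len=100, avg_doc_len=100, n_docs=1000, df=50)
s10 = bm25_score(tf=10, doc_len=100, avg_doc_len=100, n_docs=1000, df=50)
print(s10 / s1)  # ≈ 1.96 — far below 10
```

The saturation in the last line is the key difference from raw TF-IDF: a document cannot climb the ranking simply by repeating a keyword many times.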

Sparse vs Dense: Trade-offs

                       Sparse                Dense
Vocabulary handling    Exact match only      Semantic similarity
Latency                Very fast (ms)        Fast (5–20 ms with ANN)
Infrastructure         Inverted index        Vector index (HNSW)
Handles synonyms       No                    Yes
Handles rare terms     Perfectly             Poorly
Training required      No                    Yes (embedding model)

SPLADE: Learned Sparse Retrieval

Recent work like SPLADE (SParse Lexical AnD Expansion) uses a neural network to learn which terms to expand a document with — keeping the sparse representation but improving recall by adding semantically related terms to the index. It bridges dense and sparse retrieval.

Why Sparse Retrieval Still Matters

Despite the rise of vector databases, sparse retrieval remains essential:

  • Exact term matching is reliable — product IDs, error codes, names, and rare jargon are retrieved precisely
  • No embedding model dependency — works without GPU-based inference
  • Explainability — it is easy to see which terms caused a document to rank
  • Speed — inverted index lookup is typically faster than approximate nearest-neighbor search

Sparse Retrieval in KnowledgeSDK

KnowledgeSDK's POST /v1/search endpoint uses Typesense, which maintains an inverted BM25 index alongside the vector index. Sparse and dense retrieval are combined automatically using Reciprocal Rank Fusion, giving you the benefits of both approaches.

This means queries containing exact product codes or version strings will retrieve correctly even when semantic similarity would miss them.
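Reciprocal Rank Fusion itself is simple to sketch: each document's fused score is the sum of 1/(k + rank) over every ranking it appears in, with k = 60 as the conventional constant. This is a generic illustration with made-up document IDs, not Typesense's or KnowledgeSDK's actual implementation:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each doc scores sum of 1 / (k + rank) across lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

sparse_ranking = ["doc_3", "doc_7", "doc_1"]   # e.g. from BM25
dense_ranking  = ["doc_3", "doc_9", "doc_7"]   # e.g. from vector search

print(reciprocal_rank_fusion([sparse_ranking, dense_ranking]))
# → ['doc_3', 'doc_7', 'doc_9', 'doc_1']
```

Because RRF uses only rank positions, it needs no score calibration between the two retrievers — BM25 scores and cosine similarities are never compared directly.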

Related Terms

BM25
A probabilistic ranking function used in information retrieval that scores documents based on term frequency and inverse document frequency.
Dense Retrieval
A retrieval method that represents both queries and documents as dense vectors and finds matches via nearest-neighbor search.
Hybrid Search
A retrieval strategy that combines dense vector search with sparse keyword search (like BM25) to improve recall and precision.
