Hybrid Search

A retrieval strategy that combines dense vector search with sparse keyword search (like BM25) to improve recall and precision.

What Is Hybrid Search?

Hybrid search combines two complementary retrieval methods, dense vector search (semantic) and sparse keyword search (lexical), into a single ranked result list. By running both in parallel and merging their results, it typically achieves higher recall and precision than either method alone.

It is a widely recommended default retrieval strategy for production RAG pipelines.

Why Hybrid?

Dense and sparse retrieval have complementary failure modes:

Capability	Dense (vector)	Sparse (keyword)
Synonym handling	Excellent	Poor
Exact term matching	Poor	Excellent
Rare proper nouns	Poor	Excellent
Short queries	Noisy	Reliable
Conceptual similarity	Excellent	Poor

Hybrid search covers both bases simultaneously.

How Hybrid Search Works

1. Dense retrieval: embed the query, run approximate nearest neighbor (ANN) search, and get the top-K documents with similarity scores
2. Sparse retrieval: run BM25 (or TF-IDF) against an inverted index and get the top-K documents with relevance scores
3. Score fusion: merge and re-rank the two result lists into a unified ranking
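The first two steps can be sketched on a toy in-memory corpus. This is a minimal illustration, not a production setup: the document IDs, the hand-made 2-d "embeddings", and the helper names `dense_top_k` and `sparse_top_k` are all assumptions made for the example, and the BM25 variant shown uses the common non-negative IDF form.

```python
import math
from collections import Counter

# Toy corpus. The 2-d vectors are hand-made stand-ins for real model embeddings.
DOCS = {
    "d1": ("how to cancel your subscription", [0.9, 0.1]),
    "d2": ("ending a membership plan early", [0.8, 0.2]),
    "d3": ("sku-4829-x return policy", [0.1, 0.9]),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def dense_top_k(query_vec, k):
    """Step 1: rank documents by cosine similarity to the query vector."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, (_, vec) in DOCS.items()]
    return sorted(scored, key=lambda pair: -pair[1])[:k]

def sparse_top_k(query, k, k1=1.5, b=0.75):
    """Step 2: rank documents with BM25 over whitespace tokens."""
    tokens = {doc_id: text.split() for doc_id, (text, _) in DOCS.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n
    # Document frequency: how many documents contain each term.
    df = Counter(term for t in tokens.values() for term in set(t))
    scored = []
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        if score > 0:  # keep only documents that match at least one term
            scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: -pair[1])[:k]

# A made-up query vector that points toward the "cancel/end a plan" documents.
dense_hits = dense_top_k([0.88, 0.12], k=2)
sparse_hits = sparse_top_k("cancel subscription", k=2)
```

In this toy setup the dense path returns both d1 and d2 (d2 shares no words with the query), while the sparse path returns only d1. That gap is what the fusion step is meant to close.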

Reciprocal Rank Fusion (RRF)

The most common fusion method. For each document, sum the reciprocal of its rank in each result list:

RRF_score(doc) = Σ 1 / (k + rank_in_list_i)

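where rank_in_list_i is the document's 1-based rank in result list i (a list that does not contain the document contributes nothing) and k is a smoothing constant (commonly 60). Because RRF uses only ranks, it is robust to differences in score scale and does not require score normalization across systems.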
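A small worked example may help. Assume, purely for illustration, that document A is ranked 2nd by the dense retriever and 3rd by the sparse retriever, while document B is ranked 1st by the dense retriever and is absent from the sparse list. With k = 60:

```latex
\begin{align*}
\mathrm{RRF}(A) &= \frac{1}{60 + 2} + \frac{1}{60 + 3} \approx 0.0161 + 0.0159 = 0.0320 \\
\mathrm{RRF}(B) &= \frac{1}{60 + 1} \approx 0.0164
\end{align*}
```

A outranks B even though B tops one list, which shows how RRF rewards agreement between the two retrievers.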

Weighted Score Fusion

Alternatively, normalize scores from each system to [0, 1] and combine with a tunable weight α:

final_score = α × vector_score + (1 - α) × keyword_score

A value of α = 0.7 (favoring semantic) is a common starting point.
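The following sketch shows one way to implement this. It assumes min-max normalization and treats a document missing from one list as scoring 0 on that side; both are choices made for the example rather than requirements of the method, and the function names and sample scores are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so give every document the same weight.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def weighted_fusion(vector_scores, keyword_scores, alpha=0.7):
    """Combine normalized scores as alpha * vector + (1 - alpha) * keyword."""
    v = min_max(vector_scores)
    kw = min_max(keyword_scores)
    fused = {
        # Assumption: a document absent from one list contributes 0 there.
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in v.keys() | kw.keys()
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Made-up raw scores on different scales (cosine similarity vs. BM25).
vector_scores = {"a": 0.82, "b": 0.78, "c": 0.40}
keyword_scores = {"c": 12.0, "a": 7.0, "d": 2.0}
ranked = weighted_fusion(vector_scores, keyword_scores)
```

Unlike RRF, this approach is sensitive to how scores are normalized, so α usually needs tuning against a set of representative queries.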

Implementation Example

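# Pseudocode: hybrid search with RRF
# vector_db, bm25_index, query_vector, and query_text are placeholders for your own stack.
from collections import defaultdict

top_k = 10  # number of fused results to return
k = 60      # RRF smoothing constant

dense_results = vector_db.search(query_vector, top_k=20)
sparse_results = bm25_index.search(query_text, top_k=20)

rrf_scores = defaultdict(float)

# enumerate() is 0-based, so rank + 1 gives the 1-based rank that RRF expects
for rank, doc in enumerate(dense_results):
    rrf_scores[doc.id] += 1 / (k + rank + 1)

for rank, doc in enumerate(sparse_results):
    rrf_scores[doc.id] += 1 / (k + rank + 1)

# Sort by fused score, highest first, and keep the top results
final_results = sorted(rrf_scores.items(), key=lambda x: -x[1])[:top_k]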

Hybrid Search in Typesense

Typesense (the search engine powering KnowledgeSDK) has native hybrid search support. It runs vector and keyword retrieval in a single query and fuses the two rankings internally using rank fusion, with alpha setting the weight given to the vector results:

{
  "q": "cancel subscription",
  "query_by": "embedding,content",
  "vector_query": "embedding:([], alpha:0.7)"
}
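To make the role of alpha explicit, the same parameters can be built programmatically. This is a minimal sketch: the helper name `hybrid_search_params` is hypothetical, and the field names `embedding` and `content` are taken from the example above.

```python
def hybrid_search_params(query, alpha=0.7, text_field="content", vector_field="embedding"):
    """Build search parameters matching the JSON example above."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return {
        "q": query,
        "query_by": f"{vector_field},{text_field}",
        # The empty vector [] relies on the collection embedding the query itself.
        "vector_query": f"{vector_field}:([], alpha:{alpha})",
    }

params = hybrid_search_params("cancel subscription")

# With the official `typesense` Python client, these would typically be sent as
#   client.collections["docs"].documents.search(params)
# where "docs" is a hypothetical collection name.
```

Raising alpha shifts weight toward the vector ranking, and lowering it favors keyword matches.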

KnowledgeSDK and Hybrid Search

POST /v1/search automatically uses hybrid retrieval. You send a plain text query; KnowledgeSDK handles embedding generation, parallel retrieval, and score fusion:

curl -X POST https://api.knowledgesdk.com/v1/search \
  -H "x-api-key: knowledgesdk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "SKU-4829-X return policy"}'

This query benefits from both paths: the semantic path handles "return policy" intent, while the keyword path reliably pins the exact SKU.
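The same request can be prepared from Python using only the standard library. This sketch mirrors the curl call above; the `Content-Type` header is an assumption (the request body is JSON), the helper name `build_search_request` is hypothetical, and the response is left unparsed because its shape is not described here.

```python
import json
import urllib.request

API_URL = "https://api.knowledgesdk.com/v1/search"

def build_search_request(query, api_key):
    """Prepare (but do not send) a POST /v1/search request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/json",  # assumed; the body is JSON
        },
        method="POST",
    )

# The key below is the elided placeholder from the curl example, not a real key.
request = build_search_request("SKU-4829-X return policy", "knowledgesdk_live_...")

# To send it:
#   with urllib.request.urlopen(request) as response:
#       raw = response.read()
```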

When to Use Pure Semantic vs Hybrid

Pure semantic: prototype environments, general Q&A over natural-language content
Hybrid: production RAG, technical documentation with product codes, multilingual corpora, high-stakes applications where recall matters