What Is Hybrid Search?
Hybrid search combines two complementary retrieval methods, dense vector search (semantic) and sparse keyword search (lexical), into a single ranked result list. By running both in parallel and merging their results, hybrid search typically achieves better recall and precision than either method alone.
It is a widely recommended default retrieval strategy for production RAG pipelines.
Why Hybrid?
Dense and sparse retrieval have complementary strengths and failure modes:
| Scenario | Dense (vector) | Sparse (keyword) |
|---|---|---|
| Synonym handling | Excellent | Poor |
| Exact term matching | Poor | Excellent |
| Rare proper nouns | Poor | Excellent |
| Short queries | Noisy | Reliable |
| Conceptual similarity | Excellent | Poor |
Hybrid search covers both bases simultaneously.
How Hybrid Search Works
- Dense retrieval: embed the query, run approximate nearest neighbor (ANN) search, and get the top-K documents with similarity scores
- Sparse retrieval: run BM25 (or TF-IDF) against an inverted index and get the top-K documents with relevance scores
- Score fusion: merge and re-rank the two result lists into a unified ranking
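The first two steps do not depend on each other, so they can run concurrently. The following is a minimal sketch, assuming two hypothetical retriever functions (`dense_search` and `sparse_search` are stand-ins with canned results, not a real library API) that each return `(doc_id, score)` pairs sorted best-first:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real retrievers. Each returns (doc_id, score)
# pairs sorted best-first. Replace them with calls to your vector DB and
# keyword index.
def dense_search(query, top_k):
    hits = [("doc_refunds", 0.91), ("doc_returns", 0.88), ("doc_billing", 0.52)]
    return hits[:top_k]

def sparse_search(query, top_k):
    hits = [("doc_sku_4829", 12.4), ("doc_returns", 9.7)]
    return hits[:top_k]

def retrieve_both(query, top_k=20):
    """Run dense and sparse retrieval concurrently; return both ranked lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        dense_future = pool.submit(dense_search, query, top_k)
        sparse_future = pool.submit(sparse_search, query, top_k)
        return dense_future.result(), sparse_future.result()

dense_hits, sparse_hits = retrieve_both("SKU-4829-X return policy", top_k=2)
```

Threads suit this case because real retrievers spend most of their time waiting on network or disk I/O. The two lists are kept separate here so that any fusion method can be applied afterwards.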
Reciprocal Rank Fusion (RRF)
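RRF is one of the most common fusion methods. For each document, sum the reciprocal of its rank in each result list:
RRF_score(doc) = Σ_i 1 / (k + rank_in_list_i)
where rank_in_list_i is the document's 1-based rank in result list i (a document missing from a list contributes nothing for that list) and k is a smoothing constant (commonly 60). Because RRF uses only ranks, it is robust to differences in score scale and does not require score normalization across systems.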
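To make the formula concrete, here is a small sketch that fuses any number of ranked lists of document IDs. The function name `rrf_fuse` and the toy document IDs are illustrative assumptions, not part of any particular library:

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of document IDs (best first) with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

dense = ["a", "b", "c"]   # toy dense ranking
sparse = ["c", "a", "d"]  # toy sparse ranking
fused = rrf_fuse([dense, sparse])
```

In this toy case `a` scores 1/61 + 1/62 and comes first because it sits near the top of both lists, `c` follows with 1/63 + 1/61, and `b` and `d` trail because each appears in only one list.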
Weighted Score Fusion
Alternatively, normalize scores from each system to [0, 1] and combine with a tunable weight α:
final_score = α × vector_score + (1 - α) × keyword_score
A value of α = 0.7 (favoring semantic) is a common starting point.
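A rough sketch of this approach follows. It assumes min-max normalization (one of several possible choices) and treats a document missing from one list as scoring 0 on that side; the function names and sample scores are hypothetical:

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: raw_score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: treat every hit as equally strong
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def weighted_fuse(vector_scores, keyword_scores, alpha=0.7):
    """Combine normalized scores: alpha * vector + (1 - alpha) * keyword."""
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    fused = {
        # Assumption: a document absent from one list scores 0.0 on that side
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

vector_scores = {"a": 0.90, "b": 0.80, "c": 0.70}  # e.g. cosine similarities
keyword_scores = {"c": 12.0, "a": 6.0, "d": 3.0}   # e.g. BM25 scores
ranking = weighted_fuse(vector_scores, keyword_scores, alpha=0.7)
```

Unlike RRF, this method is sensitive to the score distribution of each system, so α is usually worth tuning against a labeled set of queries rather than left at its starting value.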
Implementation Example
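```python
# Pseudocode: hybrid search with RRF
# vector_db, bm25_index, query_vector, and query_text are placeholders
from collections import defaultdict

top_k = 10  # number of fused results to return
k = 60      # RRF smoothing constant

dense_results = vector_db.search(query_vector, top_k=20)
sparse_results = bm25_index.search(query_text, top_k=20)

rrf_scores = defaultdict(float)
for rank, doc in enumerate(dense_results):
    # enumerate() is 0-based, RRF ranks are 1-based, hence rank + 1
    rrf_scores[doc.id] += 1 / (k + rank + 1)
for rank, doc in enumerate(sparse_results):
    rrf_scores[doc.id] += 1 / (k + rank + 1)

# Sort by fused score, highest first, and keep the top_k documents
final_results = sorted(rrf_scores.items(), key=lambda x: -x[1])[:top_k]
```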
Hybrid Search in Typesense
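Typesense (the search engine powering KnowledgeSDK) has native hybrid search support. It runs vector and keyword retrieval in a single query and merges the two rankings internally with rank fusion, where the alpha parameter sets the weight given to the vector ranking: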
```json
{
  "q": "cancel subscription",
  "query_by": "embedding,content",
  "vector_query": "embedding:([], alpha:0.7)"
}
```
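As an illustration of how these parameters might reach a Typesense server, the sketch below builds the URL for the documents search endpoint. The host (`localhost:8108`, Typesense's default port), the collection name `docs`, and the helper name are assumptions for this example:

```python
from urllib.parse import urlencode

def build_search_url(base_url, collection, query, alpha=0.7):
    """Build a Typesense documents-search URL for a hybrid query."""
    params = {
        "q": query,
        "query_by": "embedding,content",
        # An empty vector ([]) leaves query embedding to Typesense;
        # alpha weights the vector side of the fusion
        "vector_query": f"embedding:([], alpha:{alpha})",
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"

# Hypothetical host and collection name
url = build_search_url("http://localhost:8108", "docs", "cancel subscription")
```

A real request would also carry the API key in the `X-TYPESENSE-API-KEY` header.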
KnowledgeSDK and Hybrid Search
POST /v1/search automatically uses hybrid retrieval. You send a plain text query; KnowledgeSDK handles embedding generation, parallel retrieval, and score fusion:
```bash
curl -X POST https://api.knowledgesdk.com/v1/search \
  -H "x-api-key: knowledgesdk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "SKU-4829-X return policy"}'
```
This query benefits from both paths: the semantic path handles "return policy" intent, while the keyword path reliably pins the exact SKU.
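The same request can be prepared from Python with only the standard library. This is a sketch rather than an official client: the helper name and the `YOUR_API_KEY` placeholder are illustrative, the `Content-Type` header is an assumption about what a JSON endpoint expects, and the response format is not covered here, so the request is built but not sent:

```python
import json
import urllib.request

API_URL = "https://api.knowledgesdk.com/v1/search"

def build_search_request(query, api_key):
    """Build (but do not send) a POST request for the search endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )

request = build_search_request("SKU-4829-X return policy", api_key="YOUR_API_KEY")
# To send it: urllib.request.urlopen(request)
```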
When to Use Pure Semantic vs Hybrid
- Pure semantic: prototype environments, general Q&A over natural-language content
- Hybrid: production RAG, technical documentation with product codes, multilingual corpora, high-stakes applications where recall matters