What Is HNSW?
HNSW (Hierarchical Navigable Small World) is a graph-based algorithm for approximate nearest-neighbor (ANN) search. It is the most widely used vector index algorithm in production systems, powering the similarity search layer in Pinecone, Weaviate, Qdrant, pgvector, and Typesense.
HNSW achieves sub-linear query time (effectively O(log n)) with high recall, making it possible to search hundreds of millions of vectors in milliseconds.
The Intuition: Small World Graphs
HNSW builds on the "small world" phenomenon — in a well-connected graph, any node can be reached from any other node in a small number of hops. HNSW exploits this to navigate a graph of vectors efficiently: search begins at an entry point that may be far from the query in embedding space, and each hop moves closer until the approximate nearest neighbors are found.
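The navigation primitive behind this is a plain greedy walk: from the current node, hop to whichever neighbor is closest to the query, and stop at a local minimum. A minimal sketch in pure Python (the graph, vectors, and function names here are illustrative, not any library's API):

```python
import math

def dist(a, b):
    # Euclidean distance between two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_search(graph, vectors, entry, query):
    """Greedily hop to whichever neighbor is closest to the query.

    `graph` maps node id -> list of neighbor ids; `vectors` maps
    node id -> coordinates. Stops at a local minimum — this is the
    navigation step HNSW repeats at every layer.
    """
    current = entry
    while True:
        best = min(graph[current], key=lambda n: dist(vectors[n], query))
        if dist(vectors[best], query) >= dist(vectors[current], query):
            return current  # no neighbor is closer: local minimum
        current = best

# Toy 1-D example: a chain 0-1-2-3-4 plus one long-range "highway" edge 0-4
vectors = {i: [float(i)] for i in range(5)}
graph = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
print(greedy_search(graph, vectors, entry=0, query=[3.2]))  # → 3
```

Note how the highway edge lets the walk jump from node 0 straight to node 4 instead of crawling along the chain — the same role the sparse upper layers play in HNSW.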
The Hierarchical Structure
HNSW organizes vectors into a multi-layer graph:
Layer 2 (sparse): a few long-range "highway" connections
Layer 1 (medium): more connections, shorter range
Layer 0 (dense): every vector, with many short-range connections
- Higher layers — fewer nodes, long-range edges. Used for fast coarse navigation.
- Layer 0 — all vectors with dense local connections. Used for fine-grained search.
Search starts at the top layer with a single entry point, greedily navigates to the layer-0 region nearest to the query, then performs a local beam search at layer 0 to find the top-k candidates.
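The layer-0 beam search is a best-first expansion bounded by a beam width (the `ef` parameter below): expand the closest unvisited candidate, keep the `ef` best results seen so far, and stop once the nearest remaining candidate is worse than the worst kept result. A self-contained sketch, reusing the toy graph idea from above (illustrative code, not a library API):

```python
import heapq
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_layer(graph, vectors, entry, query, ef):
    """Best-first beam search at one layer: expand the closest
    unvisited candidate, keep the ef best results, stop when the
    nearest candidate is farther than the worst kept result."""
    visited = {entry}
    d0 = l2(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap: closest candidate first
    results = [(-d0, entry)]     # max-heap holding the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break                # nearest candidate already too far
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = l2(vectors[nb], query)
            if dn < -results[0][0] or len(results) < ef:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst
    # return node ids sorted by distance to the query
    return [n for _, n in sorted((-d, n) for d, n in results)]

# Toy 1-D chain with a long-range edge 0-4
vectors = {i: [float(i)] for i in range(5)}
graph = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
print(search_layer(graph, vectors, entry=0, query=[3.2], ef=3)[:2])  # → [3, 4]
```

Truncating the sorted result to `k` gives the top-k answer; in full HNSW the entry point for this call is the node the greedy descent through the upper layers landed on.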
Key Parameters
| Parameter | Description | Typical Value |
|---|---|---|
| M | Max edges per node per layer | 16–64 |
| efConstruction | Beam width during index build | 100–500 |
| ef | Beam width during search (runtime) | 50–200 |
- Higher M → better recall, more memory, slower build
- Higher ef → better recall, slower queries
- Higher efConstruction → better index quality, slower build (one-time cost)
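M also controls the shape of the hierarchy: each inserted vector draws its top layer from a geometric-like distribution with multiplier mL = 1/ln(M), so each layer holds roughly 1/M as many nodes as the one below it. A small sketch of that sampling rule (function name and the fixed seed are illustrative):

```python
import math
import random

def random_level(M, rng):
    # Each new vector's top layer: floor(-ln(U) * mL) with mL = 1/ln(M),
    # so the expected layer population shrinks by a factor of ~M per level.
    mL = 1.0 / math.log(M)
    return int(-math.log(rng.random()) * mL)

rng = random.Random(42)  # fixed seed so the tally is reproducible
levels = [random_level(16, rng) for _ in range(100_000)]
counts = {}
for lvl in levels:
    counts[lvl] = counts.get(lvl, 0) + 1
# A node with top layer L appears on every layer 0..L, so layer 0 holds
# all 100k vectors and only ~1/16 of them also reach layer 1.
print({lvl: counts[lvl] for lvl in sorted(counts)})
```

With M=16 roughly 94% of vectors never rise above layer 0, which is why the upper layers stay sparse enough to act as fast "highways."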
HNSW Performance Characteristics
- Query latency — typically 1–10ms for millions of vectors
- Recall at k=10 — typically 95–99% with default parameters
- Memory — approximately 4 bytes × dimensions × number of vectors (for float32), plus graph overhead (~100 bytes per node for M=16)
- Build time — O(n log n) for n vectors
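The memory estimate above is easy to turn into a back-of-envelope calculator. A hypothetical sketch (the function and the ~100 B/node overhead figure are taken from the estimate above for M=16; larger M means more edges and more overhead):

```python
def hnsw_memory_bytes(n, dim, bytes_per_float=4, graph_bytes_per_node=100):
    # Raw float32 vector storage plus per-node graph overhead.
    vectors = n * dim * bytes_per_float
    graph = n * graph_bytes_per_node
    return vectors + graph

# 1M vectors of 768 dims: ~3.07 GB of raw vectors + ~0.1 GB of graph
total = hnsw_memory_bytes(1_000_000, 768)
print(total / 1e9)  # → 3.172 (GB)
```

For large collections the vectors themselves dominate, which is why memory-constrained deployments pair HNSW with quantization rather than shrinking the graph.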
HNSW vs Other ANN Algorithms
| Algorithm | Type | Strengths | Weaknesses |
|---|---|---|---|
| HNSW | Graph | Best recall/speed trade-off, incremental inserts | High memory |
| IVF | Quantization | Lower memory | Needs training, slower updates |
| Flat | Brute force | Perfect recall | O(n) query time |
| ScaNN | Hybrid | High throughput at scale | Complex to configure |
HNSW is preferred when:
- You need high recall (>95%)
- Insertions happen frequently (HNSW is dynamic; IVF requires re-training)
- Latency is more important than memory cost
Deletions and Updates
HNSW does not support true in-place deletion efficiently. Most implementations use soft deletes (flagging vectors as deleted) and periodic index compaction. If your knowledge base changes frequently, choose a vector database that handles this gracefully (e.g., Qdrant's HNSW implementation with collection snapshots).
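The soft-delete pattern can be sketched in a few lines: tombstoned ids stay in the graph (so connectivity is preserved for navigation) but are filtered out of results, and a periodic compaction rebuilds the index without them. This is an illustrative wrapper, not any specific database's API:

```python
class SoftDeleteIndex:
    """Wraps an ANN search function with tombstone filtering."""

    def __init__(self, search_fn):
        self.search_fn = search_fn   # underlying search: (query, k) -> ids
        self.deleted = set()

    def delete(self, vec_id):
        self.deleted.add(vec_id)     # flag only; graph edges are untouched

    def search(self, query, k):
        # Over-fetch, then drop tombstoned ids so k live results survive.
        raw = self.search_fn(query, k + len(self.deleted))
        return [i for i in raw if i not in self.deleted][:k]

# Stub ANN backend that "returns" ids 0..k-1 for any query
idx = SoftDeleteIndex(lambda q, k: list(range(k)))
idx.delete(1)
print(idx.search([0.0], 3))  # → [0, 2, 3]
```

The over-fetch grows with the number of tombstones, which is exactly why implementations compact the index once the deleted fraction gets large.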
HNSW in KnowledgeSDK
When POST /v1/extract indexes a new document, KnowledgeSDK inserts the resulting chunk vectors into a Typesense collection backed by an HNSW index. The index is updated incrementally with each new chunk — there is no batch rebuild step. Queries via POST /v1/search traverse the HNSW graph to find the nearest chunk vectors in real time.