RAG & Retrieval · beginner

Related term: cosine distance

Cosine Similarity

A metric that measures the angle between two vectors, commonly used to compare how semantically similar two embeddings are.

What Is Cosine Similarity?

Cosine similarity measures the cosine of the angle between two vectors in a multi-dimensional space. It is the most commonly used metric for comparing text embeddings in semantic search and RAG pipelines.

A cosine similarity of 1.0 means the vectors point in exactly the same direction (identical meaning). A value of 0.0 means they are orthogonal (unrelated). A value of -1.0 means they point in opposite directions (though in practice, text embeddings rarely produce negative similarities).

The Formula

cosine_similarity(A, B) = (A · B) / (||A|| × ||B||)

Where:

  • A · B is the dot product of vectors A and B
  • ||A|| and ||B|| are the L2 norms (magnitudes) of the vectors

In Python:

import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Example
query_vec = embed("how do I cancel?")
doc_vec   = embed("steps to unsubscribe from the plan")

score = cosine_similarity(query_vec, doc_vec)
# score ≈ 0.89 — highly similar

Why Cosine Over Euclidean Distance?

Cosine similarity ignores vector magnitude and measures only direction. This matters because:

  • Two documents about the same topic but different lengths will have embeddings in the same direction but different magnitudes
  • Cosine similarity correctly identifies them as similar
  • Euclidean distance would score them as dissimilar because of the magnitude difference
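The difference is easy to see with a toy vector standing in for a real embedding (a minimal NumPy sketch; the numbers are illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# A toy "embedding" and a scaled copy: same direction, 3x the magnitude,
# as might happen for a short vs. long document on the same topic.
short_doc = np.array([0.2, 0.4, 0.1])
long_doc = 3 * short_doc

cos = cosine_similarity(short_doc, long_doc)  # 1.0: identical direction
euc = np.linalg.norm(short_doc - long_doc)    # large relative to the vectors

print(cos)  # 1.0 (within floating-point error)
print(euc)  # ~0.92, larger than the norm of short_doc itself
```

Cosine reports a perfect match, while the Euclidean distance between the two vectors is larger than the shorter vector's own magnitude.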

For normalized vectors (unit length), cosine similarity and dot product are equivalent.

Typical Score Ranges for Text Embeddings

Score Range    Interpretation
-----------    --------------
0.90 – 1.00    Near-duplicate or paraphrase
0.75 – 0.90    Highly relevant, same topic
0.60 – 0.75    Related, some overlap
0.40 – 0.60    Weakly related
< 0.40         Likely unrelated

These thresholds vary by embedding model — always calibrate against your own data.
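One simple way to calibrate: score a handful of query–document pairs you have labeled as relevant or irrelevant, then sweep thresholds and keep the one that best separates the two groups. A sketch (the scores below are made-up placeholders; substitute scores from your own embedding model):

```python
import numpy as np

# Similarity scores for labeled pairs: 1 = relevant, 0 = irrelevant.
# These numbers are fabricated for illustration only.
scores = np.array([0.91, 0.83, 0.72, 0.55, 0.48, 0.33])
labels = np.array([1,    1,    1,    0,    0,    0])

best_t, best_acc = 0.0, 0.0
for t in np.arange(0.0, 1.0, 0.01):
    # Accuracy of the rule "relevant iff score >= t" on the labeled pairs
    acc = np.mean((scores >= t) == labels)
    if acc > best_acc:
        best_t, best_acc = t, acc

print(best_t)  # threshold with the highest accuracy on this labeled set
```

With real data you would use many more pairs, and possibly optimize precision or recall rather than raw accuracy, depending on how costly irrelevant context is for your application.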

Cosine Distance

Cosine distance is defined as:

cosine_distance = 1 - cosine_similarity

It converts similarity (higher = better) into distance (lower = better), which some ANN libraries require. Be aware that many vector databases expose a parameter to choose between similarity and distance — make sure your configuration matches your expectations.
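The conversion is a direct translation of the definition above (a minimal sketch; note the distance is 0.0 for identical directions, 1.0 for orthogonal vectors, and 2.0 for opposite directions):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def cosine_distance(a, b):
    # Distance form: lower = more similar, range [0.0, 2.0]
    return 1.0 - cosine_similarity(a, b)

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

print(cosine_distance(a, a))  # 0.0: identical vectors
print(cosine_distance(a, b))  # 1.0: orthogonal vectors
```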

Normalization and Dot Product

If you normalize all vectors to unit length before storing them:

import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

normalized = normalize(embed("some text"))

Then cosine similarity equals the plain dot product. This allows some vector databases to use SIMD-optimized dot product operations instead of the full cosine formula, improving query throughput significantly.
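The equivalence can be checked numerically (a sketch with arbitrary toy vectors standing in for embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def normalize(v):
    return v / np.linalg.norm(v)

# Two arbitrary toy vectors standing in for embeddings.
a = np.array([0.3, 0.1, 0.9])
b = np.array([0.5, 0.4, 0.2])

an, bn = normalize(a), normalize(b)

# For unit-length vectors, the plain dot product IS the cosine similarity.
print(np.dot(an, bn))
print(cosine_similarity(a, b))  # same value
```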

Cosine Similarity in KnowledgeSDK

When you call POST /v1/search, KnowledgeSDK embeds your query and computes cosine similarity between the query vector and all indexed chunk vectors via HNSW. The returned results include a relevance score that reflects this similarity. You can use the score to filter out low-confidence matches before passing chunks to your LLM:

{
  "results": [
    { "content": "...", "score": 0.92, "source": "https://docs.example.com/billing" },
    { "content": "...", "score": 0.78, "source": "https://docs.example.com/faq" }
  ]
}

A common practice is to discard any result below a threshold (e.g., 0.70) to avoid injecting irrelevant context into LLM prompts.
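In application code the filter is a one-liner. A sketch assuming the JSON response above has been parsed into a list of dicts (the third entry and the 0.70 threshold are illustrative):

```python
# Parsed search results; the low-scoring third entry is a made-up example.
results = [
    {"content": "...", "score": 0.92, "source": "https://docs.example.com/billing"},
    {"content": "...", "score": 0.78, "source": "https://docs.example.com/faq"},
    {"content": "...", "score": 0.41, "source": "https://docs.example.com/legal"},
]

THRESHOLD = 0.70  # calibrate against your own data

# Keep only high-confidence chunks for the LLM prompt.
relevant = [r for r in results if r["score"] >= THRESHOLD]
print(len(relevant))  # 2
```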

Related Terms

RAG & Retrieval · beginner
Embedding
A dense numerical vector representation of text, images, or other data that captures semantic meaning in a high-dimensional space.
RAG & Retrieval · beginner
Semantic Search
A search approach that finds results based on meaning and intent rather than exact keyword matching.
RAG & Retrieval · intermediate
Dense Retrieval
A retrieval method that represents both queries and documents as dense vectors and finds matches via nearest-neighbor search.
