Agentic RAG

A RAG architecture where an AI agent autonomously decides when and how to retrieve information, often across multiple retrieval steps.

What Is Agentic RAG?

Agentic RAG combines Retrieval-Augmented Generation (RAG) with the autonomous decision-making of AI agents. In traditional RAG, retrieval happens once per query: the query is embedded, similar documents are fetched, and the LLM answers based on those documents. In agentic RAG, the agent decides dynamically when to retrieve, what to retrieve, how many retrieval steps to perform, and whether the retrieved results are sufficient.

The result is a system that can handle multi-part research questions, such as comparisons across several sources, that a single retrieval pass cannot answer.

Traditional RAG vs. Agentic RAG

Aspect	Traditional RAG	Agentic RAG
Retrieval timing	Fixed — always once per query	Dynamic — agent decides when
Number of retrievals	One	Many, iterative
Query formulation	User's original query	Agent-generated sub-queries
Sufficiency check	None	Agent evaluates and re-retrieves if needed
Data sources	Usually one index	Multiple sources, chosen by agent
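
For contrast with the agentic column, the traditional column can be sketched as one fixed retrieval followed by one generation step. This is a minimal illustration rather than a reference implementation: the word-overlap scorer stands in for embedding similarity, `DOCS`, `retrieve`, and `answer_once` are hypothetical names, and the LLM call is replaced by a stub that only builds the prompt.

```python
from __future__ import annotations

import re

# Toy corpus standing in for an indexed knowledge base (hypothetical data).
DOCS = [
    "Acme Corp offers a starter plan and a team plan.",
    "Vector search ranks documents by embedding similarity.",
    "Keyword search matches exact terms in documents.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the query.

    Word overlap is a stand-in for embedding similarity.
    """
    q = tokenize(query)
    ranked = sorted(DOCS, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def answer_once(query: str) -> str:
    """Traditional RAG: exactly one retrieval and no sufficiency check."""
    context = retrieve(query)
    # A real system would send this prompt to an LLM; here it is returned as-is.
    return f"Context: {context[0]}\nQuestion: {query}"
```

Whatever `retrieve` returns on that single pass is all the model sees, which is the limitation the agentic column addresses.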

How Agentic RAG Works

A typical agentic RAG loop:

Understand the goal — The agent receives a complex question and breaks it into sub-questions.
Formulate a retrieval query — The agent generates a targeted search query optimized for the knowledge base.
Retrieve — A vector search or keyword search returns candidate documents.
Evaluate — The agent assesses whether the retrieved content answers the sub-question. If not, it reformulates and retrieves again.
Synthesize — Once all sub-questions have been answered, the agent combines the findings into a final response.
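
The five steps above can be sketched as a small loop. This is an illustrative sketch under stated assumptions, not a definitive implementation: every agent decision that would normally be an LLM call (decomposing, reformulating, judging sufficiency, synthesizing) is replaced by a simple rule, the knowledge base is an in-memory list, and all names (`decompose`, `formulate`, `is_sufficient`, `MAX_ATTEMPTS`, and so on) are hypothetical. The retry bound is an added safeguard so the evaluate step cannot loop forever.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical in-memory knowledge base; a real system would query an index.
KNOWLEDGE_BASE = [
    "Acme Corp pricing starts at 10 dollars per seat.",
    "Beta Inc pricing starts at 12 dollars per seat.",
]
STOPWORDS = {"what", "is", "the", "of", "how", "does", "a", "an"}
MAX_ATTEMPTS = 3  # assumed safety bound so the retry loop always terminates


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def decompose(question: str) -> list[str]:
    """Step 1 (stub for an LLM call): split a compound question on ' and '."""
    return [p.strip(" ?") for p in question.split(" and ") if p.strip(" ?")]


def formulate(sub_question: str, attempt: int) -> str:
    """Step 2: try the raw text first, then retry with stopwords removed."""
    if attempt == 0:
        return sub_question
    return " ".join(t for t in tokens(sub_question) if t not in STOPWORDS)


def retrieve(query: str) -> list[str]:
    """Step 3: strict keyword search; every query term must appear in the doc."""
    terms = set(tokens(query))
    return [d for d in KNOWLEDGE_BASE if terms and terms <= set(tokens(d))]


def is_sufficient(results: list[str]) -> bool:
    """Step 4 (stub): treat any hit as sufficient; an LLM would judge content."""
    return bool(results)


@dataclass
class Trace:
    findings: list[str] = field(default_factory=list)
    queries: list[str] = field(default_factory=list)


def run_agent(question: str) -> Trace:
    """Run steps 1-4 for each sub-question, recording every query issued."""
    trace = Trace()
    for sub in decompose(question):
        for attempt in range(MAX_ATTEMPTS):
            query = formulate(sub, attempt)
            trace.queries.append(query)
            results = retrieve(query)
            if is_sufficient(results):
                trace.findings.append(results[0])
                break  # sub-question answered; move on to the next one
    return trace


def synthesize(trace: Trace) -> str:
    """Step 5: combine findings (a real agent would ask the LLM to write prose)."""
    return " ".join(trace.findings)
```

Calling `run_agent("What is Acme pricing and what is Beta pricing?")` issues two queries per sub-question in this toy setup: the raw phrasing fails the strict keyword match, and the reformulated query succeeds.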

A Concrete Example

Suppose a user asks: "How does Acme Corp's pricing compare to its three main competitors?"

An agentic RAG system might:

Retrieve Acme's pricing page from the knowledge base.
Identify that competitor information is missing and call KnowledgeSDK's /v1/search to find competitor documents already indexed.
Determine that one competitor's data is stale and call /v1/extract to refresh it from the live website.
Retrieve the updated competitor data.
Synthesize a comparison table.

Each retrieval step was driven by the agent's own assessment of what was still missing.
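
The staleness check in this walkthrough can be sketched as follows. Everything here is an assumption made for illustration: `search_index` and `extract_live` are local stand-ins for calls to /v1/search and /v1/extract (no network requests are made and the real request and response shapes are not shown in this article), the competitor names and the 30-day freshness threshold are invented, and the clock is fixed so the result is reproducible.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)  # assumed freshness threshold
NOW = datetime(2025, 1, 31, tzinfo=timezone.utc)  # fixed clock for the demo


@dataclass
class Doc:
    company: str
    text: str
    fetched_at: datetime


# Hypothetical index contents: one competitor entry is months old.
INDEX = {
    "Acme": Doc("Acme", "Acme pricing page", NOW - timedelta(days=2)),
    "Beta": Doc("Beta", "Beta pricing page", NOW - timedelta(days=5)),
    "Gamma": Doc("Gamma", "Gamma pricing page (old)", NOW - timedelta(days=200)),
    "Delta": Doc("Delta", "Delta pricing page", NOW - timedelta(days=10)),
}


def search_index(company: str) -> Doc | None:
    """Stand-in for a /v1/search call: look up what is already indexed."""
    return INDEX.get(company)


def extract_live(company: str) -> Doc:
    """Stand-in for a /v1/extract call: pretend to re-fetch the live page."""
    doc = Doc(company, f"{company} pricing page (refreshed)", NOW)
    INDEX[company] = doc  # refreshed content becomes searchable
    return doc


def gather(companies: list[str]) -> tuple[dict[str, Doc | None], list[str]]:
    """Collect one fresh document per company, logging each action taken."""
    found: dict[str, Doc | None] = {}
    actions: list[str] = []
    for name in companies:
        doc = search_index(name)
        actions.append(f"search:{name}")
        if doc is None or NOW - doc.fetched_at > MAX_AGE:
            extract_live(name)
            actions.append(f"extract:{name}")
            doc = search_index(name)  # retrieve the updated entry
            actions.append(f"search:{name}")
        found[name] = doc
    return found, actions
```

In this sketch only the stale entry triggers an extract followed by a second search; the other three companies are served directly from the index.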

Why Agentic RAG Outperforms Standard RAG

Handles ambiguous queries — The agent can clarify and decompose the question rather than guess at a single retrieval.
Handles knowledge gaps — If the index does not have the answer, the agent can fetch it from the live web.
Reduces hallucination — Multiple targeted retrievals provide more grounded context than a single broad one.
Adapts to complexity — Simple questions get single retrievals; complex questions get multi-step research automatically.

Building Agentic RAG with KnowledgeSDK

KnowledgeSDK is well-suited as the retrieval layer for agentic RAG systems. Use /v1/search for semantic search over your indexed knowledge, and /v1/extract as the fallback for live retrieval when indexed content is insufficient. The agent decides when each endpoint is appropriate — you just provide the capabilities.
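
One way to "provide the capabilities" is to describe each endpoint as a tool the agent can call by name. The sketch below is an assumption-heavy illustration, not KnowledgeSDK's client API: only the two endpoint paths come from this article, while the tool names, the `query` and `url` argument fields, and the `Transport` callable are hypothetical. The HTTP layer is injected so that base URL, authentication, and payload format stay outside the example; consult the official API reference for those details.

```python
from __future__ import annotations

from typing import Callable

# A transport takes (path, payload) and returns the decoded response.
# Injecting it avoids guessing the real base URL, auth, or payload format.
Transport = Callable[[str, dict], dict]

# Tool descriptions an agent's LLM could be shown. Endpoint paths are from
# the article; tool names and argument fields are hypothetical.
TOOLS = {
    "search_knowledge": {
        "path": "/v1/search",
        "description": "Semantic search over already-indexed knowledge.",
        "args": ["query"],
    },
    "extract_live": {
        "path": "/v1/extract",
        "description": "Fetch fresh content from the live website.",
        "args": ["url"],
    },
}


def call_tool(name: str, args: dict, transport: Transport) -> dict:
    """Validate an agent-issued tool call and forward it to its endpoint."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    spec = TOOLS[name]
    missing = [a for a in spec["args"] if a not in args]
    if missing:
        raise ValueError(f"missing arguments for {name}: {missing}")
    # Forward only the declared arguments, dropping anything unexpected.
    return transport(spec["path"], {a: args[a] for a in spec["args"]})


def fake_transport(path: str, payload: dict) -> dict:
    """Offline stand-in for HTTP so the example runs without a network."""
    return {"path": path, "echo": payload}
```

For example, `call_tool("search_knowledge", {"query": "Acme pricing"}, fake_transport)` routes to /v1/search. Swapping `fake_transport` for a real HTTP function is the only change needed once the actual request format is known.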