What Is LLM Hallucination?
Hallucination refers to the phenomenon where a large language model generates text that is fluent and confident-sounding but factually wrong, internally contradictory, or entirely fabricated. The term borrows from psychology, where it describes perceptions that occur without an external stimulus.
Common examples include:
- Inventing citations for books or papers that do not exist.
- Fabricating statistics, dates, or product specifications.
- Describing API endpoints or function signatures that were never released.
- Confidently stating incorrect historical facts.
The alarming quality of hallucinations is how well they blend in: the model's tone carries the same confidence whether it is stating a verified fact or inventing one.
Why Do LLMs Hallucinate?
LLMs are trained to predict the most statistically likely next token, not to retrieve verified facts from a database. This has several consequences:
- Knowledge gaps — If the training data contained little or no information about a topic, the model interpolates or invents details rather than saying "I don't know."
- Training data noise — Incorrect information present in the training corpus gets encoded alongside correct information.
- Distributional pressure — The model is rewarded for fluency and coherence during training; factual accuracy is a harder signal to reinforce.
- Context drift — In long conversations, models can lose track of earlier context and contradict themselves.
Types of Hallucinations
| Type | Description | Example |
|---|---|---|
| Factual | Incorrect real-world facts | Wrong founding year of a company |
| Attribution | Fake citations or quotes | Inventing a paper by a real author |
| Logical | Internal self-contradiction | Claiming X then later claiming not-X |
| Entity | Wrong names, places, or entities | Assigning a quote to the wrong person |
Mitigating Hallucinations
Retrieval-Augmented Generation (RAG)
RAG is the most effective mitigation in production systems: supply the model with verified source documents at query time and instruct it to answer only from those sources.
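A minimal sketch of that flow is below. The `retrieve` function here uses naive keyword overlap purely as a stand-in for the semantic search your vector store would provide, and the document shape is a placeholder:

```javascript
// Hypothetical retrieval step -- swap in your vector store of choice.
// Naive keyword-overlap scoring stands in for real semantic search.
function retrieve(query, documents, topK = 3) {
  const terms = query.toLowerCase().split(/\s+/);
  return documents
    .map(doc => ({
      doc,
      score: terms.filter(t => doc.text.toLowerCase().includes(t)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ doc }) => doc);
}

// Build a grounded prompt that restricts the model to the retrieved sources.
function buildRagPrompt(query, sources) {
  const context = sources.map((s, i) => `[${i + 1}] ${s.text}`).join("\n");
  return `Answer only from the context below. If the answer is not there, say "I don't know."\n\nContext:\n${context}\n\nQuestion: ${query}`;
}
```

The key design point is that the prompt both supplies the context and forbids answers from outside it; either half alone is much weaker.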
Grounding Prompts
Explicitly instruct the model to say "I don't know" when uncertain:
```
Answer only using the provided context. If the answer is not in the context, say "I don't have enough information."
```
Citation Requirements
Force the model to cite specific passages from the source material, making fabrications easier to detect.
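Citations also make fabrication detectable mechanically. As a sketch, if the model is told to quote its sources verbatim in double quotes, a simple post-check can flag any quote that does not actually appear in the source text (the quoting convention here is an assumption, not a standard):

```javascript
// Return every quoted passage in the answer that does NOT appear
// verbatim in the supplied source text. Assumes quotes use "...".
function verifyQuotes(answer, sourceText) {
  const quotes = [...answer.matchAll(/"([^"]+)"/g)].map(m => m[1]);
  return quotes.filter(q => !sourceText.includes(q));
}
```

Any passage this returns is either paraphrased or fabricated and deserves a closer look.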
Guardrails and Output Validation
Use a second LLM pass or a rule-based checker to validate claims in the output against known facts or source documents.
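A rule-based checker can be as simple as flagging specifics that have no support in the sources. The sketch below checks one easy-to-verify claim type, numbers; real guardrails would cover names, dates, and quotes as well:

```javascript
// Rule-based guardrail sketch: flag any number in the model's answer
// that does not appear anywhere in the source documents.
function findUnsupportedNumbers(answer, sources) {
  const sourceText = sources.join(" ");
  const numbers = answer.match(/\d+(?:\.\d+)?/g) ?? [];
  return numbers.filter(n => !sourceText.includes(n));
}
```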
Temperature Control
Lower temperature values reduce sampling randomness, which can cut down on hallucinations in factual retrieval tasks, though it does not eliminate them.
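In practice this is a one-line change to the request payload. The sketch below follows the common chat-completion request shape; the model name is a placeholder for whatever your provider expects:

```javascript
// Build a chat-completion request tuned for factual lookups.
// The model name is a placeholder; the payload shape follows the
// common chat-completion convention.
function factualRequest(question) {
  return {
    model: "your-model-name",
    messages: [{ role: "user", content: question }],
    temperature: 0, // deterministic-leaning sampling for factual tasks
    top_p: 1,
  };
}
```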
KnowledgeSDK and Hallucination Reduction
A major source of hallucinations in production RAG systems is stale or missing knowledge. When models lack accurate context, they fill in the gaps.
KnowledgeSDK reduces this risk by giving your LLM fresh, structured knowledge extracted directly from live web sources. The /v1/extract endpoint retrieves and cleans page content, while /v1/search provides semantic search over your indexed knowledge base — so the model always receives grounded, relevant context rather than relying on parametric memory.
```javascript
const results = await sdk.search({ query: "product pricing", topK: 5 });
const context = results.map(r => r.content).join("\n\n");
// Inject `context` into your RAG prompt to reduce hallucinations
```
Hallucination is not fully solved, but it is manageable with the right architecture.