What Is Chain of Thought?
Chain of Thought (CoT) is a prompting technique where a language model is instructed — or implicitly encouraged — to produce intermediate reasoning steps before arriving at its final answer. Instead of jumping directly to a conclusion, the model "thinks aloud," working through the logic step by step.
The technique was first documented in a 2022 Google Research paper (Wei et al., "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models") and has since become one of the most widely adopted methods for improving LLM reasoning on complex tasks.
Why It Works
LLMs generate text token by token. When forced to write out reasoning steps, the model effectively allocates more of its computation to the problem before committing to an answer. This has two benefits:
- Accuracy — Intermediate steps catch arithmetic errors, logical gaps, and incorrect assumptions.
- Interpretability — Humans can follow the reasoning and, when the model errs, pinpoint exactly where it went wrong.
Without chain of thought, a model asked a multi-step math problem often produces a confident but wrong answer. With CoT, it works through each step and reaches the correct answer far more reliably.
How to Elicit Chain of Thought
There are two primary methods:
Zero-Shot CoT
Simply append "Let's think step by step" (or similar) to your prompt. The model interprets this as an instruction to reason before answering.
Q: If a store has 120 apples and sells 45, then receives a shipment of 60, how many apples does it have?
Let's think step by step.
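As a minimal sketch, the zero-shot cue is just string concatenation before the prompt is sent to whatever completion API you use (the `zero_shot_cot` helper name is illustrative, not from any library):

```python
def zero_shot_cot(question: str) -> str:
    """Build a zero-shot CoT prompt by appending the reasoning cue."""
    return f"Q: {question}\nLet's think step by step."

# The prompt string below would then be sent to the model of your choice.
prompt = zero_shot_cot(
    "If a store has 120 apples and sells 45, then receives a "
    "shipment of 60, how many apples does it have?"
)
```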
Few-Shot CoT
Provide example question-and-reasoning-answer pairs in the prompt. The model learns the desired reasoning format from the examples and applies it to new questions.
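A few-shot CoT prompt can be assembled like this; the example content and the "The answer is …" closing phrase are illustrative conventions, not a required format:

```python
# Each example pairs a question with worked reasoning and a final
# answer, so the model imitates the reasoning format on new questions.
EXAMPLES = [
    {
        "question": "A box holds 8 pens. How many pens are in 3 boxes?",
        "reasoning": "Each box holds 8 pens, so 3 boxes hold 3 * 8 = 24 pens.",
        "answer": "24",
    },
]

def few_shot_cot(question: str, examples=EXAMPLES) -> str:
    """Build a few-shot CoT prompt from worked examples plus a new question."""
    parts = [
        f"Q: {ex['question']}\nA: {ex['reasoning']} The answer is {ex['answer']}."
        for ex in examples
    ]
    parts.append(f"Q: {question}\nA:")  # the model continues from here
    return "\n\n".join(parts)

prompt = few_shot_cot("A crate holds 12 mugs. How many mugs are in 5 crates?")
```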
Chain of Thought in Agent Contexts
CoT is a foundational building block for AI agents. Frameworks like ReAct extend CoT by alternating reasoning steps with real-world actions. In an agent loop, the model's "Thought" phase is essentially a CoT step — the model reasons about what it knows and what it should do next before taking action.
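One half of that loop is parsing the model's Thought/Action output. The sketch below assumes the common ReAct textual convention (`Thought: …` followed by `Action: tool[input]`); the exact format varies by framework:

```python
import re

def parse_react_step(text: str):
    """Extract the Thought and Action from one ReAct-style step.

    Assumes the model emits lines of the form:
        Thought: <reasoning>
        Action: <tool>[<input>]
    This format is a convention, not a universal standard.
    """
    thought = re.search(r"Thought:\s*(.+)", text)
    action = re.search(r"Action:\s*(\w+)\[(.*?)\]", text)
    return (
        thought.group(1) if thought else None,
        (action.group(1), action.group(2)) if action else None,
    )

step = "Thought: I need the current price.\nAction: lookup[AAPL]"
```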
When an agent is processing extracted content from KnowledgeSDK's /v1/extract endpoint, CoT helps the model:
- Determine which fields are relevant to the user's question.
- Reason about whether the extracted data is sufficient or whether another extraction is needed.
- Synthesize information from multiple sources into a coherent answer.
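The second bullet can be prompted explicitly. The sketch below builds a CoT sufficiency check over extracted fields; the flat field-to-value dict is an assumption, so adapt it to the actual `/v1/extract` response schema:

```python
import json

def sufficiency_prompt(question: str, extracted: dict) -> str:
    """Build a CoT prompt asking whether extracted data answers the question.

    `extracted` is assumed to be a flat field->value dict; the real
    /v1/extract response shape may differ.
    """
    return (
        f"Question: {question}\n"
        f"Extracted fields:\n{json.dumps(extracted, indent=2)}\n"
        "Let's think step by step: which fields are relevant, is the data "
        "sufficient to answer, and if not, what should be extracted next?"
    )

prompt = sufficiency_prompt(
    "When was the company founded?", {"name": "Acme", "founded": "1998"}
)
```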
Limitations
Chain of thought is not a silver bullet:
- Longer outputs — CoT increases token usage and latency.
- Plausible but wrong reasoning — The model can produce a coherent-sounding reasoning chain that leads to an incorrect conclusion.
- Not always necessary — For simple lookup tasks, CoT adds overhead without benefit.
Use CoT selectively — it is most valuable for tasks requiring multi-step logic, math, planning, or synthesis of complex information.
Variants
- Self-Consistency CoT — Generate multiple reasoning chains and take the most common answer.
- Tree of Thoughts — Explore multiple reasoning branches in parallel rather than a single chain.
- Program of Thought — Express reasoning as executable code rather than natural language.
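Self-consistency reduces to a majority vote over the final answers of several sampled chains. A minimal sketch, assuming each chain ends with an "…answer is X" line (the extraction rule is illustrative, not part of the original method):

```python
from collections import Counter

def self_consistent_answer(chains: list[str]) -> str:
    """Majority-vote over the final answers of several reasoning chains.

    Assumes each chain's last word is its final answer; real
    implementations use a more robust answer extractor.
    """
    answers = [chain.rstrip(".").split()[-1] for chain in chains]
    return Counter(answers).most_common(1)[0][0]

# Three sampled chains for the apples problem; two agree on 135.
chains = [
    "120 - 45 = 75, then 75 + 60 = 135. The answer is 135",
    "120 - 45 + 60 = 135. The answer is 135",
    "45 + 60 = 105, so 120 + 105 = 225. The answer is 225",
]
```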