# SERP API vs Content Scraping API: Two Different Tools for AI Agents
A common source of confusion in the AI agent ecosystem is the difference between a SERP API and a content scraping API. Developers frequently reach for one when they need the other, or try to use one tool to do the job of both. The result is either an agent that finds URLs but cannot read them, or one that reads pages but cannot find the right ones to begin with.
These are two fundamentally different tools that solve different parts of the same problem. Understanding when to use each — and how to chain them together — is one of the most useful architectural decisions you will make when building AI agents that need live web access.
## What a SERP API Returns
SERP stands for Search Engine Results Page. A SERP API gives you programmatic access to the results you would see if you typed a query into Google, Bing, or another search engine. The response contains:
- A list of result objects, each with a title, URL, and snippet (the 1-2 sentence preview text shown in the search results)
- Metadata about the results (position, domain, result type)
- Sometimes structured features: knowledge panels, answer boxes, related questions, shopping results, local map packs
What a SERP API does not give you is the actual content of any of those pages. You get the title and a 100-150 character snippet — not the 2,000-word article the snippet was extracted from.
Popular SERP APIs include:
- **SerpApi** — supports Google, Bing, YouTube, DuckDuckGo
- **Serper.dev** — Google Search API, very fast
- **Oxylabs SERP** — large-scale enterprise search scraping
- **Brave Search API** — independent index, no Google dependency
- **Tavily** — search API built specifically for LLM agents (returns curated snippets with source URLs)
A typical SERP API response looks like:
```json
{
  "query": "how to configure nginx reverse proxy",
  "results": [
    {
      "title": "How To Configure Nginx as a Reverse Proxy on Ubuntu 22.04",
      "url": "https://www.digitalocean.com/community/tutorials/how-to-configure-nginx",
      "snippet": "This tutorial explains how to set up Nginx as a reverse proxy for one or more backend servers, useful for load balancing and SSL termination...",
      "position": 1
    },
    {
      "title": "NGINX Reverse Proxy | NGINX Documentation",
      "url": "https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/",
      "snippet": "Proxying is typically used to distribute the load among several servers, seamlessly show content from different websites...",
      "position": 2
    }
  ]
}
```
Notice what is missing: the actual tutorial content, the code blocks, the step-by-step instructions. The snippet is not enough to answer a technical question.
## What a Content Scraping API Returns
A content scraping API (also called a web extraction API or web reader API) takes a URL and returns the full content of that page — typically cleaned, converted to markdown, and ready for LLM consumption.
For the first result above, a content scraping API would return something like:
````markdown
# How To Configure Nginx as a Reverse Proxy on Ubuntu 22.04

## Introduction

A reverse proxy is a server that sits in front of web servers and forwards
client requests to those web servers. Reverse proxies are typically implemented
to increase security, performance, and reliability.

## Prerequisites

- One Ubuntu 22.04 server with a non-root sudo user and a firewall configured
- Nginx installed on your server

...

## Step 1 — Installing Nginx

To install Nginx, run the following command:

```bash
sudo apt update
sudo apt install nginx
```

... [continues for 2,000+ words]
````
Popular content scraping APIs include:
- **KnowledgeSDK** — full extraction with structured data, semantic search, webhooks
- **Firecrawl** — markdown extraction with JS rendering
- **Jina Reader** — simple URL-to-markdown
- **Apify** — general-purpose scraping platform
- **Diffbot** — structured entity extraction
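The simplest of these to try is Jina Reader, which turns a URL into markdown with a single GET request by prefixing the target URL. A minimal sketch (the prefix pattern follows Jina's public docs; check their reference for auth, rate limits, and options):

```typescript
// Fetch a page as LLM-ready markdown by prefixing the URL with r.jina.ai
const target = "https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/";
const res = await fetch(`https://r.jina.ai/${target}`);
const markdown = await res.text();

console.log(markdown.slice(0, 500)); // first 500 characters of the extracted page
```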
---
## The Right Architecture: Search First, Then Read
The correct mental model is a two-stage pipeline:
- **Stage 1: SERP API.** Query → `[title, url, snippet] × N results`. Purpose: find the right pages.
- **Stage 2: Content API.** URL → full cleaned markdown. Purpose: read those pages.
Stage 1 is cheap and fast. A SERP API call costs $0.001-$0.005 and returns in under a second. Use it to identify the 2-3 most relevant URLs for a query.
Stage 2 is more expensive and takes a bit longer. A content extraction call costs $0.002-$0.010 and takes 1-5 seconds. Use it selectively, only on the URLs that Stage 1 identified as relevant.
The key insight: **never send all N search results to the content API**. The top 2-3 results usually contain the answer. Extracting all 10 is 3-5x more expensive and fills the LLM's context window with noise.
---
## Implementation: Two-Stage Web Agent
### Node.js
```typescript
import KnowledgeSDK from "@knowledgesdk/node";
import { OpenAI } from "openai";
const ks = new KnowledgeSDK({ apiKey: process.env.KNOWLEDGESDK_API_KEY! });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
interface SearchResult {
title: string;
url: string;
snippet: string;
}
// Stage 1: Get relevant URLs from Serper
async function searchWeb(query: string, numResults = 5): Promise<SearchResult[]> {
const response = await fetch("https://google.serper.dev/search", {
method: "POST",
headers: {
"X-API-KEY": process.env.SERPER_API_KEY!,
"Content-Type": "application/json",
},
body: JSON.stringify({ q: query, num: numResults }),
});
const data = await response.json();
return (data.organic ?? []).map((r: any) => ({
title: r.title,
url: r.link,
snippet: r.snippet,
}));
}
// Stage 2: Read full content from top results
async function readPages(urls: string[]): Promise<{ url: string; content: string }[]> {
const results = await Promise.all(
urls.map(async (url) => {
try {
const extracted = await ks.extract(url);
return { url, content: extracted.markdown };
} catch {
return { url, content: "" };
}
})
);
return results.filter((r) => r.content.length > 0);
}
// Full pipeline: search + read + answer
async function researchAgent(question: string): Promise<string> {
// Step 1: Search for relevant pages
console.log(`Searching: "${question}"`);
const results = await searchWeb(question, 5);
// Step 2: Use LLM to pick which URLs to actually read
// (saves cost by not reading all 5)
const selectionResponse = await openai.chat.completions.create({
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: `Given a question and search results, select the 2-3 most relevant URLs to read in full.
Return a JSON object: {"urls": ["url1", "url2"]}`,
},
{
role: "user",
content: `Question: ${question}\n\nSearch results:\n${results
.map((r) => `- ${r.title}\n URL: ${r.url}\n Snippet: ${r.snippet}`)
.join("\n\n")}`,
},
],
response_format: { type: "json_object" },
});
const { urls: selectedUrls } = JSON.parse(
selectionResponse.choices[0].message.content!
);
// Step 3: Read the selected pages
console.log(`Reading ${selectedUrls.length} pages...`);
const pages = await readPages(selectedUrls);
// Step 4: Answer based on full content
const context = pages
.map((p) => `Source: ${p.url}\n\n${p.content.slice(0, 3000)}`)
.join("\n\n---\n\n");
const answerResponse = await openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: "Answer the user's question based on the provided web content. Cite your sources.",
},
{
role: "user",
content: `Question: ${question}\n\nWeb content:\n${context}`,
},
],
});
return answerResponse.choices[0].message.content!;
}
// Usage
const answer = await researchAgent(
"What are the breaking changes in React 19 for server components?"
);
console.log(answer);
```

### Python

```python
import os
import json
import httpx
from openai import AsyncOpenAI
import knowledgesdk
ks = knowledgesdk.Client(api_key=os.environ["KNOWLEDGESDK_API_KEY"])
openai = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
async def search_web(query: str, num_results: int = 5) -> list[dict]:
"""Stage 1: Get URLs and snippets from Serper"""
async with httpx.AsyncClient() as client:
response = await client.post(
"https://google.serper.dev/search",
headers={"X-API-KEY": os.environ["SERPER_API_KEY"]},
json={"q": query, "num": num_results}
)
data = response.json()
return [
{"title": r["title"], "url": r["link"], "snippet": r.get("snippet", "")}
for r in data.get("organic", [])
]
async def read_pages(urls: list[str]) -> list[dict]:
"""Stage 2: Read full content from URLs"""
results = []
for url in urls:
try:
extracted = ks.extract(url)
if extracted.get("markdown"):
results.append({"url": url, "content": extracted["markdown"]})
except Exception as e:
print(f"Failed to read {url}: {e}")
return results
async def research_agent(question: str) -> str:
# Stage 1: Search
print(f'Searching: "{question}"')
search_results = await search_web(question, num_results=5)
# LLM selects which URLs to read
selection = await openai.chat.completions.create(
model="gpt-4o-mini",
response_format={"type": "json_object"},
messages=[
{
"role": "system",
"content": "Select the 2-3 most relevant URLs to read for answering the question. "
'Return JSON: {"urls": ["url1", "url2"]}'
},
{
"role": "user",
"content": f"Question: {question}\n\nResults:\n" + "\n\n".join(
f"- {r['title']}\n URL: {r['url']}\n Snippet: {r['snippet']}"
for r in search_results
)
}
]
)
selected_urls = json.loads(selection.choices[0].message.content)["urls"]
# Stage 2: Read pages
print(f"Reading {len(selected_urls)} pages...")
pages = await read_pages(selected_urls)
# Build context and answer
context = "\n\n---\n\n".join(
f"Source: {p['url']}\n\n{p['content'][:3000]}"
for p in pages
)
answer = await openai.chat.completions.create(
model="gpt-4o",
messages=[
{
"role": "system",
"content": "Answer the question based on the web content. Cite sources."
},
{
"role": "user",
"content": f"Question: {question}\n\nWeb content:\n{context}"
}
]
)
return answer.choices[0].message.content
```
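Usage mirrors the Node.js version; a minimal entry point, run under asyncio:

```python
import asyncio

if __name__ == "__main__":
    answer = asyncio.run(research_agent(
        "What are the breaking changes in React 19 for server components?"
    ))
    print(answer)
```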
## SERP API vs Content API: Feature Comparison
| Feature | SERP API | Content Scraping API |
|---|---|---|
| What you get | Title, URL, snippet | Full page markdown/HTML |
| Response time | 200ms - 1s | 1s - 10s |
| Cost per call | $0.001 - $0.005 | $0.002 - $0.010 |
| Use case | Finding relevant pages | Reading page content |
| JS rendering | Not applicable | Essential for SPAs |
| Structured extraction | Partial (snippets) | Full structured data |
| Anti-bot handling | Handled by the provider | Required (CAPTCHAs, fingerprinting) |
| Freshness | Minutes to days (index lag) | Real-time (page fetched on request) |
| Throughput | High (thousands of queries/min) | Lower (browser rendering overhead) |
"Search-Then-Scrape" vs Unified APIs Like Perplexity
An alternative to the two-stage architecture is a unified search + answer API like Perplexity's API or Tavily. These return both search results and synthesized answers in a single call.
| Approach | Pros | Cons |
|---|---|---|
| Search-then-scrape (SERP + content API) | Full page content, customizable prompts, full control over each stage | Two API calls, more latency |
| Unified API (Perplexity, Tavily) | Single call, fast, curated snippets | Snippets only (not full content), limited customization |
| Content API alone (no SERP) | Full content from known URLs | Cannot discover new pages |
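For contrast with the two-stage pipeline above, a unified call collapses both stages into one request. A sketch against Tavily's search endpoint (the request and response field names follow Tavily's docs at the time of writing; verify against the current reference before relying on them):

```typescript
// One call: Tavily searches, reads, and synthesizes an answer with sources
const res = await fetch("https://api.tavily.com/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    api_key: process.env.TAVILY_API_KEY,
    query: "What are the breaking changes in React 19 for server components?",
    include_answer: true, // ask for a synthesized answer, not just results
  }),
});
const data = await res.json();

console.log(data.answer);                         // synthesized answer
console.log(data.results.map((r: any) => r.url)); // source URLs with snippets
```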
Use the unified API approach when:
- You need fast answers from web search
- Snippet-level content is sufficient
- You are building a general-purpose chat agent
Use the search-then-scrape approach when:
- You need the full content of a page (not just snippets)
- You are building a domain-specific knowledge base
- You need structured data extraction from the scraped content
- You are doing research that requires reading complete articles or documentation
For most serious AI agent applications — documentation Q&A, competitive intelligence, regulatory compliance monitoring, technical research — you need the full content, not snippets. That is where a content scraping API becomes essential.
## Choosing the Right SERP API
When picking a SERP API to combine with KnowledgeSDK, consider:
- **Serper.dev** — best price/performance for Google Search, good for rapid prototyping
- **SerpApi** — widest platform support (YouTube, Maps, Images, Shopping), good for diverse data types
- **Brave Search API** — independent index (not Google), useful when you want to avoid Google's bias toward large publishers
- **Tavily** — built specifically for LLM agents, returns curated snippets with relevance scores
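Because Stage 1 only needs to produce a list of titles, URLs, and snippets, swapping providers is a one-function change. Here is a sketch of a Brave-backed drop-in for the `searchWeb` function from the implementation section, reusing its `SearchResult` interface (endpoint, header, and response field names follow Brave's public docs; verify against the current reference):

```typescript
// Drop-in alternative to the Serper-backed searchWeb, using Brave Search
async function searchWebBrave(query: string, numResults = 5): Promise<SearchResult[]> {
  const params = new URLSearchParams({ q: query, count: String(numResults) });
  const response = await fetch(
    `https://api.search.brave.com/res/v1/web/search?${params}`,
    {
      headers: {
        Accept: "application/json",
        "X-Subscription-Token": process.env.BRAVE_API_KEY!,
      },
    }
  );
  const data = await response.json();
  // Brave nests organic results under web.results; description ≈ snippet
  return (data.web?.results ?? []).map((r: any) => ({
    title: r.title,
    url: r.url,
    snippet: r.description,
  }));
}
```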
## Start Building
KnowledgeSDK handles Stage 2 of the pipeline — the full content extraction — so you can pair it with any SERP API you prefer for Stage 1. The `/v1/extract` endpoint returns clean LLM-ready markdown from any URL, with JavaScript rendering, anti-bot bypass, and automatic retry built in.
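If you prefer raw HTTP over the SDK, the extraction is a single request. A sketch for illustration only (the base URL and payload shape here are assumptions; check the API reference for the exact contract):

```typescript
// Hypothetical direct call to the /v1/extract endpoint — base URL is assumed
const res = await fetch("https://api.knowledgesdk.com/v1/extract", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.KNOWLEDGESDK_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/",
  }),
});
const { markdown } = await res.json();
```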
Get your API key at knowledgesdk.com and use it as the content layer in your search-then-scrape pipeline. The free tier includes 500 scrapes per month — enough to experiment with the full two-stage architecture before scaling.