Integration · March 19, 2026 · 12 min read

Using KnowledgeSDK with Vercel AI SDK for Web-Aware Chat Apps

Build a Next.js chat app that scrapes URLs and searches knowledge using Vercel AI SDK tool calling and KnowledgeSDK, with full streaming support.

The Vercel AI SDK's tool calling API is one of the cleanest interfaces for giving language models access to external capabilities. You define tools as typed functions, the model decides when to call them, and the results flow back into the response — all with streaming support built in.

KnowledgeSDK pairs naturally with this model. Its three core endpoints — scrape, search, and extract — map directly to tools a chat application needs to give users answers grounded in live web data. This guide builds a complete Next.js chat application that can fetch and reason about any URL, search across a growing knowledge base, and stream responses back to users in real time.

What You'll Build

A Next.js App Router chat application with:

  • A streaming chat UI using Vercel AI SDK's useChat hook
  • Two AI tools: scrapeUrl and searchKnowledge
  • A route handler that streams tool calls and results
  • Automatic knowledge base growth as users scrape URLs

By the end, users can paste a URL into the chat, ask questions about it, and have the AI answer with live data — not stale training data.

Project Setup

npx create-next-app@latest web-aware-chat --typescript --app --tailwind
cd web-aware-chat
npm install ai @ai-sdk/openai @knowledgesdk/node zod

Set your environment variables in .env.local:

OPENAI_API_KEY=sk-...
KNOWLEDGESDK_API_KEY=sk_ks_...

Defining the KnowledgeSDK Tools

The Vercel AI SDK uses Zod schemas to define tool parameters. Create lib/tools.ts:

import { tool } from 'ai';
import { z } from 'zod';
import KnowledgeSDK from '@knowledgesdk/node';

const ks = new KnowledgeSDK({ apiKey: process.env.KNOWLEDGESDK_API_KEY! });

export const scrapeUrl = tool({
  description:
    'Fetch the content of a URL and return it as clean markdown. ' +
    'Use this when the user provides a URL or asks you to read a webpage.',
  parameters: z.object({
    url: z.string().url().describe('The URL to scrape'),
    reason: z
      .string()
      .optional()
      .describe('Why you are scraping this URL — for transparency to the user'),
  }),
  execute: async ({ url }) => {
    const result = await ks.scrape({ url });
    return {
      url,
      markdown: result.markdown,
      title: result.title ?? url,
      wordCount: result.markdown.split(/\s+/).length,
    };
  },
});

export const searchKnowledge = tool({
  description:
    'Search across all previously scraped web content using semantic search. ' +
    'Use this when the user asks a question that might be answered by content ' +
    'that has been scraped in this session or previously.',
  parameters: z.object({
    query: z.string().describe('The search query in natural language'),
    limit: z
      .number()
      .min(1)
      .max(10)
      .optional()
      .describe('Number of results to return (defaults to 5)'),
  }),
  execute: async ({ query, limit = 5 }) => {
    const results = await ks.search({ query, limit });
    return {
      query,
      results: results.hits.map((hit) => ({
        url: hit.url,
        title: hit.title,
        excerpt: hit.content.slice(0, 500),
        score: hit.score,
      })),
    };
  },
});

export const extractStructured = tool({
  description:
    'Extract structured data from a URL using AI. Use this when the user ' +
    'asks to pull specific fields from a webpage, like pricing, team members, or features.',
  parameters: z.object({
    url: z.string().url().describe('The URL to extract from'),
    fields: z
      .array(z.string())
      .describe('List of field names to extract, e.g. ["price", "features", "teamSize"]'),
  }),
  execute: async ({ url, fields }) => {
    const schema = Object.fromEntries(fields.map((f) => [f, 'string']));
    const result = await ks.extract({ url, schema });
    return { url, data: result.data };
  },
});

The Route Handler

Create app/api/chat/route.ts:

import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { scrapeUrl, searchKnowledge, extractStructured } from '@/lib/tools';

export const maxDuration = 60; // Allow up to 60s for tool execution

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4o'),
    system: `You are a helpful research assistant with live web access.

When a user provides a URL, proactively scrape it to give them accurate, current information.
When a user asks a question about something you may have already scraped, search the knowledge base first.
Always cite the source URL when providing information from scraped content.
If you scraped content that is too long to include in full, summarize the most relevant parts.`,
    messages,
    tools: {
      scrapeUrl,
      searchKnowledge,
      extractStructured,
    },
    maxSteps: 5, // Allow multi-step tool use
  });

  return result.toDataStreamResponse();
}

The maxSteps: 5 setting is important. It allows the model to make multiple tool calls in sequence — for example: search the knowledge base, find a relevant URL, scrape it for more detail, then answer. Without this, the model is limited to one tool call per response.
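What maxSteps bounds can be sketched as a plain loop. This is an illustrative simplification with made-up names (runAgentLoop, ModelTurn, the stubbed fakeModel), not the AI SDK's actual implementation: the model either requests a tool call, whose result is fed back into the transcript, or produces a final answer, and maxSteps caps how many of those turns are allowed.

```typescript
// Sketch of the loop that maxSteps bounds. All names here are illustrative;
// the real loop lives inside streamText.
type ToolCall = { tool: string; args: Record<string, unknown> };
type ModelTurn = { toolCall?: ToolCall; text?: string };

function runAgentLoop(
  // Stand-in for the model: given the transcript, request a tool or answer.
  model: (transcript: string[]) => ModelTurn,
  tools: Record<string, (args: Record<string, unknown>) => string>,
  maxSteps: number
): string {
  const transcript: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(transcript);
    if (turn.toolCall) {
      // Execute the tool, append its result, and let the model continue.
      const result = tools[turn.toolCall.tool](turn.toolCall.args);
      transcript.push(`tool:${turn.toolCall.tool} -> ${result}`);
      continue;
    }
    return turn.text ?? '';
  }
  return '[stopped: maxSteps reached]';
}

// A fake model that searches first, then scrapes, then answers.
const fakeModel = (t: string[]): ModelTurn =>
  t.length === 0
    ? { toolCall: { tool: 'searchKnowledge', args: { query: 'pricing' } } }
    : t.length === 1
    ? { toolCall: { tool: 'scrapeUrl', args: { url: 'https://example.com' } } }
    : { text: 'Answer based on 2 tool results' };

const answer = runAgentLoop(
  fakeModel,
  { searchKnowledge: () => 'no hits', scrapeUrl: () => 'page content' },
  5
);
```

With maxSteps: 1, the same fake model would spend its only step on the search call and never reach an answer, which is exactly the limitation the paragraph above describes.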

The Chat UI

Create app/page.tsx:

'use client';

import { useChat } from 'ai/react';
import { useState } from 'react';

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit, append, isLoading } =
    useChat({ api: '/api/chat' });
  const [pendingUrl, setPendingUrl] = useState('');

  const handleUrlScrape = () => {
    if (!pendingUrl) return;
    // Send a user message directly instead of going through the input field
    append({ role: 'user', content: `Please read and summarize: ${pendingUrl}` });
    setPendingUrl('');
  };

  return (
    <div className="flex flex-col h-screen max-w-3xl mx-auto p-4">
      <h1 className="text-2xl font-bold mb-4">Web-Aware Chat</h1>

      {/* URL Quick-Scrape Bar */}
      <div className="flex gap-2 mb-4">
        <input
          type="url"
          placeholder="Paste a URL to add to knowledge base..."
          value={pendingUrl}
          onChange={(e) => setPendingUrl(e.target.value)}
          className="flex-1 border rounded px-3 py-2 text-sm"
        />
        <button
          onClick={handleUrlScrape}
          className="px-4 py-2 bg-blue-600 text-white rounded text-sm"
        >
          Scrape
        </button>
      </div>

      {/* Message Thread */}
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((message) => (
          <div key={message.id}>
            {message.role === 'user' ? (
              <div className="bg-blue-50 rounded-lg p-3 ml-8">
                <p className="text-sm font-medium text-blue-900">You</p>
                <p className="text-sm mt-1">{message.content}</p>
              </div>
            ) : (
              <div className="bg-white border rounded-lg p-3 mr-8">
                <p className="text-sm font-medium text-gray-900">Assistant</p>
                {/* Tool call indicators */}
                {message.toolInvocations?.map((tool) => (
                  <div
                    key={tool.toolCallId}
                    className="text-xs text-gray-400 mt-1 flex items-center gap-1"
                  >
                    <span>
                      {tool.toolName === 'scrapeUrl' && `Scraping ${tool.args?.url}...`}
                      {tool.toolName === 'searchKnowledge' && `Searching: "${tool.args?.query}"...`}
                      {tool.toolName === 'extractStructured' && `Extracting from ${tool.args?.url}...`}
                    </span>
                    {tool.state === 'result' && <span className="text-green-500">Done</span>}
                  </div>
                ))}
                <p className="text-sm mt-1 whitespace-pre-wrap">{message.content}</p>
              </div>
            )}
          </div>
        ))}
        {isLoading && (
          <div className="text-sm text-gray-400 animate-pulse">Thinking...</div>
        )}
      </div>

      {/* Input */}
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask anything, or paste a URL..."
          className="flex-1 border rounded px-3 py-2"
          disabled={isLoading}
        />
        <button
          type="submit"
          disabled={isLoading}
          className="px-4 py-2 bg-gray-900 text-white rounded disabled:opacity-50"
        >
          Send
        </button>
      </form>
    </div>
  );
}

Streaming Tool Results to the UI

The Vercel AI SDK streams tool calls incrementally. The message.toolInvocations array on each assistant message contains:

  • state: 'partial-call' — the tool's arguments are still streaming in
  • state: 'call' — the arguments are complete and the tool is executing
  • state: 'result' — the tool has executed and returned results

You can use these states to build rich loading indicators. For example, show a spinner while state === 'call' and a checkmark when state === 'result':

{tool.state === 'call' && <Spinner size="xs" />}
{tool.state === 'result' && <CheckIcon className="text-green-500" />}

The full tool result is available as tool.result when the state is 'result'. You can display scraped content inline, show search result cards, or render extracted data as a table.
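For example, a small helper could turn a searchKnowledge result into plain display strings for result cards. formatSearchHits is a hypothetical name, not part of either SDK; it assumes the hit shape returned by the searchKnowledge tool defined earlier:

```typescript
// Hypothetical display helper: sorts hits by relevance and formats one
// line per result card. Not part of KnowledgeSDK or the Vercel AI SDK.
type SearchHit = { url: string; title: string; excerpt: string; score: number };

function formatSearchHits(hits: SearchHit[]): string[] {
  return hits
    .slice() // avoid mutating the tool result
    .sort((a, b) => b.score - a.score) // highest relevance first
    .map((h) => `${h.title} [${(h.score * 100).toFixed(0)}%] ${h.url}`);
}

const lines = formatSearchHits([
  { url: 'https://a.dev', title: 'A', excerpt: '...', score: 0.72 },
  { url: 'https://b.dev', title: 'B', excerpt: '...', score: 0.91 },
]);
```

You would render each string (or a richer card built the same way) inside the tool.state === 'result' branch of the message loop.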

Adding Knowledge Base Growth

Every scrape call automatically adds content to the KnowledgeSDK knowledge base tied to your API key. This means searches in future conversations can find content from previous ones.

To display what's in the knowledge base, add an endpoint:

// app/api/knowledge/route.ts
import KnowledgeSDK from '@knowledgesdk/node';

const ks = new KnowledgeSDK({ apiKey: process.env.KNOWLEDGESDK_API_KEY! });

export async function GET() {
  // Search with a broad query to see recent additions
  const results = await ks.search({ query: '*', limit: 20 });
  return Response.json({ items: results.hits });
}

Show this as a sidebar panel so users can see their accumulated knowledge base.

Handling Long Pages

Some pages are too long to fit into a single context window. KnowledgeSDK's scrape endpoint returns the full content, but you may want to truncate before passing to the model.

Update the scrapeUrl tool's execute function:

execute: async ({ url }) => {
  const result = await ks.scrape({ url });
  const MAX_CHARS = 8000; // ~2000 tokens
  const truncated = result.markdown.length > MAX_CHARS;
  return {
    url,
    markdown: result.markdown.slice(0, MAX_CHARS),
    truncated,
    fullWordCount: result.markdown.split(/\s+/).length,
    title: result.title ?? url,
  };
},

The model will note in its response when content was truncated and offer to search for specific sections.
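A gentler variant cuts at a paragraph boundary instead of mid-sentence, so the model never sees a word sliced in half. A sketch, using a hypothetical truncateAtParagraph helper:

```typescript
// Hypothetical helper (not a KnowledgeSDK API): truncate markdown at the
// last blank-line paragraph break that fits within the character budget.
function truncateAtParagraph(markdown: string, maxChars: number): string {
  if (markdown.length <= maxChars) return markdown;
  const slice = markdown.slice(0, maxChars);
  const cut = slice.lastIndexOf('\n\n'); // last full paragraph boundary
  return cut > 0 ? slice.slice(0, cut) : slice; // fall back to a hard cut
}

const doc = 'Intro paragraph.\n\nSecond paragraph with details.\n\nThird.';
const short = truncateAtParagraph(doc, 30);
```

Swap this in for the plain slice(0, MAX_CHARS) call in the execute function above if you want cleaner truncation boundaries.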

Multi-Step Research Workflows

With maxSteps: 5, the model can chain tool calls to answer complex questions. For example:

User: "What are the differences between Anthropic's and OpenAI's API pricing?"

Model:

  1. Calls searchKnowledge — finds nothing relevant (first time asking)
  2. Calls scrapeUrl("https://anthropic.com/pricing") — scrapes Anthropic pricing
  3. Calls scrapeUrl("https://openai.com/pricing") — scrapes OpenAI pricing
  4. Synthesizes both into a comparison and answers

Asked a similar question a second time, the model finds cached results in step 1 and skips steps 2-3 entirely.

Performance Considerations

Streaming latency: The first token reaches the user before tool calls complete. The Vercel AI SDK sends tool call indicators immediately, so users see activity within ~100ms even if the scrape takes 3 seconds.

Parallel tool calls: GPT-4o and Claude 3.5 Sonnet support parallel tool calls. If the model decides to scrape two URLs, it may call the tool twice in parallel. The Vercel AI SDK handles this automatically.

Caching: KnowledgeSDK caches scrape results. If two users ask about the same URL within the cache TTL, the second request is nearly instant.

Rate limiting: For production apps with many concurrent users, add request queuing or rate limiting middleware. Use the Vercel AI SDK's onChunk callback to track concurrent tool calls.
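As a starting point, a per-key sliding-window limiter fits in a few lines. This in-memory sketch (allowRequest is a made-up name, not middleware from either SDK) only works on a single server instance; production apps would typically back it with a shared store such as Redis:

```typescript
// Minimal per-key sliding-window rate limiter. In-memory only, so it is
// per-instance: use a shared store (e.g. Redis) for multi-instance deploys.
const windows = new Map<string, number[]>();

function allowRequest(
  key: string, // e.g. user ID or IP address
  limit: number, // max requests allowed per window
  windowMs: number, // window length in milliseconds
  now = Date.now()
): boolean {
  // Keep only timestamps still inside the current window.
  const hits = (windows.get(key) ?? []).filter((t) => now - t < windowMs);
  if (hits.length >= limit) {
    windows.set(key, hits);
    return false; // over budget for this window
  }
  hits.push(now);
  windows.set(key, hits);
  return true;
}
```

You could call this at the top of the POST handler and return a 429 response when it refuses, before streamText ever runs.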

KnowledgeSDK vs. Direct Fetch for AI Apps

Consideration | Direct fetch() + cheerio | KnowledgeSDK
--- | --- | ---
JavaScript-rendered pages | Fails | Works
Anti-bot protection | Blocked | Bypassed
Clean markdown output | Manual parsing | Automatic
Semantic search over results | Build yourself | Built-in
Structured extraction | Prompt engineering | Schema-based API
Setup time | Days | Minutes
Maintenance | Ongoing | Zero

For a prototype, fetch() is fine. For production where you need reliable data from arbitrary URLs, KnowledgeSDK eliminates the infrastructure work.

Deploying to Vercel

vercel deploy

Set your environment variables in the Vercel dashboard under Settings → Environment Variables. The maxDuration = 60 export in the route handler requests a 60-second function timeout — make sure your Vercel plan supports it (Pro and above).

Optionally, you can switch the route to the Edge Runtime:

export const runtime = 'edge';
export const maxDuration = 60;

Edge functions support streaming natively and have lower cold start latency than Node.js functions.

FAQ

Does the AI always scrape when I send a URL? The model decides based on context. If you paste a URL and ask a question about it, the model will scrape. If you just mention a URL in passing without asking about its content, it may not. You can always be explicit: "Please scrape and read this URL: ..."

Can I limit which URLs the AI is allowed to scrape? Yes. Add URL validation in the tool's execute function before calling KnowledgeSDK:

execute: async ({ url }) => {
  const allowed = ['docs.mysite.com', 'api.mysite.com'];
  const hostname = new URL(url).hostname;
  // Exact match or subdomain match; a bare endsWith check would also
  // accept lookalikes such as "evildocs.mysite.com"
  if (!allowed.some((d) => hostname === d || hostname.endsWith('.' + d))) {
    return { error: 'URL not in allowed domains' };
  }
  // ... scrape
}

How do I prevent the knowledge base from growing indefinitely? KnowledgeSDK stores content per API key. You can delete specific items via the API or rotate to a new API key for each user/session.

Can I use Claude instead of GPT-4o? Yes. Swap the model:

import { anthropic } from '@ai-sdk/anthropic';
const result = streamText({ model: anthropic('claude-opus-4-6'), ... });

Claude has excellent tool-calling capabilities and a large context window for handling long scraped pages.

Is there a rate limit on tool calls per conversation? The Vercel AI SDK's maxSteps parameter limits sequential tool calls per request (default: 1, set to 5 or more for research workflows). KnowledgeSDK's rate limits apply per API key per minute.


Build your first web-aware AI chat app in under an hour. Get your KnowledgeSDK API key at knowledgesdk.com/setup.
