Most web monitoring pipelines are built on polling: a cron job runs every hour, fetches a URL, compares it to the last version, and triggers downstream logic if something changed. It works, but it is wasteful. For every actual change you detect, you might run 23 fetches that return identical content (hourly polling against a page that changes once a day).
The webhook model inverts this. Instead of your system constantly asking "did it change?", the monitoring system tells you when it does. Your pipeline runs only on actual changes. Here is how to build it.
Why Polling Is the Wrong Default
Consider monitoring 50 competitor pages at hourly intervals. That is 1,200 fetches per day. At a typical API cost of $0.001 per request, you are spending $1.20/day — $36/month — on network requests, most of which confirm that nothing changed.
For content that changes infrequently (documentation, pricing pages, feature announcements), the signal-to-noise ratio of polling is very low: most fetches carry no new information.
The larger problem for AI workflows: polling on a schedule means your LLM processes stale or redundant data. You either update the agent's knowledge base too often (expensive) or too rarely (stale retrieval results).
The Webhook Alternative
None of the major web data tools — Tavily, Exa, Bright Data, ZenRows, Diffbot — offers developer-facing webhooks for content change detection. There is no "register a URL and get notified when it changes" API; you typically have to build the diffing logic yourself on top of their scraping products.
KnowledgeSDK includes this as a first-class feature: POST /v1/webhooks registers a URL for monitoring. When the extracted content of that URL changes meaningfully, KnowledgeSDK sends a POST to your callback URL with the changed content and metadata.
The flow:
Register URL → KnowledgeSDK monitors → Content changes → POST to your callback URL → Your LLM workflow runs
Building the System
Step 1: Register URLs for Monitoring
import KnowledgeSDK from "@knowledgesdk/node";
const client = new KnowledgeSDK({ apiKey: process.env.KNOWLEDGESDK_API_KEY });
const pagesToMonitor = [
"https://competitorA.com/pricing",
"https://competitorA.com/changelog",
"https://competitorB.com/pricing",
"https://yourmarketleader.com/blog",
];
async function registerMonitoring(urls: string[]) {
const webhooks = await Promise.all(
urls.map((url) =>
client.webhooks.create({
url,
callbackUrl: `${process.env.YOUR_APP_URL}/webhooks/content-changed`,
events: ["content.changed"],
})
)
);
console.log(`Monitoring ${webhooks.length} URLs`);
return webhooks;
}
await registerMonitoring(pagesToMonitor);
from knowledgesdk import KnowledgeSDK
import os
client = KnowledgeSDK(api_key=os.environ["KNOWLEDGESDK_API_KEY"])
pages_to_monitor = [
"https://competitorA.com/pricing",
"https://competitorA.com/changelog",
"https://competitorB.com/pricing",
"https://yourmarketleader.com/blog",
]
webhooks = [
client.webhooks.create(
url=url,
callback_url=f"{os.environ['YOUR_APP_URL']}/webhooks/content-changed",
events=["content.changed"],
)
for url in pages_to_monitor
]
print(f"Monitoring {len(webhooks)} URLs")
Step 2: Handle the Webhook Payload
KnowledgeSDK sends a POST request to your callback URL when content changes. The payload includes the URL, the event type, and the new content.
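An illustrative payload, matching the fields the handler below destructures (the exact schema, including the timestamp field, is an assumption — check the KnowledgeSDK webhook docs for the authoritative shape):

```json
{
  "event": "content.changed",
  "url": "https://competitorA.com/pricing",
  "content": "Pro plan: $49/month ...",
  "previousContent": "Pro plan: $39/month ...",
  "timestamp": "2025-01-15T09:30:00Z"
}
```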
import express from "express";
import Anthropic from "@anthropic-ai/sdk";
import KnowledgeSDK from "@knowledgesdk/node";
const app = express();
app.use(express.json());
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const client = new KnowledgeSDK({ apiKey: process.env.KNOWLEDGESDK_API_KEY });
app.post("/webhooks/content-changed", async (req, res) => {
// Acknowledge receipt immediately
res.status(200).json({ received: true });
const { url, event, content, previousContent } = req.body;
if (event !== "content.changed") return;
console.log(`Content changed: ${url}`);
// Run your LLM workflow only on actual changes
await processChange({ url, content, previousContent });
});
async function processChange({
url,
content,
previousContent,
}: {
url: string;
content: string;
previousContent: string;
}) {
// Step 1: Re-index in your knowledge base
await client.extract(url);
// Step 2: Use LLM to summarize what changed and why it matters
const response = await anthropic.messages.create({
model: "claude-opus-4-6",
max_tokens: 500,
messages: [
{
role: "user",
content: `A competitor page changed. Summarize the key differences and their business significance.
URL: ${url}
Previous content:
${previousContent.slice(0, 2000)}
New content:
${content.slice(0, 2000)}
What changed and why does it matter?`,
},
],
});
const summary = response.content[0].type === "text" ? response.content[0].text : "";
// Step 3: Store or send the intelligence digest
await sendAlert({
url,
summary,
timestamp: new Date().toISOString(),
});
}
async function sendAlert(data: { url: string; summary: string; timestamp: string }) {
// Send to Slack, email, or your internal system
console.log("Change detected:", data);
}
app.listen(3000, () => console.log("Webhook server running on port 3000"));
Step 3: Make It Production-Ready
For production use, add request validation and ensure your webhook handler is resilient:
import crypto from "crypto";
function verifyWebhookSignature(body: string, signature: string, secret: string): boolean {
const expected = crypto.createHmac("sha256", secret).update(body).digest("hex");
// timingSafeEqual throws if the buffers differ in length, so check that first
if (signature.length !== expected.length) return false;
return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected));
}
// Register this route before any global app.use(express.json()) — the JSON
// parser consumes the stream, and signature verification needs the raw body
app.post("/webhooks/content-changed", express.raw({ type: "application/json" }), async (req, res) => {
const signature = req.headers["x-knowledgesdk-signature"] as string;
if (!verifyWebhookSignature(req.body.toString(), signature, process.env.WEBHOOK_SECRET!)) {
return res.status(401).json({ error: "Invalid signature" });
}
res.status(200).json({ received: true });
const payload = JSON.parse(req.body.toString());
// Process asynchronously — do not block the webhook response
setImmediate(() => processChange(payload));
});
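If your handler runs in Python instead, the same verification can be sketched with the standard library alone (the header name and hex encoding are assumed to match the example above):

```python
import hashlib
import hmac

def verify_webhook_signature(body: bytes, signature: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw request body and compare in
    # constant time; compare_digest handles mismatched lengths safely
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)
```

Unlike Node's `crypto.timingSafeEqual`, `hmac.compare_digest` does not throw on inputs of different lengths, so no explicit length check is needed.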
Use Cases Where This Matters
Competitor pricing changes. When a competitor updates their pricing page, your AI agent gets new information within minutes — not at the next scheduled poll. Your competitive intelligence is always current.
Documentation updates. When a vendor updates their API documentation, your customer-facing agent that references those docs gets re-indexed automatically. No manual update cycle.
Job posting changes. When a company adds or removes job listings, your recruitment intelligence tool receives the update immediately and can generate a digest of hiring signal changes.
News and blog monitoring. Instead of polling 20 competitor blogs hourly, register them for webhook monitoring. Your summary agent runs only when new content actually appears.
The Cost Efficiency Argument
With polling at hourly intervals:
- 50 pages × 24 polls/day = 1,200 API calls/day
- At $0.001/call: $1.20/day = $36/month
- Most pages change once per day or less → most calls return unchanged data
With webhook monitoring:
- 0 polling calls
- LLM workflow runs only when content actually changes
- For a page that changes once per week: 4 LLM calls/month vs 720 polling calls/month
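The arithmetic above is easy to check with a quick script (the per-call price is the article's illustrative figure, not real vendor pricing):

```python
PAGES = 50
POLLS_PER_DAY = 24
COST_PER_CALL = 0.001  # illustrative, dollars per request

calls_per_day = PAGES * POLLS_PER_DAY          # 1,200 API calls/day
cost_per_day = calls_per_day * COST_PER_CALL   # $1.20/day
cost_per_month = cost_per_day * 30             # $36/month

# Webhook side: a page that changes once a week triggers ~4 runs a month,
# versus 24 polls/day * 30 days = 720 polls for that same page
polls_per_page_month = POLLS_PER_DAY * 30
```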
The savings compound as you scale the number of monitored pages.
Summary
Polling is a reasonable default when you are starting out and do not know how frequently content changes. Once you have data on change frequency, switching to a webhook model is almost always more efficient.
The key requirements for a webhook-driven web monitoring system:
- A monitoring API that detects content changes (KnowledgeSDK webhooks)
- A webhook handler that validates the payload and triggers downstream logic
- An LLM workflow that processes the changed content
- Re-indexing so your search corpus stays current
All four of these components can be built and operational in under an hour. To get started, install the SDK:
npm install @knowledgesdk/node
pip install knowledgesdk