knowledgesdk.com/glossary/webhook
Infrastructure & DevOps (beginner)

Also known as: HTTP callback, event webhook

Webhook

An HTTP callback that sends a real-time event notification from a server to a client-specified URL when a specific event occurs.

What Is a Webhook?

A webhook is an HTTP POST request that a server sends to a URL you specify whenever a particular event occurs. Instead of your application repeatedly asking "is it done yet?" (polling), the server calls you when it has something to report.

Webhooks are sometimes called "reverse APIs" because the flow is inverted: the API server initiates the request to your endpoint rather than the other way around.

Webhooks vs. Polling

Approach | How It Works                                  | Trade-offs
Polling  | Client repeatedly calls GET /v1/jobs/{jobId}  | Simple, but wastes requests and adds latency
Webhook  | Server POSTs to your URL when done            | Efficient and real-time, but requires a public endpoint

KnowledgeSDK supports both patterns. You can poll GET /v1/jobs/{jobId} for async extraction status, or supply a callbackUrl to POST /v1/extract/async and receive a webhook when the job completes.
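
For comparison with the webhook flow, here is a minimal sketch of the polling side. The helper poll_job and its fetch_status parameter are hypothetical: fetch_status stands in for whatever HTTP call you make to GET /v1/jobs/{jobId}, and the "failed" status value is an assumption (only "completed" appears in this entry).

```python
import time
from typing import Callable, Dict


def poll_job(
    fetch_status: Callable[[str], Dict],
    job_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status(job_id) until the job reaches a terminal status."""
    for _ in range(max_attempts):
        job = fetch_status(job_id)
        # Assumption: "completed" and "failed" are the terminal statuses.
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(interval_s)  # every wait here is latency a webhook would avoid
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")
```

Each loop iteration costs one request whether or not anything changed, which is the waste the table above refers to.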

How a KnowledgeSDK Webhook Works

  1. You call POST /v1/extract/async with a callbackUrl:
    {
      "url": "https://example.com",
      "callbackUrl": "https://yourapp.com/webhooks/knowledge"
    }
    
  2. KnowledgeSDK starts a background job and immediately returns a jobId.
  3. When extraction finishes, KnowledgeSDK POSTs the result to your callbackUrl:
    {
      "jobId": "job_abc123",
      "status": "completed",
      "result": { "title": "...", "content": "..." }
    }
    
  4. Your endpoint returns 200 OK to acknowledge receipt.
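
The receiving end of steps 3 and 4 could be sketched as follows, using only the Python standard library. The names handle_delivery, WebhookHandler, serve, and the in-memory received list are illustrative, not part of KnowledgeSDK; the payload fields are the ones shown in step 3.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # in-memory store, for illustration only


def handle_delivery(raw_body: bytes) -> int:
    """Parse a webhook body and return the HTTP status code to send back."""
    try:
        event = json.loads(raw_body)
    except ValueError:  # malformed JSON or invalid UTF-8
        return 400
    if not isinstance(event, dict) or "jobId" not in event:
        return 400
    received.append(event)
    return 200  # step 4: acknowledge receipt


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = handle_delivery(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block forever, accepting webhook deliveries on the given port."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts a local listener; in production the endpoint must be reachable at the public callbackUrl you passed in step 1.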

Securing Webhooks with HMAC

Anyone on the internet can POST to your webhook URL. To verify that a request genuinely came from KnowledgeSDK and was not tampered with, check the HMAC signature included in the X-KnowledgeSDK-Signature header. See the HMAC glossary entry for implementation details.
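
As a rough illustration of the check, the sketch below assumes the header carries a hex-encoded HMAC-SHA256 of the raw request body, keyed with a shared secret. That scheme is an assumption made for this example, as is the function name verify_signature; confirm the actual algorithm and encoding against the HMAC glossary entry before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, signature_header: str) -> bool:
    """Return True if signature_header matches the HMAC of raw_body.

    Assumption: the X-KnowledgeSDK-Signature value is a hex HMAC-SHA256 digest.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode(), (signature_header or "").encode())
```

Compute the digest over the raw bytes exactly as received; re-serializing parsed JSON can change whitespace or key order and break the comparison.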

Building a Reliable Webhook Receiver

  • Respond quickly. Return 200 OK as fast as possible — before doing any heavy processing. If your handler takes too long, the sender may time out and retry.
  • Process asynchronously. Enqueue the payload in your own job queue and process it in the background.
  • Handle duplicates. Webhook delivery is "at least once." The same event may arrive more than once. Use the jobId as an idempotency key to avoid double-processing.
  • Log raw payloads. Store the raw request body before parsing so you can replay events if your processing logic has a bug.
  • Return non-2xx only for genuine failures. A 500 tells the sender to retry, which is the right outcome when your endpoint failed to accept the event. Returning 200 for every request, even ones you could not accept, suppresses retries you actually want.
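
The practices above can be combined in one small handler. This is a sketch under stated assumptions: receive, raw_log, seen_job_ids, and work_queue are hypothetical names, and the in-memory list, set, and queue stand in for durable storage and a real job queue.

```python
import json
import queue

raw_log = []              # stand-in for durable raw-payload storage
seen_job_ids = set()      # stand-in for a persistent idempotency store
work_queue = queue.Queue()  # a background worker would consume from this


def receive(raw_body: bytes) -> int:
    """Accept one delivery and return the HTTP status code to send back."""
    raw_log.append(raw_body)  # log the raw body before parsing, for replay
    try:
        event = json.loads(raw_body)
    except ValueError:
        return 400  # could not accept the payload, so do not claim success
    job_id = event.get("jobId") if isinstance(event, dict) else None
    if not isinstance(job_id, str):
        return 400
    if job_id in seen_job_ids:
        return 200  # duplicate delivery: acknowledge, but do not re-enqueue
    seen_job_ids.add(job_id)
    work_queue.put(event)  # defer heavy processing so the response is fast
    return 200
```

The handler does no real work itself: it records, deduplicates, and enqueues, so the acknowledgement goes out quickly and a repeated delivery of the same jobId is harmless.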

Common Webhook Use Cases

  • Receiving completed AI extraction results from POST /v1/extract/async
  • Getting notified when a scheduled scrape job finishes
  • Syncing knowledge base updates to a downstream database or search index
  • Triggering Slack or email notifications when a job fails

Related Terms

HMAC (Infrastructure & DevOps, intermediate)
Hash-based Message Authentication Code: a cryptographic signature used to verify that webhook payloads are authentic and untampered.

Idempotency (Infrastructure & DevOps, intermediate)
The property of an API operation where making the same request multiple times produces the same result as making it once.

Async API (Infrastructure & DevOps, intermediate)
An API design pattern where long-running operations return a job ID immediately and deliver results via polling or webhook when complete.

Background Job (Infrastructure & DevOps, beginner)
An asynchronous task that runs independently of the main request-response cycle, allowing long-running operations like web extraction to run without blocking.