Extract
Fetch any URL and receive clean, LLM-ready markdown along with page metadata and extracted links. KnowledgeSDK handles JavaScript rendering, anti-bot protection, and HTML cleanup automatically.
/v1/extract
Authentication: x-api-key header
Request body
url (string, required): The URL to extract. Must be a valid HTTP or HTTPS URL.
Response
url (string): The URL that was extracted (after any redirects).
markdown (string): The page content converted to clean markdown. Navigation, footers, scripts, and styles are automatically stripped.
title (string | null): The page title, extracted from the <title> tag.
description (string | null): The page description, extracted from the <meta name="description"> or <meta property="og:description"> tag.
links (string[]): An array of absolute URLs found on the page (up to 500 links).
durationMs (number): The time, in milliseconds, that the extraction took to complete.
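The response fields listed above could be modeled on the client side roughly as follows. This is a minimal sketch, not an official SDK type: the class name `ExtractResult`, the method `from_json`, and the snake_case attribute `duration_ms` are hypothetical names chosen for illustration. Only the JSON keys come from this reference.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ExtractResult:
    """Hypothetical client-side model of the /v1/extract response."""

    url: str                    # final URL, after any redirects
    markdown: str               # cleaned page content
    title: Optional[str]        # documented as string | null
    description: Optional[str]  # documented as string | null
    links: List[str]            # absolute URLs (documented cap: 500)
    duration_ms: float          # maps the JSON key "durationMs"

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "ExtractResult":
        # Nullable fields are read with .get() so a missing key and an
        # explicit null both become None.
        return cls(
            url=payload["url"],
            markdown=payload["markdown"],
            title=payload.get("title"),
            description=payload.get("description"),
            links=list(payload.get("links") or []),
            duration_ms=payload["durationMs"],
        )
```

Treating `title` and `description` as optional mirrors the `string | null` types above, so callers are pushed to handle pages without a <title> or description tag.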
Code examples
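A request to this endpoint might look like the sketch below, using only the Python standard library. Several details are assumptions because this reference does not state them: the base URL (`https://api.example.com` is a placeholder), the HTTP method (POST is assumed since the endpoint takes a request body), and the JSON content type. The helper names `build_extract_request` and `extract` are hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder base URL; substitute the real API host.
BASE_URL = "https://api.example.com"


def build_extract_request(api_key: str, target_url: str) -> urllib.request.Request:
    """Build, but do not send, a request to /v1/extract."""
    # The reference requires a valid HTTP or HTTPS URL; this is only a
    # cheap client-side scheme check, not full URL validation.
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("url must be an HTTP or HTTPS URL")
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/extract",
        data=body,
        method="POST",  # assumption: the method is not stated in this reference
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/json",  # assumption: JSON request body
        },
    )


def extract(api_key: str, target_url: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_extract_request(api_key, target_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `extract("YOUR_API_KEY", "https://example.com")` would then return a dictionary with the response fields described above, provided the assumptions hold. Splitting request construction from sending keeps the request shape inspectable without network access.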
Example response
{
  "url": "https://example.com",
  "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples in documents...",
  "title": "Example Domain",
  "description": "Example Domain for documentation purposes.",
  "links": [
    "https://www.iana.org/domains/example"
  ],
  "durationMs": 1243
}

Need to extract AND make content searchable? Use /v1/business instead. It runs the full pipeline (extract, classify, index, and expose the content through semantic search) in a single API call.