MCP (Model Context Protocol)

Connect the Capture API to any MCP-compatible AI client (Claude, Cursor, Codex, OpenCode, VS Code, and more) using the Model Context Protocol. Generate screenshots and PDFs, extract content, and drive browser sessions directly from AI conversations.

What is MCP?

The Model Context Protocol (MCP) is an open protocol that lets AI clients call external tools. Through it, any MCP-compatible client can use Capture's API to take screenshots, generate PDFs, extract content, and drive interactive browser sessions from within a conversation.

Capture runs a remote MCP server over Streamable HTTP, so it works with any client that supports remote MCP servers — including Claude, Cursor, Codex, OpenCode, VS Code, and others.

Connection Details

Most MCP clients only need two things:

Field	Value
Server URL	`https://capture.page/mcp/v1`
Transport	Streamable HTTP
Auth header	`Authorization: Bearer YOUR_TOKEN_HERE`
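
To make these values concrete, the sketch below builds (but does not send) the first HTTP request of an MCP session over Streamable HTTP. The `initialize` method, the `Accept` header, and the protocol version string are taken from the public MCP specification, not from Capture's documentation. `build_initialize_request` and the `clientInfo` values are hypothetical names used only for this example.

```python
import json

SERVER_URL = "https://capture.page/mcp/v1"


def build_initialize_request(token: str) -> dict:
    """Return the URL, headers, and JSON body of an MCP initialize request.

    Nothing is sent over the network; pass the result to any HTTP client.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer with plain JSON or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: a spec revision that defines Streamable HTTP.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return {"url": SERVER_URL, "headers": headers, "body": json.dumps(body)}
```

In practice your MCP client performs this handshake for you; the sketch only shows where the Server URL and Authorization header end up.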

Get Your Bearer Token

Visit the MCP Integration page in your Capture dashboard to get your auto-generated Bearer token. This token authenticates your MCP connection and combines your API key and secret.

Setup

Config formats differ between clients — use the example that matches yours and replace YOUR_TOKEN_HERE with your Bearer token. Any client not shown here can be configured with the Server URL and Authorization header from the table above, following its own MCP docs.

JSON config (Claude, Cursor, VS Code, and similar)

Paste this into your client's MCP config file (the exact location varies by client):

{
  "mcpServers": {
    "capture": {
      "type": "http",
      "url": "https://capture.page/mcp/v1",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN_HERE"
      }
    }
  }
}
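
If you would rather not paste the token by hand, a small script can render the same JSON from an environment variable. This is a minimal sketch: `capture_mcp_config` and the `CAPTURE_MCP_TOKEN` variable name are hypothetical and not part of Capture's tooling.

```python
import json
import os


def capture_mcp_config(token: str) -> str:
    """Render the JSON config shown above with the given Bearer token."""
    config = {
        "mcpServers": {
            "capture": {
                "type": "http",
                "url": "https://capture.page/mcp/v1",
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # CAPTURE_MCP_TOKEN is a hypothetical variable name chosen for this example.
    print(capture_mcp_config(os.environ.get("CAPTURE_MCP_TOKEN", "YOUR_TOKEN_HERE")))
```

Redirect the output into your client's MCP config file, or merge the `capture` entry into an existing `mcpServers` object.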

Codex

Add this to ~/.codex/config.toml:

[mcp_servers.capture]
url = "https://capture.page/mcp/v1"
http_headers = { Authorization = "Bearer YOUR_TOKEN_HERE" }

After saving the configuration, restart your client (or start a new conversation) for the Capture tools to appear.

Available Tools

Once configured, you'll have access to these Capture tools directly from your AI assistant:

capture_screenshot

Capture screenshots of any website with full customization options:

Full-page screenshots
Device emulation (iPhone, iPad, etc.)
Dark mode support
Block ads and cookie banners
Custom viewport sizes
Element selection
And more
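
At the protocol level, the client turns a request for these options into an MCP `tools/call` message. The sketch below shows the general shape of such a message; the same helper works for the other capture tools. The `tools/call` envelope comes from the MCP specification, while the argument names `url`, `full`, and `darkMode` are assumptions made for illustration. The client learns the real argument schema from the server when it lists the available tools.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the given tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# Roughly what a client might send for a full-page, dark-mode screenshot.
# Assumption: "url", "full", and "darkMode" are illustrative argument names.
request = build_tool_call(
    "capture_screenshot",
    {"url": "https://news.ycombinator.com", "full": True, "darkMode": True},
)
```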

Example usage:

"Take a screenshot of example.com"
"Capture a full-page screenshot of news.ycombinator.com in dark mode"

capture_pdf

Generate PDFs from any website with custom settings:

Custom page sizes (A4, Letter, Legal, etc.)
Margin control
Landscape/portrait orientation
Print background graphics
Custom scaling

Example usage:

"Generate a PDF of this article"
"Create a PDF of example.com with A4 size"

capture_content

Extract readable content from any website:

Cleaned text content
Markdown output
Raw HTML when needed
Useful for content analysis and web scraping

Example usage:

"Extract the content from this blog post"
"Get the text from example.com"

capture_metadata

Extract metadata from any website:

Title and description
Open Graph tags
Author and publisher information
Useful for SEO analysis

Example usage:

"Extract metadata from apple.com"
"Get the SEO information for this website"

Browser sessions (interactive automation)

For tasks that need more than a single capture — logging in, filling forms, clicking through pages, or reading content that only appears after interaction — Capture exposes a stateful browser session that your AI assistant can drive step by step.

A session keeps a real browser open and is billed by duration: 1 credit per minute the session stays open (rounded up), charged when the session closes or expires. For that reason, the assistant is instructed to close the session as soon as the task is done.
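
The billing rule above amounts to a ceiling division of the open time by 60 seconds. The helper below is a minimal sketch of that arithmetic; `session_credits` is a hypothetical name, and since the text does not say how a zero-length session or any minimum charge is handled, treat the result as an estimate.

```python
import math


def session_credits(open_seconds: float) -> int:
    """Estimate credits for one session: 1 per minute open, rounded up."""
    if open_seconds < 0:
        raise ValueError("open_seconds must be non-negative")
    return math.ceil(open_seconds / 60)
```

For example, a session open for 61 seconds costs 2 credits, and one that runs for the full 900 seconds costs 15.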

browser_session_create

Start an interactive browser session. Returns a sessionId and expiresAt. When cdp: true is set, it also returns a connectUrl for external CDP clients.

Optional maxTtlSeconds (max 900) to cap the session lifetime
Optional proxy to route through your configured proxy
Optional bypassBotDetection to use a stealth browser
Optional cdp to expose a Chrome DevTools Protocol connection URL

Use cdp: true when an external automation client needs to attach directly to the browser. Pass the returned connectUrl to Puppeteer's connect or Playwright's connectOverCDP, then close the Capture session with browser_session_close when finished. Disconnecting the CDP client does not close the Capture session, so it keeps running (and billing) until it is closed or expires.

cdp cannot be combined with proxy or bypassBotDetection.
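
The constraints above can be checked on the client side before a session is requested. The following is a sketch under stated assumptions: `session_create_args` is a hypothetical helper, the three flags are assumed to be plain booleans, and the lower bound of 1 second is an assumption (only the 900-second maximum is documented).

```python
from typing import Optional

MAX_TTL_SECONDS = 900  # documented ceiling for maxTtlSeconds


def session_create_args(
    max_ttl_seconds: Optional[int] = None,
    proxy: bool = False,
    bypass_bot_detection: bool = False,
    cdp: bool = False,
) -> dict:
    """Assemble browser_session_create arguments, rejecting invalid combinations."""
    if cdp and (proxy or bypass_bot_detection):
        raise ValueError("cdp cannot be combined with proxy or bypassBotDetection")
    args: dict = {}
    if max_ttl_seconds is not None:
        if not 1 <= max_ttl_seconds <= MAX_TTL_SECONDS:
            raise ValueError(f"maxTtlSeconds must be between 1 and {MAX_TTL_SECONDS}")
        args["maxTtlSeconds"] = max_ttl_seconds
    # Assumption: the flags are booleans; only the ones that are set get sent.
    for key, enabled in (
        ("proxy", proxy),
        ("bypassBotDetection", bypass_bot_detection),
        ("cdp", cdp),
    ):
        if enabled:
            args[key] = True
    return args
```

Calling `session_create_args(max_ttl_seconds=300, cdp=True)` yields `{"maxTtlSeconds": 300, "cdp": True}`, while combining `cdp=True` with `proxy=True` raises `ValueError`.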

browser_session_act

Run an action inside an open session. Takes the sessionId, an action type, and a payload.

Navigation: goto, back, forward, reload
Interaction: click, type, type_text, fill, select, check, press, hover, scroll, move_mouse, drag_mouse
Reading: snapshot, content, query, title, url, console, errors
Waiting: wait_for_selector, wait_for_timeout
Capture: screenshot (returned as an inline image; any action can set payload.screenshot: true to attach one)
Batching: batch (runs several actions in one round-trip via payload.actions)
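
To illustrate how a multi-step task maps onto these actions, the sketch below assembles one batch call for a login flow. Only `sessionId`, `payload`, and `payload.actions` come from the text above. The `action` key, the shape of each batch entry, and the `url`, `selector`, and `value` payload keys are assumptions made for illustration, as are the helper names.

```python
from typing import Optional


def act_args(session_id: str, action: str, payload: Optional[dict] = None) -> dict:
    """Arguments for a single browser_session_act call."""
    return {"sessionId": session_id, "action": action, "payload": payload or {}}


def batch_args(session_id: str, steps: list) -> dict:
    """Bundle (action, payload) pairs into one batch call."""
    # Assumption: each batch entry repeats the action/payload shape.
    actions = [{"action": action, "payload": payload} for action, payload in steps]
    return act_args(session_id, "batch", {"actions": actions})


login = batch_args(
    "SESSION_ID",
    [
        ("goto", {"url": "https://example.com/login"}),
        ("fill", {"selector": "#email", "value": "user@example.com"}),
        ("click", {"selector": "button[type=submit]"}),
        ("snapshot", {}),
    ],
)
```

Batching keeps the number of round-trips low, which matters because the session is billed for as long as it stays open.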

goto, screenshot, and action payloads with screenshot: true can include viewport, vw / vh, scaleFactor / deviceScaleFactor, or emulateDevice to update the live session viewport before the action. This is stateful: later actions keep using that viewport until another action changes it.

Action results include expiresInSeconds when the session exists, letting the assistant renew, finish, or close the session before the TTL is reached.

A few payload rules apply to specific actions:

query: use payload.fields for built-in fields and payload.attributes for arbitrary DOM attributes; requested attributes are returned under each element's nested attributes object. payload.limit is capped at 500.
click and hover (by selector): results include matched-element diagnostics such as visible, elementText, boundingBox, and the interaction point.
Navigation actions: set payload.waitUntil to load, domcontentloaded, or commit; the default is domcontentloaded.
wait_for_timeout: payload.timeoutMs is capped at 2500 ms.
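
The limits above can be applied on the client side before a payload is sent. This is a sketch, not Capture's own validation: the helper names and the `selector` and `url` keys are hypothetical, and the text does not say whether the server clamps or rejects out-of-range values, so the sketch simply clamps.

```python
QUERY_LIMIT_CAP = 500        # documented cap for payload.limit
WAIT_TIMEOUT_CAP_MS = 2500   # documented cap for payload.timeoutMs
WAIT_UNTIL_VALUES = ("load", "domcontentloaded", "commit")


def query_payload(selector: str, fields=(), attributes=(), limit: int = 50) -> dict:
    """Payload for a query action, with limit clamped to the documented cap."""
    return {
        "selector": selector,  # assumption: key name chosen for illustration
        "fields": list(fields),
        "attributes": list(attributes),
        "limit": min(limit, QUERY_LIMIT_CAP),
    }


def navigation_payload(url: str, wait_until: str = "domcontentloaded") -> dict:
    """Payload for goto, rejecting waitUntil values the text does not list."""
    if wait_until not in WAIT_UNTIL_VALUES:
        raise ValueError(f"waitUntil must be one of {WAIT_UNTIL_VALUES}")
    return {"url": url, "waitUntil": wait_until}


def wait_steps(total_ms: int) -> list:
    """Split a long wait into wait_for_timeout payloads of at most 2500 ms each."""
    steps = []
    remaining = total_ms
    while remaining > 0:
        chunk = min(remaining, WAIT_TIMEOUT_CAP_MS)
        steps.append({"timeoutMs": chunk})
        remaining -= chunk
    return steps
```

Where possible, prefer wait_for_selector over chained timeouts, since a fixed wait keeps the billed session open whether or not the page is ready.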

For content, set payload.format to text, markdown, or html. The default is cleaned text; request html only when the raw page source is needed.

Example usage:

"Log into example.com with these credentials and take a screenshot of the dashboard"
"Open this product page, accept the cookie banner, and extract the price"

Usage Tips

Ask your AI assistant naturally; it will know when to use Capture tools
All standard Capture options are supported through the MCP tools
Screenshots and PDFs are automatically uploaded to your Capture CDN
Credits are deducted from your Capture account as normal

Troubleshooting

Connection Failed

If your client reports that it failed to connect:

Verify the Server URL is https://capture.page/mcp/v1 and the transport is Streamable HTTP ("type": "http" in JSON configs)
Verify your Bearer token is correct
Ensure you've restarted your client after configuration
Check that you have an active internet connection

Authentication Errors

If you receive authentication errors:

Regenerate your Bearer token from the MCP Integration page
Update your configuration with the new token
Restart your client

Learn More

For more details about Capture's features and options:
