Overview

Learn what Capture browser sessions are, when to use them, and how to create a session before running browser actions.

Browser sessions give you a stateful cloud browser that stays alive across multiple API calls. Instead of asking Capture for one screenshot, PDF, or content extraction per request, you create a session, run actions against the same browser page, and close it when the workflow is finished.

Use sessions when a capture needs interaction or shared state: signing in, navigating through a multi-step flow, waiting for an application to update, querying page content after a click, or taking a screenshot after several browser actions.

How it works

1. Create a browser session with POST /v1/sessions.
2. Use the returned sessionId to execute actions with POST /v1/sessions/{sessionId}/actions.
3. Read session metadata when you need current state.
4. Close the session with DELETE /v1/sessions/{sessionId} when you are done.

Sessions close automatically when their TTL expires, but you should close them explicitly once your workflow is complete. Sessions are billed by duration, not per action: 1 credit per minute the session stays open (rounded up), charged when the session closes or expires. A session can stay open for up to 15 minutes, and you can run up to 5 sessions at once. Each action response for an existing session includes expiresInSeconds so clients can renew or close proactively before the TTL is reached.
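The billing and TTL rules above can be sketched as two small helpers. This is an illustration only, not Capture's billing code: the function names and the 30-second safety margin are assumptions, and the text does not say how a zero-length session is charged.

```javascript
// Illustrative helpers; these names are hypothetical, not part of a Capture SDK.
const MAX_SESSION_SECONDS = 15 * 60; // documented 15-minute ceiling

// Estimate credits for a session that stayed open for `openSeconds`:
// 1 credit per minute, rounded up.
function estimateCredits(openSeconds) {
  if (!Number.isFinite(openSeconds) || openSeconds < 0) {
    throw new RangeError("openSeconds must be a non-negative number");
  }
  // A session cannot outlive the 15-minute ceiling, so cap the estimate there.
  const capped = Math.min(openSeconds, MAX_SESSION_SECONDS);
  return Math.ceil(capped / 60);
}

// Decide whether to close (or wrap up) now, given expiresInSeconds from an
// action response. The default margin is an arbitrary assumption.
function shouldCloseSoon(expiresInSeconds, marginSeconds = 30) {
  return expiresInSeconds <= marginSeconds;
}
```

For example, a session closed after 61 seconds is estimated at 2 credits, which is one reason to close sessions explicitly instead of waiting for the TTL.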

Authentication

Session endpoints use Bearer authentication:

Authorization: Bearer <base64(userId:secret)>

The secret can be your primary Capture secret or one of your API secrets.
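As a minimal sketch, the header value can be built in Node.js with the built-in Buffer class. The function name bearerHeader is hypothetical; the encoding follows the base64(userId:secret) format shown above.

```javascript
// Build the Authorization header value from a user ID and a secret.
// `bearerHeader` is an illustrative name, not a Capture SDK function.
function bearerHeader(userId, secret) {
  if (!userId || !secret) {
    throw new Error("userId and secret are required");
  }
  // base64-encode "userId:secret" using Node's built-in Buffer.
  const token = Buffer.from(`${userId}:${secret}`, "utf8").toString("base64");
  return `Bearer ${token}`;
}
```

Pass the result as the Authorization header on every session request.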

Example workflow

Create a session:

curl -X POST "https://api.capture.page/v1/sessions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"maxTtlSeconds":300}'

Navigate the browser:

curl -X POST "https://api.capture.page/v1/sessions/{sessionId}/actions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "goto",
    "payload": {
      "url": "https://example.com",
      "viewport": { "width": 1440, "height": 900 }
    }
  }'

Take a screenshot from the same session:

curl -X POST "https://api.capture.page/v1/sessions/{sessionId}/actions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "screenshot",
    "payload": {
      "fullPage": true,
      "vw": 1440,
      "vh": 900
    }
  }'

Batch multi-step workflows

For multi-step workflows, prefer the batch action instead of sending each step as a separate HTTP request. Batched actions run sequentially inside the same session, keep the same browser state, and reduce repeated request, authentication, routing, and network overhead.

curl -X POST "https://api.capture.page/v1/sessions/{sessionId}/actions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "batch",
    "payload": {
      "actions": [
        {
          "type": "goto",
          "payload": {
            "url": "https://example.com",
            "waitUntil": "domcontentloaded"
          }
        },
        {
          "type": "wait_for_selector",
          "payload": { "selector": "body", "visible": true }
        },
        {
          "type": "scroll",
          "payload": { "direction": "down", "amount": 500 }
        },
        { "type": "title" },
        {
          "type": "screenshot",
          "payload": { "fullPage": false }
        }
      ]
    }
  }'

Use separate action requests when you need to inspect an intermediate result before deciding the next step. Otherwise, batching is the recommended pattern for predictable multi-action automations.

Viewport and device emulation

Browser sessions use a live page viewport. Set or update the viewport on goto, screenshot, or actions that use payload.screenshot: true.

Viewport action options:

{
  "viewport": { "width": 1440, "height": 900 },
  "scaleFactor": 1
}

Equivalent shorthand:

{
  "vw": 1440,
  "vh": 900,
  "deviceScaleFactor": 1
}

Device emulation uses the same device keys returned by /screenshot/devices:

{
  "emulateDevice": "iphone_14"
}

When emulateDevice is present, it configures the viewport, user agent, touch support, and scale factor for that device and takes precedence over explicit viewport dimensions. Viewport changes are stateful: if an action changes the viewport, later actions use the new viewport until you change it again.

Limits: viewport width and height can be up to 5000 and must be provided together; scaleFactor / deviceScaleFactor can be up to 3. Invalid device keys are rejected.
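A client-side check of these limits might look like the sketch below. The function name is hypothetical, and the lower bounds (values must be positive) are an assumption, since the text only states the upper limits.

```javascript
const MAX_DIMENSION = 5000; // documented upper limit for width and height
const MAX_SCALE = 3;        // documented upper limit for the scale factor

// Return a list of problems; an empty list means the options look valid
// under the documented limits. Accepts both the long and shorthand forms.
function checkViewportOptions(options) {
  const problems = [];
  const width = options.viewport?.width ?? options.vw;
  const height = options.viewport?.height ?? options.vh;
  const scale = options.scaleFactor ?? options.deviceScaleFactor;

  // Width and height must be provided together.
  if ((width === undefined) !== (height === undefined)) {
    problems.push("width and height must be provided together");
  }
  for (const [name, value] of [["width", width], ["height", height]]) {
    if (value !== undefined && (!(value > 0) || value > MAX_DIMENSION)) {
      problems.push(`${name} must be a positive number no greater than ${MAX_DIMENSION}`);
    }
  }
  if (scale !== undefined && (!(scale > 0) || scale > MAX_SCALE)) {
    problems.push(`scale factor must be a positive number no greater than ${MAX_SCALE}`);
  }
  return problems;
}
```

The server remains the source of truth; a check like this only surfaces obvious mistakes before a request is sent.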

When the workflow is finished, close the session:

curl -X DELETE "https://api.capture.page/v1/sessions/{sessionId}" \
  -H "Authorization: Bearer <token>"
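The create, navigate, screenshot, and close steps can be combined into one script that always closes the session, even when an action fails. This is a sketch under stated assumptions: it relies on the global fetch in Node.js 18 or later, the helper names are hypothetical, and reading sessionId from the top level of the create response is an assumption about the response shape.

```javascript
const BASE_URL = "https://api.capture.page/v1";

// Small JSON helper around fetch. `fetchImpl` is injectable so the flow can
// be exercised without network access.
async function call(fetchImpl, token, method, path, body) {
  const res = await fetchImpl(`${BASE_URL}${path}`, {
    method,
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: body === undefined ? undefined : JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`${method} ${path} failed with status ${res.status}`);
  }
  // Tolerate empty response bodies (for example on DELETE).
  const text = await res.text();
  return text ? JSON.parse(text) : null;
}

// Create a session, navigate, take a screenshot, and always close the session.
async function captureHomepage(token, fetchImpl = fetch) {
  const created = await call(fetchImpl, token, "POST", "/sessions", {
    maxTtlSeconds: 300,
  });
  const sessionId = created.sessionId; // assumed location of the ID
  try {
    await call(fetchImpl, token, "POST", `/sessions/${sessionId}/actions`, {
      type: "goto",
      payload: {
        url: "https://example.com",
        viewport: { width: 1440, height: 900 },
      },
    });
    return await call(fetchImpl, token, "POST", `/sessions/${sessionId}/actions`, {
      type: "screenshot",
      payload: { fullPage: true },
    });
  } finally {
    // Closing in `finally` stops billing even if an action above throws.
    await call(fetchImpl, token, "DELETE", `/sessions/${sessionId}`);
  }
}
```

Putting the DELETE in a finally block keeps a failed action from leaving the session open until its TTL expires.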

Connect over CDP

When you need direct Chrome DevTools Protocol access, create the session with cdp: true. Capture returns a connectUrl that CDP clients can use to attach to the same cloud browser.

curl -X POST "https://api.capture.page/v1/sessions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"cdp":true,"maxTtlSeconds":300}'

Use the returned session.connectUrl with a CDP client such as Puppeteer or Playwright:

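// Puppeteer: attach to the session's browser over CDP.
import puppeteer from "puppeteer";

const browser = await puppeteer.connect({
  browserWSEndpoint: session.connectUrl,
});

// Playwright: the equivalent connection (use one client or the other).
import { chromium } from "playwright";

const browser = await chromium.connectOverCDP(session.connectUrl);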

connectUrl values are minted fresh and are only returned for active CDP-enabled sessions. If a client disconnects and you need to reconnect, fetch the session again while it is active:

curl "https://api.capture.page/v1/sessions/{sessionId}" \
  -H "Authorization: Bearer <token>"

Disconnecting a CDP client does not close the Capture session. Close the session with DELETE /v1/sessions/{sessionId} when the workflow is finished so billing stops promptly.
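A reconnect helper could fetch the session and read a fresh connectUrl, as sketched below. The function name is hypothetical, and the location of connectUrl in the response is an assumption: the sketch checks both the top level and a nested session object.

```javascript
// Fetch the session again and return a fresh connectUrl, or null when the
// response carries none (for example, the session is no longer active).
// `fetchImpl` is injectable for testing; it defaults to the global fetch.
async function getFreshConnectUrl(sessionId, token, fetchImpl = fetch) {
  const res = await fetchImpl(
    `https://api.capture.page/v1/sessions/${encodeURIComponent(sessionId)}`,
    { headers: { Authorization: `Bearer ${token}` } },
  );
  if (!res.ok) {
    throw new Error(`Session lookup failed with status ${res.status}`);
  }
  const body = await res.json();
  // Assumed response shapes: { connectUrl } or { session: { connectUrl } }.
  return body.connectUrl ?? body.session?.connectUrl ?? null;
}
```

Pass the returned URL to puppeteer.connect or chromium.connectOverCDP to reattach, and treat a null result as a signal to create a new session.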

CDP sessions can be combined with proxy: true to route both the Capture-owned page and CDP-created targets through your configured browser proxy:

curl -X POST "https://api.capture.page/v1/sessions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"cdp":true,"proxy":true,"maxTtlSeconds":300}'

CDP sessions cannot be combined with bypassBotDetection. If you need bot detection bypass, create a regular browser session and use Capture actions instead of a raw CDP connection.
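The option rules above can be checked before a session is created. In this sketch the function name is hypothetical, the default of 300 seconds is taken from the examples and is not a documented default, and treating bypassBotDetection as a boolean create option and capping maxTtlSeconds at the 15-minute session limit are assumptions.

```javascript
const MAX_TTL_SECONDS = 15 * 60; // assumed cap, from the 15-minute session limit

// Build the JSON body for POST /v1/sessions and reject combinations the
// documentation rules out.
function buildSessionRequest({
  maxTtlSeconds = 300,
  cdp = false,
  proxy = false,
  bypassBotDetection = false,
} = {}) {
  if (cdp && bypassBotDetection) {
    throw new Error("cdp cannot be combined with bypassBotDetection");
  }
  if (
    !Number.isInteger(maxTtlSeconds) ||
    maxTtlSeconds <= 0 ||
    maxTtlSeconds > MAX_TTL_SECONDS
  ) {
    throw new RangeError(`maxTtlSeconds must be an integer from 1 to ${MAX_TTL_SECONDS}`);
  }
  // Only include flags that are switched on, mirroring the curl examples.
  const body = { maxTtlSeconds };
  if (cdp) body.cdp = true;
  if (proxy) body.proxy = true;
  if (bypassBotDetection) body.bypassBotDetection = true;
  return body;
}
```

Calling buildSessionRequest({ cdp: true, proxy: true }) produces the same body as the proxy example above.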

Action types

Actions are deterministic operations executed inside the active session page. Use them to navigate, inspect content, query matching elements, capture screenshots, and continue a workflow without losing browser state.

The query action accepts payload.fields for built-in fields and payload.attributes for arbitrary DOM attributes such as href, class, or data-tab. Requested attributes are returned under each element's nested attributes object. payload.limit is capped at 500.

Selector-based click and hover responses include diagnostics about the matched element, including matched, visible, elementText, boundingBox, and the interaction point. Use these fields to detect a wrong selector or a likely no-op interaction without taking another screenshot.
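One way to use these diagnostics is a small classifier, sketched below. The function name and return labels are hypothetical, the function takes the diagnostics object directly because its location in the response is not described here, and a boundingBox with width and height properties is an assumption.

```javascript
// Classify click/hover diagnostics to spot a wrong selector or a likely
// no-op interaction. `diag` is the diagnostics object from the response.
function diagnoseInteraction(diag) {
  if (!diag || diag.matched === false) return "no-match";   // selector found nothing
  if (diag.visible === false) return "not-visible";         // element exists but is hidden
  const box = diag.boundingBox;
  // A zero-sized box suggests the interaction landed on nothing useful.
  if (box && (box.width === 0 || box.height === 0)) return "zero-size";
  return "ok";
}
```

A result other than "ok" is a hint to adjust the selector or wait for the element before retrying.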

Navigation actions (goto, back, forward, and reload) accept payload.waitUntil as load, domcontentloaded, or commit. The default is domcontentloaded.

The wait_for_timeout action accepts payload.timeoutMs and caps values above 2500 ms at 2500 ms.
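The waitUntil values and the wait limit can be captured in two small builders. Both function names are hypothetical. Splitting a longer wait into several wait_for_timeout actions is one possible workaround for the 2500 ms limit, not a documented pattern.

```javascript
const WAIT_UNTIL = ["load", "domcontentloaded", "commit"];
const MAX_WAIT_MS = 2500;

// Build a goto action, defaulting waitUntil to the documented default.
function gotoAction(url, waitUntil = "domcontentloaded") {
  if (!WAIT_UNTIL.includes(waitUntil)) {
    throw new Error(`waitUntil must be one of: ${WAIT_UNTIL.join(", ")}`);
  }
  return { type: "goto", payload: { url, waitUntil } };
}

// Split a long wait into wait_for_timeout actions of at most 2500 ms each,
// so no single action exceeds the limit.
function waitActions(totalMs) {
  const actions = [];
  for (let remaining = totalMs; remaining > 0; remaining -= MAX_WAIT_MS) {
    actions.push({
      type: "wait_for_timeout",
      payload: { timeoutMs: Math.min(remaining, MAX_WAIT_MS) },
    });
  }
  return actions;
}
```

For example, waitActions(6000) yields three actions of 2500, 2500, and 1000 ms, which can be sent together in a batch.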

The content action accepts payload.format as text, markdown, or html. It defaults to cleaned readable text so agents do not receive raw page HTML unless you request html explicitly.

The batch action accepts up to 25 nested actions and executes them in order. Use it when the next steps are known ahead of time, such as navigating, waiting, scrolling, reading the title, and taking a screenshot.
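When a workflow has more than 25 steps, one option is to split it into several batch requests sent in order against the same session. The helper below is a sketch with a hypothetical name; it only builds request bodies and leaves sending them to the caller.

```javascript
const MAX_BATCH_ACTIONS = 25; // documented limit on nested actions

// Split an ordered list of actions into batch request bodies of at most
// 25 nested actions each, preserving the original order.
function toBatchRequests(actions) {
  const requests = [];
  for (let i = 0; i < actions.length; i += MAX_BATCH_ACTIONS) {
    requests.push({
      type: "batch",
      payload: { actions: actions.slice(i, i + MAX_BATCH_ACTIONS) },
    });
  }
  return requests;
}
```

Because the session keeps its browser state between requests, sending the resulting bodies one after another should behave like one long sequence, at the cost of one HTTP round trip per chunk.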

See the generated API reference in this section for request and response schemas.