Running Extractions

A run is an extraction job. You select a robot, optionally provide start URLs and/or an extraction budget, and Extralt crawls the site to produce captures.

Creating a run

• Dashboard

Navigate to Extract > Runs and click New Run, or click Start Run on any robot in the robots list.

Select the robot to use, enter your start URLs (one per line), and optionally set a budget to limit how many URLs the robot will extract. Click Start to begin the run.

[Screenshot: New run form showing robot selector, URL input, and budget field]

• API

export EXTRALT_API_KEY="your-api-key"

curl -s -X POST "https://api.extralt.com/runs" \
  -H "Authorization: Bearer $EXTRALT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "robotId": "your-robot-id",
    "urls": [
      "https://example-store.com/products/sneakers",
      "https://example-store.com/products/boots"
    ],
    "budget": 100
  }' | jq

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| robotId | Yes | The robot to use for extraction |
| urls | Yes | Start URLs to crawl |
| budget | No | Maximum number of URLs to extract. Each URL costs 1 credit. |
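
When runs are created from a script rather than by hand, it can help to assemble the request body in one place. The following is a minimal Python sketch, not an official client: `build_run_payload` is a hypothetical helper, and the only field names it uses are the three parameters listed above. The check that `budget` is a positive integer is an assumption about what the API accepts.

```python
import json


def build_run_payload(robot_id, urls, budget=None):
    """Build the JSON body for POST /runs (hypothetical helper, not part of any SDK)."""
    if not robot_id:
        raise ValueError("robotId is required")
    # Drop blank entries, e.g. from a text area with one URL per line.
    urls = [u.strip() for u in urls if u.strip()]
    if not urls:
        raise ValueError("urls must contain at least one start URL")

    payload = {"robotId": robot_id, "urls": urls}
    if budget is not None:
        # Assumption: a budget is a positive whole number of URLs.
        if not isinstance(budget, int) or budget < 1:
            raise ValueError("budget must be a positive integer")
        payload["budget"] = budget
    return payload


body = build_run_payload(
    "your-robot-id",
    ["https://example-store.com/products/sneakers"],
    budget=100,
)
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the `-d` body in the curl example; leaving out `budget` simply omits that key.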

Run lifecycle

| Status | Description |
| --- | --- |
| pending | Run is queued for execution |
| starting | Run is starting |
| running | Actively crawling and extracting |
| completed | Finished successfully |
| failed | Encountered an unrecoverable error |
| stopped | Manually stopped |

Monitoring a run

• Dashboard

Navigate to Extract > Runs to see all your runs in a sortable table.

[Screenshot: Runs list showing completed, running, and pending runs]

The table shows:

| Column | Description |
| --- | --- |
| Name | The run name |
| Robot | Which robot is executing the run |
| Status | Current status with a color-coded badge |
| Budget | Maximum URLs to extract |
| Extracted | Number of URLs extracted so far |
| Queue | URLs remaining in the crawl queue |
| Created | When the run was started |
| Duration | How long the run has been running or took to complete |

You can take actions on runs directly from the table:

  • Stop a running or pending run to halt extraction early.
  • Restart a stopped or failed run, optionally with a new budget.
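
If you build your own tooling around runs, the two rules above can be captured as a small lookup. This is an illustrative Python sketch only: `ACTIONS_BY_STATUS` and `allowed_actions` are hypothetical names, and giving no actions to `starting` and `completed` is an assumption, since the rules above do not mention those statuses.

```python
# Which actions apply to each run status, per the stop and restart rules above.
# Statuses those rules do not mention (starting, completed) get no actions here;
# that is an assumption, not documented behavior.
ACTIONS_BY_STATUS = {
    "pending": {"stop"},
    "starting": set(),
    "running": {"stop"},
    "completed": set(),
    "failed": {"restart"},
    "stopped": {"restart"},
}


def allowed_actions(status):
    """Return the set of actions available for a run in the given status."""
    try:
        return ACTIONS_BY_STATUS[status]
    except KeyError:
        raise ValueError(f"unknown run status: {status!r}") from None


print(sorted(allowed_actions("running")))  # ['stop']
print(sorted(allowed_actions("failed")))   # ['restart']
```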

• API

Poll the run endpoint until it reaches a terminal status:

curl -s "https://api.extralt.com/runs/$RUN_ID" \
  -H "Authorization: Bearer $EXTRALT_API_KEY" | jq

See Common Patterns for a full polling example.
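
As a rough illustration of the polling loop, here is a standard-library Python sketch. It rests on several assumptions that are not confirmed by this guide: that the run response is a JSON object with a top-level `status` field, and that `completed`, `failed`, and `stopped` are the terminal statuses. `fetch_run` and `wait_for_run` are hypothetical helper names, and the interval and timeout defaults are arbitrary.

```python
import json
import time
import urllib.request

API_BASE = "https://api.extralt.com"
# Assumption: these are the statuses after which a run no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "stopped"}


def fetch_run(run_id, api_key):
    """GET /runs/{run_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/runs/{run_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_run(run_id, fetch, interval=10.0, timeout=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch(run_id) until the run is terminal or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        run = fetch(run_id)
        # Assumption: the status lives in a top-level "status" field.
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} not finished after {timeout}s")
        sleep(interval)


# Example call (performs real HTTP requests, so it is left commented out):
#   import os
#   key = os.environ["EXTRALT_API_KEY"]
#   run = wait_for_run("your-run-id", lambda rid: fetch_run(rid, key))
#   print(run["status"])
```

Passing `fetch` and `sleep` in as arguments keeps the loop easy to exercise without network access, for instance with a stub that returns a fixed sequence of statuses.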

Concurrent run limits

| Plan | Concurrent runs |
| --- | --- |
| Start | 1 |
| Scale | Up to 10 |

If you exceed your concurrent run limit, the run will be queued until a slot opens.

Downloading data

• Dashboard

Export captures directly from Extract > Captures. You can filter by run or robot, then download as JSONL or Parquet. See Working with Captures for details.

• API

The download endpoint returns a signed URL to a compressed .jsonl.lz4 file with all captures from the run. The URL is valid for 10 minutes.

curl -s "https://api.extralt.com/runs/$RUN_ID/download" \
  -H "Authorization: Bearer $EXTRALT_API_KEY" | jq '.url'
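
Once the file behind the signed URL has been fetched (within the 10-minute window), it needs to be decompressed and read line by line. The Python sketch below shows one way to do that. `load_captures` is a hypothetical helper; the decompressor is passed in so the example runs without extra dependencies. Using `lz4.frame.decompress` from the third-party `lz4` package assumes the file is in the standard LZ4 frame format, and the `url` field in the sample data is made up for the demo.

```python
import json


def load_captures(compressed, decompress):
    """Decompress a downloaded .jsonl.lz4 payload and parse one capture per line."""
    raw = decompress(compressed)
    # bytes.splitlines() only splits on ASCII line breaks, so JSON strings
    # containing other Unicode separators are left intact.
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


# Typical use (assumes the third-party `lz4` package and LZ4 frame format):
#   import lz4.frame
#   captures = load_captures(file_bytes, lz4.frame.decompress)

# Self-contained demo with an identity "decompressor" and made-up capture data.
sample = (
    b'{"url": "https://example-store.com/products/boots"}\n'
    b"\n"
    b'{"url": "https://example-store.com/products/sneakers"}\n'
)
print(len(load_captures(sample, lambda b: b)))  # 2
```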

Recurring extractions

To run extractions on a recurring cadence, set up a schedule. A schedule automatically creates runs at the interval you specify.

See Schedules for the complete guide.