Concepts

How Extralt Works

Two approaches dominate web data extraction today, and both have serious tradeoffs.

Traditional scrapers use hand-coded selectors (CSS, XPath) that break when a website changes its HTML structure. Maintenance is constant and scaling across many sites is expensive.

AI-at-runtime scrapers send each page to a language model for extraction. This is flexible but slow and expensive -- you're paying for LLM inference on every single page.

The third way

Extralt combines the best of both approaches:

  1. AI generates the crawler at build time. When you create a robot, Extralt's AI analyzes the target site, understands its structure, and generates extraction logic.

  2. The crawler compiles to Rust. The generated logic is compiled into a high-performance Rust binary (a dynlib). No LLM inference at extraction time.

  3. Extraction runs at native speed. The compiled robot extracts data as fast as it can fetch pages. The AI does the hard work once; the robot runs thousands of times.

This means you get the adaptability of AI (handles any site) with the speed of compiled code (no LLM cost per page).
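To make the division of labor concrete, here is a minimal sketch of what build-time-generated extraction logic could look like once compiled. This is a hypothetical illustration, not Extralt's actual generated code: the naive tag search stands in for a real HTML parser, and the key point is that the selectors are fixed at build time, so running the robot is plain string work with no model inference.

```rust
/// Naively pull the inner text of the first `<tag ...>...</tag>` pair.
/// A real generated crawler would use a proper HTML parser; this
/// dependency-free stand-in only illustrates the shape of the logic.
fn extract_tag(html: &str, tag: &str) -> Option<String> {
    let open = format!("<{tag}");
    let start = html.find(&open)?;
    let body_start = html[start..].find('>')? + start + 1;
    let close = format!("</{tag}>");
    let body_end = html[body_start..].find(&close)? + body_start;
    Some(html[body_start..body_end].trim().to_string())
}

fn main() {
    let page = r#"<html><h1> Waxed Canvas Jacket </h1><span>$120</span></html>"#;
    // The part the AI decided at build time: which tag maps to which field.
    let title = extract_tag(page, "h1");
    println!("title: {:?}", title); // title: Some("Waxed Canvas Jacket")
}
```

Because this logic is ordinary compiled code, it runs at the speed of the network fetch, which is the point of step 3 above.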

Build vs. run

| Phase | What happens | Speed | Cost |
| --- | --- | --- | --- |
| Build | AI analyzes site, generates crawler, compiles to Rust | 3-5 minutes | One-time |
| Run | Compiled robot crawls and extracts data | Fast (native speed) | 1 credit per URL |

You pay the AI cost once during the build. Every run after that is fast and cheap.
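The break-even arithmetic can be sketched as follows. The dollar figures are purely illustrative assumptions, not Extralt's actual pricing; the structure of the calculation is what matters.

```rust
fn main() {
    // Hypothetical figures for illustration only -- substitute real pricing.
    let llm_cost_per_page = 0.01_f64;  // AI-at-runtime: inference on every page
    let build_cost = 5.00_f64;         // one-time AI build of the compiled robot
    let run_cost_per_page = 0.001_f64; // per-URL cost of running the robot

    // Pages at which the one-time build pays for itself versus
    // paying for LLM inference on every single page.
    let break_even = build_cost / (llm_cost_per_page - run_cost_per_page);
    println!("break-even after ~{:.0} pages", break_even.ceil()); // ~556 pages
}
```

Past the break-even point, every additional page widens the gap, which is why the build/run split favors robots that run thousands of times.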

Consistent output

All robots extract data into the same base ecommerce schema. Whether you're scraping a luxury fashion site or a hardware store, the output has the same structure: title, brand, variants, pricing, identifiers.

This consistency means you can:

  • Compare data across different sites without normalization
  • Build pipelines that work regardless of source
  • Switch between robots without changing your code
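As a sketch of what a uniform output record might look like, the struct below uses the field names listed above (title, brand, variants, pricing, identifiers); the concrete types and field shapes are assumptions for illustration, not the actual base ecommerce schema.

```rust
// Illustrative sketch of a shared output record: every robot fills the
// same shape regardless of source site. Field names follow the docs;
// the types are hypothetical.

#[derive(Debug, Clone, PartialEq)]
struct Variant {
    name: String,     // e.g. "Size M / Navy"
    price_cents: u64, // pricing in minor units to avoid float drift
    currency: String, // ISO 4217 code, e.g. "USD"
}

#[derive(Debug, Clone, PartialEq)]
struct Product {
    title: String,
    brand: Option<String>,
    variants: Vec<Variant>,
    identifiers: Vec<String>, // SKUs, GTINs, or site-specific IDs
}

fn main() {
    // The same struct whether the source is a fashion site or a hardware store.
    let p = Product {
        title: "Cordless Drill 18V".into(),
        brand: Some("AcmeTools".into()),
        variants: vec![Variant {
            name: "Bare tool".into(),
            price_cents: 9_900,
            currency: "USD".into(),
        }],
        identifiers: vec!["sku:DRL-18V".into()],
    };
    println!("{}", p.title);
}
```

A fixed record shape like this is what makes the three bullets above possible: downstream code matches on one type, never on per-site markup.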

What's next