Action · Firecrawl · Updated May 2026

How do I crawl an entire site with Firecrawl?

Short answer: Drop the "Firecrawl → Crawl Site" action anywhere in your workflow, map the inputs from upstream nodes, and publish.


Inputs

The fields this action accepts.

Every field can be mapped from an upstream trigger, AI step, table row, or hard-coded literal.

Field	Key	Type	Required	Description
Starting URL	url	string	Required	URL the crawler begins from. Only pages under the same domain are followed by default.
Crawl Prompt	prompt	string	Optional	Natural-language crawl configuration. Example: "Only follow links under /blog and /changelog, skip anything with the word 'archive'." When set, Firecrawl uses it alongside the path filters below.
Max Pages	limit	number	Optional	Hard cap on how many pages the crawler will fetch. Higher values cost more and take longer.
Max Depth	max_depth	number	Optional	How many link-hops away from the starting URL the crawler may go. Leave blank for no depth cap (the crawl is still bounded by `limit`).
Crawl Entire Domain	crawl_entire_domain	boolean	Optional	When true, follow links anywhere under the same domain (not just below the starting URL's path). Use with a tight `limit`.
Include Paths	include_paths	array	Optional	URL-path patterns to allow (e.g. ["/docs/.*", "/blog/.*"]). Leave empty to include everything.
Exclude Paths	exclude_paths	array	Optional	URL-path patterns to skip (e.g. ["/admin/.*", "/login"]).
Formats	formats	array	Optional	Output formats applied to each crawled page. Any combination of: markdown, html, rawHtml, links, screenshot, json.
Only Main Content	only_main_content	boolean	Optional	Per-page setting. Strips navigation, footer, and sidebar boilerplate from each crawled page's markdown.
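
To make the interplay of `limit`, `max_depth`, and the path filters concrete, here is a minimal sketch of a scoped breadth-first crawl over an in-memory link graph. It is an illustration only, not Firecrawl's implementation: the function names, the `links` dictionary, and the choice to treat each pattern as a regular expression that must match the whole URL path are all assumptions.

```python
import re
from collections import deque
from urllib.parse import urlparse


def path_allowed(url, include_paths=(), exclude_paths=()):
    """Apply include/exclude patterns to the URL path (assumed regex semantics)."""
    path = urlparse(url).path or "/"
    if any(re.fullmatch(p, path) for p in exclude_paths):
        return False
    # An empty include list means "include everything".
    return not include_paths or any(re.fullmatch(p, path) for p in include_paths)


def plan_crawl(start_url, links, limit=10, max_depth=None,
               include_paths=(), exclude_paths=()):
    """Breadth-first walk that honours limit, max_depth, and path filters.

    `links` is a hypothetical dict mapping each URL to the URLs it links to.
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    fetched = []
    while queue and len(fetched) < limit:  # `limit` is a hard cap on pages
        url, depth = queue.popleft()
        fetched.append(url)
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links past the allowed hop count
        for nxt in links.get(url, []):
            if nxt in seen or urlparse(nxt).netloc != domain:
                continue  # skip repeats and other domains
            if not path_allowed(nxt, include_paths, exclude_paths):
                continue
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return fetched
```

In this sketch `max_depth=1` fetches the starting page plus the pages it links to directly, and `limit` stops the walk no matter how much of the queue is left.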

Sample request

{
  "url": "e.g. https://docs.example.com",
  "prompt": "e.g. Crawl only the documentation pages, skip the marketing site",
  "limit": "{{trigger.limit}}",
  "max_depth": "{{trigger.max_depth}}",
  "crawl_entire_domain": "{{trigger.crawl_entire_domain}}"
}
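
The sample above shows mapped template values as strings. For comparison, the following sketch assembles the same payload with the literal types listed in the Inputs table (number for `limit` and `max_depth`, boolean for `crawl_entire_domain`) and leaves out optional fields that are not set. `build_crawl_request` is a hypothetical helper, not part of Firecrawl or the workflow builder.

```python
def build_crawl_request(url, prompt=None, limit=None, max_depth=None,
                        crawl_entire_domain=None):
    """Return a request payload, omitting optional fields that are unset."""
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    optional = {
        "prompt": (prompt, str),
        "limit": (limit, int),
        "max_depth": (max_depth, int),
        "crawl_entire_domain": (crawl_entire_domain, bool),
    }
    payload = {"url": url}
    for name, (value, expected) in optional.items():
        if value is None:
            continue  # blank optional field: leave it out of the payload
        # bool is a subclass of int in Python, so reject True/False for numbers.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            raise TypeError(f"{name} must be of type {expected.__name__}")
        payload[name] = value
    return payload
```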

Returns

{
  "id": "1a2b3c4d-1234-5678-9abc-def012345678",
  "url": "https://api.firecrawl.dev/v2/crawl/1a2b3c4d-1234-5678-9abc-def012345678",
  "success": true
}
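
The response identifies a crawl job rather than carrying page content, so a downstream step may need to check the returned `url` until the job finishes. The sketch below shows one way to do that outside the workflow builder. It assumes the status endpoint answers with a JSON object whose `status` field becomes `"completed"` or `"failed"`, and that it accepts a bearer token; verify both against Firecrawl's API reference. `wait_for_crawl` and `http_get_json` are hypothetical helper names.

```python
import json
import time
import urllib.request


def http_get_json(url, api_key):
    """GET a JSON document with a bearer token (assumed auth scheme)."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_crawl(action_result, fetch, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll the status URL from the action's result until the job finishes.

    `fetch` is any callable that takes a URL and returns the decoded JSON body.
    """
    if not action_result.get("success"):
        raise RuntimeError("crawl was not started")
    for _ in range(max_polls):
        status = fetch(action_result["url"])
        # "status", "completed", and "failed" are assumed response values.
        if status.get("status") == "completed":
            return status
        if status.get("status") == "failed":
            raise RuntimeError(f"crawl {action_result['id']} failed")
        sleep(interval)
    raise TimeoutError(f"crawl {action_result['id']} still running")
```

A call might look like `wait_for_crawl(result, lambda u: http_get_json(u, api_key))`, where `result` is the object shown above and `api_key` is your own Firecrawl key.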

Use these fields in downstream nodes for routing, logging, or error handling.

Triggered by

Apps that pair well as the trigger for Crawl Site.

Any of these apps can fire this action as part of a workflow.

Google Sheets → Firecrawl

2 Google Sheets triggers

HubSpot → Firecrawl

18 HubSpot triggers

FAQ

Questions about Crawl Site.

What does the Crawl Site action do in Firecrawl?

Starting from a URL, Firecrawl follows links and fetches every page within scope, returning the markdown-formatted content of each. It suits workflows such as "ingest a docs site into a RAG index" or "build a knowledge base from a brand's blog".

What inputs does Crawl Site require?

Required: Starting URL. Every input accepts a static value or a variable from any upstream node in your workflow.

Can I use dynamic inputs from earlier workflow nodes?

Yes. Any field on this action can pull values from upstream nodes, whether that's a form response, a trigger payload, an AI output, or a lookup result.

What happens if Firecrawl returns an error?

The workflow pauses on the failed node, the error message is captured in the run log, and you can retry the run with one click. Auto-retry policies are configurable per workflow with exponential backoff up to 5 attempts.
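
For readers who want to see what an exponential-backoff policy with a five-attempt cap looks like in code, here is a generic sketch. It is not the platform's retry implementation: the one-second base delay, the doubling factor, and the name `run_with_backoff` are assumptions chosen for illustration.

```python
import time


def run_with_backoff(step, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `step` until it succeeds, doubling the wait after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Waits grow as base_delay * 1, 2, 4, 8 between the five attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real delays.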

Does Crawl Site support batch operations?

Yes. Run Crawl Site inside a Loop node to process arrays. Tiny Command handles Firecrawl's rate limits automatically so you don't have to throttle manually.
