Action · Firecrawl · Updated May 2026

How do I crawl an entire site with Firecrawl?

Short answer: Drop the "Firecrawl Crawl Site" action anywhere in your workflow, map the inputs from upstream nodes, and publish.

Inputs

The fields this action accepts.

Every field can be mapped from an upstream trigger, AI step, table row, or hard-coded literal.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Starting URL (`url`) | string | Required | Starting URL the crawler begins from. Only pages under the same domain are followed by default. |
| Crawl Prompt (`prompt`) | string | Optional | Optional natural-language crawl config. Example: "Only follow links under /blog and /changelog, skip anything with the word 'archive'." When set, Firecrawl uses it alongside the path filters below. |
| Max Pages (`limit`) | number | Optional | Hard cap on how many pages the crawler will fetch. Higher = more cost + slower. |
| Max Depth (`max_depth`) | number | Optional | How many link-hops away from the starting URL the crawler may go. Blank = unlimited (within `limit`). |
| Crawl Entire Domain (`crawl_entire_domain`) | boolean | Optional | When true, follow links anywhere under the same domain (not just below the starting URL's path). Use with a tight `limit`. |
| Include Paths (`include_paths`) | array | Optional | URL-path patterns to whitelist (e.g. ["/docs/.*", "/blog/.*"]). Leave empty to include everything. |
| Exclude Paths (`exclude_paths`) | array | Optional | URL-path patterns to skip (e.g. ["/admin/.*", "/login"]). |
| Formats (`formats`) | array | Optional | Output formats applied to each crawled page. Any combination of: markdown, html, rawHtml, links, screenshot, json. |
| Only Main Content (`only_main_content`) | boolean | Optional | Per-page setting. Strip nav/footer/sidebar boilerplate from each crawled page's markdown. |

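To make the Include Paths and Exclude Paths fields more concrete, here is a minimal Python sketch of how such patterns could narrow a crawl. It assumes each pattern is a regular expression that must match the whole URL path, and that an exclude match takes priority over an include match. Both are assumptions for illustration, and `path_in_scope` is a hypothetical helper, not part of Firecrawl or the workflow platform.

```python
import re
from urllib.parse import urlparse


def path_in_scope(url, include_paths=None, exclude_paths=None):
    """Return True if the URL's path passes the include/exclude patterns.

    Assumption: each pattern is a regular expression that must match the
    whole path, and exclusions win over inclusions. The real crawler may
    apply the patterns differently.
    """
    path = urlparse(url).path or "/"
    # Any matching exclude pattern drops the page.
    if any(re.fullmatch(p, path) for p in (exclude_paths or [])):
        return False
    # An empty include list means "include everything".
    if not include_paths:
        return True
    return any(re.fullmatch(p, path) for p in include_paths)


urls = [
    "https://example.com/docs/intro",
    "https://example.com/blog/launch",
    "https://example.com/admin/users",
    "https://example.com/login",
    "https://example.com/pricing",
]
kept = [
    u for u in urls
    if path_in_scope(u, ["/docs/.*", "/blog/.*"], ["/admin/.*", "/login"])
]
print(kept)
```

With the example patterns from the field descriptions, only the /docs and /blog pages stay in scope; /pricing is dropped because it matches no include pattern.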
Sample request
{
"url": "e.g. https://docs.example.com",
"prompt": "e.g. Crawl only the documentation pages, skip the marketing site",
"limit": "{{trigger.limit}}",
"max_depth": "{{trigger.max_depth}}",
"crawl_entire_domain": "{{trigger.crawl_entire_domain}}"
}
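If you assemble these inputs in code before handing them to the workflow (for example in an upstream script step), a small builder can catch obvious mistakes early. The sketch below uses the field keys from the Inputs section; `build_crawl_input` and its validation rules (positive integers, known format names) are assumptions made for illustration, not documented platform behavior.

```python
import json

# Format names listed in the Inputs section.
ALLOWED_FORMATS = {"markdown", "html", "rawHtml", "links", "screenshot", "json"}


def build_crawl_input(url, prompt=None, limit=None, max_depth=None,
                      crawl_entire_domain=None, include_paths=None,
                      exclude_paths=None, formats=None,
                      only_main_content=None):
    """Assemble the action's input, omitting optional fields left unset.

    Hypothetical helper: the validation rules are assumptions for
    illustration only.
    """
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("url is required and must be an http(s) URL")
    for name, value in (("limit", limit), ("max_depth", max_depth)):
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")
    if formats is not None:
        unknown = set(formats) - ALLOWED_FORMATS
        if unknown:
            raise ValueError(f"unknown formats: {sorted(unknown)}")
    optional = {
        "prompt": prompt,
        "limit": limit,
        "max_depth": max_depth,
        "crawl_entire_domain": crawl_entire_domain,
        "include_paths": include_paths,
        "exclude_paths": exclude_paths,
        "formats": formats,
        "only_main_content": only_main_content,
    }
    payload = {"url": url}
    # Keep only the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_crawl_input(
    "https://docs.example.com",
    limit=50,
    max_depth=2,
    include_paths=["/docs/.*"],
    formats=["markdown"],
)
print(json.dumps(payload, indent=2))
```

Leaving unset fields out of the payload keeps the action on its defaults, such as unlimited depth within `limit`.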
Returns
{
"id": "1a2b3c4d-1234-5678-9abc-def012345678",
"url": "https://api.firecrawl.dev/v2/crawl/1a2b3c4d-1234-5678-9abc-def012345678",
"success": true
}

Use these fields in downstream nodes for routing, logging, or error handling.
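As one way to use the returned fields for routing, a downstream step might branch on `success` and carry the job `id` forward for logging. The following is a minimal sketch: `route_crawl_result` and the branch names "started" and "failed" are hypothetical, chosen only for this example.

```python
def route_crawl_result(result):
    """Pick a downstream branch from the action's return value.

    Hypothetical helper: the branch names are illustrative, not platform
    terminology.
    """
    if result.get("success") is True and result.get("id"):
        return {
            "branch": "started",
            "job_id": result["id"],
            "status_url": result.get("url"),
        }
    # Missing id or success != true: send the run to error handling.
    return {"branch": "failed", "job_id": None, "status_url": None}


sample = {
    "id": "1a2b3c4d-1234-5678-9abc-def012345678",
    "url": "https://api.firecrawl.dev/v2/crawl/1a2b3c4d-1234-5678-9abc-def012345678",
    "success": True,
}
decision = route_crawl_result(sample)
print(decision["branch"], decision["job_id"])
```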

FAQ

Questions about Crawl Site.

What does the Crawl Site action do in Firecrawl?
Starting from a URL, Firecrawl follows links and fetches every page within scope, then returns the markdown-formatted content of each page. It suits workflows such as "ingest a docs site into a RAG index" or "build a knowledge base from a brand's blog".
What inputs does Crawl Site require?
Required: Starting URL. Every input accepts a static value or a variable from any upstream node in your workflow.
Can I use dynamic inputs from earlier workflow nodes?
Yes. Any field on this action can pull values from upstream nodes, whether that's a form response, a trigger payload, an AI output, or a lookup result.
What happens if Firecrawl returns an error?
The workflow pauses on the failed node, the error message is captured in the run log, and you can retry the run with one click. Auto-retry policies are configurable per workflow with exponential backoff up to 5 attempts.
Does Crawl Site support batch operations?
Yes. Run Crawl Site inside a Loop node to process arrays. Tiny Command handles Firecrawl's rate limits automatically so you don't have to throttle manually.
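
The auto-retry policy mentioned in the error-handling answer above (exponential backoff, up to 5 attempts) can be pictured with a short Python sketch. The FAQ does not state the base delay or the growth factor, so the 1-second base and doubling used here are assumptions, and `run_with_backoff` is a hypothetical helper rather than the platform's implementation.

```python
import time


def run_with_backoff(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `max_attempts` is reached.

    Assumption: the delay doubles after each failure (1s, 2s, 4s, ...).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a simulated failure and a fake sleep that records the delays.
delays = []
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated Firecrawl error")
    return "ok"


result = run_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # prints: ok [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the demo instant; with the default `time.sleep` the same call would wait between attempts.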