Action · Firecrawl · Updated May 2026
How do I crawl an entire site with Firecrawl?
Short answer: Drop the "Firecrawl → Crawl Site" action anywhere in your workflow, map the inputs from upstream nodes, and publish.
Inputs
The fields this action accepts.
Every field can be mapped from an upstream trigger, AI step, table row, or hard-coded literal.
| Field | Type | Required | Description |
|---|---|---|---|
| Starting URL (`url`) | string | Required | URL the crawler starts from. Only pages under the same domain are followed by default. |
| Crawl Prompt (`prompt`) | string | Optional | Natural-language crawl configuration. Example: "Only follow links under /blog and /changelog, skip anything with the word 'archive'." When set, Firecrawl uses it alongside the path filters below. |
| Max Pages (`limit`) | number | Optional | Hard cap on how many pages the crawler will fetch. Higher values cost more and take longer. |
| Max Depth (`max_depth`) | number | Optional | How many link-hops away from the starting URL the crawler may go. Blank means unlimited (within `limit`). |
| Crawl Entire Domain (`crawl_entire_domain`) | boolean | Optional | When true, follow links anywhere under the same domain (not just below the starting URL's path). Use with a tight `limit`. |
| Include Paths (`include_paths`) | array | Optional | URL-path patterns to allow (e.g. ["/docs/.*", "/blog/.*"]). Leave empty to include everything. |
| Exclude Paths (`exclude_paths`) | array | Optional | URL-path patterns to skip (e.g. ["/admin/.*", "/login"]). |
| Formats (`formats`) | array | Optional | Output formats applied to each crawled page. Any combination of: markdown, html, rawHtml, links, screenshot, json. |
| Only Main Content (`only_main_content`) | boolean | Optional | Per-page setting. Strips nav, footer, and sidebar boilerplate from each crawled page's markdown. |
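The example patterns for Include Paths and Exclude Paths look like regular expressions applied to the URL path. The following Python sketch shows one way such filters could combine. The helper name `is_in_scope`, the whole-path matching, and the rule that exclusions win over inclusions are assumptions made for illustration, not Firecrawl's documented behavior.

```python
import re
from urllib.parse import urlparse


def is_in_scope(url, include_paths=(), exclude_paths=()):
    """Return True if the URL's path passes the include/exclude filters.

    Assumptions (not confirmed by the source): each pattern must match
    the whole path, and an exclude match overrides an include match.
    """
    path = urlparse(url).path or "/"
    if any(re.fullmatch(pattern, path) for pattern in exclude_paths):
        return False
    # An empty include list means "include everything".
    if not include_paths:
        return True
    return any(re.fullmatch(pattern, path) for pattern in include_paths)


include = ["/docs/.*", "/blog/.*"]
exclude = ["/admin/.*", "/login"]

for candidate in (
    "https://example.com/docs/intro",
    "https://example.com/login",
    "https://example.com/pricing",
):
    print(candidate, is_in_scope(candidate, include, exclude))
```

Testing patterns locally like this, before a run, can help avoid spending the page `limit` on URLs you did not intend to fetch.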
Sample request
{"url": "e.g. https://docs.example.com","prompt": "e.g. Crawl only the documentation pages, skip the marketing site","limit": "{{trigger.limit}}","max_depth": "{{trigger.max_depth}}","crawl_entire_domain": "{{trigger.crawl_entire_domain}}"}
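The sample request uses template placeholders for every value. As a rough illustration of the same fields with concrete, correctly typed values, here is a minimal Python sketch that assembles the inputs and omits optional fields left blank. The helper `build_crawl_payload` is hypothetical; only the field names come from the Inputs table.

```python
import json


def build_crawl_payload(url, prompt=None, limit=None, max_depth=None,
                        crawl_entire_domain=None, include_paths=None,
                        exclude_paths=None, formats=None,
                        only_main_content=None):
    """Assemble Crawl Site inputs, leaving out any optional field not set."""
    if not url:
        raise ValueError("url is required")
    fields = {
        "url": url,
        "prompt": prompt,
        "limit": limit,
        "max_depth": max_depth,
        "crawl_entire_domain": crawl_entire_domain,
        "include_paths": include_paths,
        "exclude_paths": exclude_paths,
        "formats": formats,
        "only_main_content": only_main_content,
    }
    # Drop unset fields so the action falls back to its own defaults.
    return {key: value for key, value in fields.items() if value is not None}


payload = build_crawl_payload(
    "https://docs.example.com",
    limit=50,
    max_depth=2,
    include_paths=["/docs/.*"],
    formats=["markdown"],
)
print(json.dumps(payload, indent=2))
```

Numbers and booleans are passed as native types here because the Inputs table lists them as `number` and `boolean`.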
Returns
{"id": "1a2b3c4d-1234-5678-9abc-def012345678","url": "https://api.firecrawl.dev/v2/crawl/1a2b3c4d-1234-5678-9abc-def012345678","success": true}
Use these fields in downstream nodes for routing, logging, or error handling.
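To make the downstream use concrete, the sketch below parses the sample response shown above and pulls out the fields a later node would likely need. The function name `parse_crawl_response` and the choice to raise when `success` is false are illustrative assumptions, not part of the action itself.

```python
import json

# The sample response from this section, reproduced as a JSON string.
raw = (
    '{"id": "1a2b3c4d-1234-5678-9abc-def012345678",'
    ' "url": "https://api.firecrawl.dev/v2/crawl/1a2b3c4d-1234-5678-9abc-def012345678",'
    ' "success": true}'
)


def parse_crawl_response(raw_json):
    """Return (crawl_id, crawl_url), or raise if the crawl did not start."""
    data = json.loads(raw_json)
    if not data.get("success"):
        # Route to an error-handling branch instead of continuing.
        raise RuntimeError(f"crawl did not start: {data}")
    return data["id"], data["url"]


crawl_id, crawl_url = parse_crawl_response(raw)
print(crawl_id)
print(crawl_url)
```

The `id` is the value you would typically hand to a follow-up step such as Get Crawl Status.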
FAQ
Questions about Crawl Site.
What does the Crawl Site action do in Firecrawl?
Starting from a URL, Firecrawl follows links and fetches every page within scope, returning the markdown-formatted content of each. It suits workflows such as "ingest a docs site into a RAG index" or "build a knowledge base from a brand's blog".
What inputs does Crawl Site require?
Required: Starting URL. Every input accepts a static value or a variable from any upstream node in your workflow.
Can I use dynamic inputs from earlier workflow nodes?
Yes. Any field on this action can pull values from upstream nodes, whether that's a form response, a trigger payload, an AI output, or a lookup result.
What happens if Firecrawl returns an error?
The workflow pauses on the failed node, the error message is captured in the run log, and you can retry the run with one click. Auto-retry policies are configurable per workflow with exponential backoff up to 5 attempts.
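The platform applies the retry policy for you, so no code is required. For readers unfamiliar with the term, this Python sketch illustrates what exponential backoff capped at 5 attempts generally means. The 1-second base delay and the doubling factor are assumptions; the source does not state the actual intervals.

```python
import time


def retry_with_backoff(task, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds, doubling the wait after each failure.

    base_delay and the factor of 2 are illustrative assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** (attempt - 1))


# Demo: a task that fails twice, then succeeds. Delays are recorded
# instead of slept so the example finishes immediately.
delays = []
calls = {"count": 0}


def flaky_task():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky_task, sleep=delays.append)
print(result, delays)
```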
Does Crawl Site support batch operations?
Yes. Run Crawl Site inside a Loop node to process arrays. Tiny Command handles Firecrawl's rate limits automatically so you don't have to throttle manually.
More actions
Other Firecrawl actions.
Action: Extract Structured Data
Pass a URL and a schema; Firecrawl extracts matching fields. For "scrape product details from this page into our DB" workflows where you want typed JSON rather than raw HTML.
Action: Get Firecrawl Agent Result
Polls an agent task for completion and returns the result. For Firecrawl's agentic-scraping workflows that perform multi-step browse tasks.
Action: Get Crawl Status
Returns the current status of a running crawl: pages crawled, pages discovered, completion percentage. Poll until status=completed before consuming results.
Action: Map Site URLs
Returns the URL list of a site without fetching content, which makes it a fast preflight before deciding what to crawl. Useful for "find all pages under /blog" or "count pages on this competitor's site" inventory workflows.
Action: Run Firecrawl Agent
Runs an agentic browse task: Firecrawl's agent navigates the site and performs the configured extraction goal. For complex scraping that requires sequential page navigation or form interaction.
Action: Scrape URL
Fetches and returns cleaned markdown for a single URL. Handles JavaScript rendering and bot evasion. The right tool for "extract content from this one URL for LLM consumption" workflows.
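The Get Crawl Status entry above advises polling until status=completed before consuming results. A minimal Python sketch of that loop follows. The `get_status` callable stands in for whatever performs the status check in your setup, and the `running` status value, the 5-second interval, and the 60-poll budget are all placeholders chosen for illustration.

```python
import time


def wait_for_crawl(get_status, poll_interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll get_status() until it reports status == "completed".

    get_status is a hypothetical callable returning a dict with a
    "status" key; the interval and poll budget are assumptions.
    """
    for _ in range(max_polls):
        result = get_status()
        if result.get("status") == "completed":
            return result
        sleep(poll_interval)
    raise TimeoutError("crawl did not complete within the polling budget")


# Demo with canned responses instead of real status checks.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed"},
])
final = wait_for_crawl(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

Capping the number of polls keeps a stalled crawl from blocking the workflow indefinitely.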
Run Crawl Site from your workflows.
Triggered by anything in the catalog. Free tier available. No credit card.