Web Scraping

TinyWorkflows includes five web scraping nodes (the TinyScrape family) that let you extract data from web pages without writing code.

All TinyScrape nodes are colored orange (#F97316) and have test modules. Each node costs 2 credits per run, except Web Search, which costs 10 credits per run.
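As a rough illustration of the credit costs stated above, the following sketch totals the credits for a set of node runs. The function name `estimate_credits` and the idea that one node execution equals one "run" are assumptions made for this example, not part of TinyWorkflows.

```python
# Credit costs as stated in this page: 2 per run for every TinyScrape node,
# except Web Search, which costs 10.
CREDITS_PER_RUN = {
    "Scrape Page": 2,
    "Crawl Site": 2,
    "Extract Data": 2,
    "Map URLs": 2,
    "Web Search": 10,
}


def estimate_credits(runs):
    """Sum the credits for a mapping of node name -> number of runs.

    Assumption: each execution of a node counts as exactly one run.
    """
    return sum(CREDITS_PER_RUN[node] * count for node, count in runs.items())


# One Web Search followed by three Scrape Page runs: 10 + 3 * 2.
total = estimate_credits({"Web Search": 1, "Scrape Page": 3})
print(total)  # 16
```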

Available nodes

Node	What it does	Use case
Scrape Page	Extract content from a single URL	Get article text, product info
Crawl Site	Crawl multiple pages from a domain	Index a blog, crawl documentation
Extract Data	Extract structured fields from a page using AI	Get pricing, contact info, product specs
Map URLs	Discover all URLs on a domain	Find all pages before crawling
Web Search	Search the web and return results	Research a topic, find competitors

Scrape Page

Extracts content from a single URL.

Field	Type	Required	Description
URL	FX formula	Yes	The page URL to scrape
Formats	Multi-select	No	Output formats: `markdown` (default), `html`, `links`
Only main content	Boolean	No	Strip nav/footer/ads (default: true)

Templates: Get Page Content (markdown only), Full Page Data (markdown + html + links)

Crawl Site

Crawls multiple pages starting from a URL.

Field	Type	Required	Description
URL	FX formula	Yes	Starting URL (domain root or specific path)
Max pages	FX formula	Yes	Maximum pages to crawl (default: 10)
Include paths	Text	No	Only crawl URLs matching this pattern (e.g., `/blog/*`)
Exclude paths	Text	No	Skip URLs matching this pattern

Templates: Crawl Blog (auto-sets include to /blog/*), Crawl Documentation
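To illustrate how include and exclude patterns such as `/blog/*` could narrow a crawl, here is a minimal sketch. It assumes glob-style matching against the URL path and that exclusions take precedence; the exact matching rules TinyWorkflows applies are not documented here, and `should_crawl` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def should_crawl(url, include=None, exclude=None):
    """Decide whether a URL passes the include/exclude path patterns.

    Assumptions: patterns are globs matched against the URL path, and an
    exclude match wins over an include match.
    """
    path = urlparse(url).path or "/"
    if exclude and fnmatchcase(path, exclude):
        return False
    if include:
        return fnmatchcase(path, include)
    return True


print(should_crawl("https://example.com/blog/post-1", include="/blog/*"))
print(should_crawl("https://example.com/about", include="/blog/*"))
```

Under these assumptions the first URL is crawled and the second is skipped.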

Extract Data

Uses AI to extract structured fields from a page.

Field	Type	Required	Description
URL	FX formula	Yes	The page URL
Schema	Field definitions	Yes	Key-value pairs defining what to extract (at least one)
Prompt	Text	No	Natural language extraction instruction

Each schema field has a key (field name) and description (what to look for).

Templates: Extract Pricing (3 fields), Extract Contacts (name, email, phone, role)
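A schema is a set of key and description pairs. The sketch below models one as a Python dictionary using the four field names from the Extract Contacts template; the description texts and the `validate_schema` helper are illustrative assumptions, not the template's actual contents.

```python
# Field names come from the Extract Contacts template; the descriptions
# are hypothetical examples of "what to look for".
contacts_schema = {
    "name": "Full name of the contact person",
    "email": "Work email address",
    "phone": "Phone number, including country code if shown",
    "role": "Job title or role at the company",
}


def validate_schema(schema):
    """Check that a schema has at least one field and no blank entries."""
    if not schema:
        raise ValueError("schema needs at least one field")
    for key, description in schema.items():
        if not key.strip() or not description.strip():
            raise ValueError(f"field {key!r} needs a key and a description")
    return True


print(validate_schema(contacts_schema))  # True
```

Specific descriptions tend to help here, since the description is the only hint the extractor gets about each field.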

Map URLs

Discovers URLs on a domain, up to the configured limit.

Field	Type	Required	Description
URL	FX formula	Yes	The domain to map
Search	Text	No	Keyword filter for discovered URLs
Limit	FX formula	No	Max URLs to return (default: 100)

Web Search

Searches the web and returns results.

Field	Type	Required	Description
Query	FX formula	Yes	Search query
Limit	FX formula	Yes	Number of results (default: 10)
Domain	Text	No	Restrict search to a specific domain

Templates: Research Topic, Competitor Research

Tip

Chain scraping nodes to process a whole site: Map URLs → For Each → Scrape Page discovers the site's URLs, then extracts content from each one.
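The logic of the Map URLs → For Each → Scrape Page chain can be sketched in code. The workflow itself needs no code; this is only a mental model. `scrape_site`, `fake_map_urls`, and `fake_scrape_page` are hypothetical stand-ins for the nodes and make no network requests.

```python
def scrape_site(domain, map_urls, scrape_page, limit=100):
    """Mirror the chain: discover URLs, then scrape each one in turn."""
    results = {}
    for url in map_urls(domain, limit):      # Map URLs, then For Each
        results[url] = scrape_page(url)      # Scrape Page on each item
    return results


# Stand-in for the Map URLs node: returns a fixed list, capped at `limit`.
def fake_map_urls(domain, limit):
    pages = [f"{domain}/", f"{domain}/about", f"{domain}/blog/hello"]
    return pages[:limit]


# Stand-in for the Scrape Page node: returns placeholder markdown.
def fake_scrape_page(url):
    return f"# Content of {url}"


pages = scrape_site("https://example.com", fake_map_urls, fake_scrape_page, limit=2)
for url, content in pages.items():
    print(url, "->", content)
```

Because Scrape Page runs once per discovered URL, the Limit on Map URLs also bounds how many scrapes the loop performs.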