Web Scraping

TinyWorkflows includes five web scraping nodes (the TinyScrape family) that let you extract data from web pages without writing code.

All TinyScrape nodes share the same color, Orange (#F97316), and have test modules. Each node costs 2 credits per run, except Web Search, which costs 10 credits per run.
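The credit figures above make cost estimates simple arithmetic. The sketch below is illustrative only: `estimate_credits` and `CREDITS_PER_RUN` are hypothetical names, not part of TinyWorkflows, and the sketch assumes that every execution of a node (including each iteration inside a loop) is billed as one run, which the text does not confirm.

```python
# Credit costs taken from the text: 2 per run, except Web Search at 10.
CREDITS_PER_RUN = {
    "Scrape Page": 2,
    "Crawl Site": 2,
    "Extract Data": 2,
    "Map URLs": 2,
    "Web Search": 10,
}


def estimate_credits(runs: dict[str, int]) -> int:
    """Sum the credits for a planned number of runs per node type.

    Assumption: one node execution equals one billed run.
    """
    return sum(CREDITS_PER_RUN[node] * count for node, count in runs.items())


# Example plan: one Map URLs run, 25 Scrape Page runs, one Web Search.
plan = {"Map URLs": 1, "Scrape Page": 25, "Web Search": 1}
print(estimate_credits(plan))  # 2 + 50 + 10 = 62
```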


Available nodes

| Node | What it does | Use case |
| --- | --- | --- |
| Scrape Page | Extract content from a single URL | Get article text, product info |
| Crawl Site | Crawl multiple pages from a domain | Index a blog, crawl documentation |
| Extract Data | Extract structured fields from a page using AI | Get pricing, contact info, product specs |
| Map URLs | Discover all URLs on a domain | Find all pages before crawling |
| Web Search | Search the web and return results | Research a topic, find competitors |

Scrape Page

Extracts content from a single URL.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| URL | FX formula | Yes | The page URL to scrape |
| Formats | Multi-select | No | Output format: markdown (default), html, links |
| Only main content | Boolean | No | Strip nav/footer/ads (default: true) |

Templates: Get Page Content (markdown only), Full Page Data (markdown + html + links)

Crawl Site

[Figure: Crawl Site node configuration, showing the starting URL, page limit, and include/exclude path filters]

Crawls multiple pages starting from a URL.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| URL | FX formula | Yes | Starting URL (domain root or specific path) |
| Max pages | FX formula | Yes | Maximum pages to crawl (default: 10) |
| Include paths | Text | No | Only crawl URLs matching this pattern (e.g., /blog/*) |
| Exclude paths | Text | No | Skip URLs matching this pattern |

Templates: Crawl Blog (auto-sets include to /blog/*), Crawl Documentation

Extract Data

Uses AI to extract structured fields from a page.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| URL | FX formula | Yes | The page URL |
| Schema | Field definitions | Yes | Key-value pairs defining what to extract (at least one) |
| Prompt | Text | No | Natural language extraction instruction |

Each schema field has a key (field name) and description (what to look for).
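As an illustration of the key/description structure, a schema can be pictured as a simple mapping. The field keys below come from the Extract Contacts template; the descriptions are invented for this example, and `validate_schema` is a hypothetical helper that only mirrors the "at least one field" rule stated in the table. The node's internal representation is not documented here.

```python
# Keys follow the Extract Contacts template; descriptions are illustrative.
contact_schema = {
    "name": "Full name of the contact person",
    "email": "Contact email address",
    "phone": "Phone number, including country code if shown",
    "role": "Job title or role at the company",
}


def validate_schema(schema: dict[str, str]) -> None:
    """Raise ValueError unless the schema has at least one complete field.

    Assumption: both the key and its description must be non-empty.
    """
    if not schema:
        raise ValueError("schema needs at least one field")
    for key, description in schema.items():
        if not key.strip() or not description.strip():
            raise ValueError(f"field {key!r} needs a non-empty key and description")


validate_schema(contact_schema)  # passes silently
```

A specific description (for example, "Phone number, including country code if shown") gives the AI step more to work with than a bare field name.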

Templates: Extract Pricing (3 fields), Extract Contacts (name, email, phone, role)

Map URLs

Discovers all URLs on a domain.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| URL | FX formula | Yes | The domain to map |
| Search | Text | No | Keyword filter for discovered URLs |
| Limit | FX formula | No | Max URLs to return (default: 100) |

Web Search

Searches the web and returns results.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Query | FX formula | Yes | Search query |
| Limit | FX formula | Yes | Number of results (default: 10) |
| Domain | Text | No | Restrict search to a specific domain |

Templates: Research Topic, Competitor Research

Tip

Chain scraping nodes to process a whole site: Map URLs → For Each → Scrape Page discovers the URLs on a domain, then extracts content from each page.