Scrape Page

Extracts content from a single web page. Returns the page as clean markdown, raw HTML, extracted links, or any combination of the three.

Type: TINY_SCRAPE
Color: Orange (#F97316)
Credits: 2 per run
Tabs: Initialise → Configure → Test

Templates

| Template | Output format | Use case |
| --- | --- | --- |
| Get Page Content | Markdown only | Article text, product descriptions |
| Full Page Data | Markdown + HTML + Links | Complete page extraction |

Configure tab fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| URL | FX formula | Yes | The page to scrape. Supports variables, e.g. {{trigger.body.url}} |
| Formats | Multi-select | No | Output formats: markdown (default), html, links |
| Only main content | Boolean | No | Strip navigation, footer, ads, and sidebar (default: true) |
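The URL field accepts variables such as {{trigger.body.url}}. How the platform resolves them internally is not described here; as a rough mental model, the sketch below substitutes dotted paths from a context dictionary. The name `resolve_template` and the shape of `context` are assumptions made for illustration, not part of the product.

```python
import re
from typing import Any, Dict

# Matches {{ dotted.path }} placeholders, allowing optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve_template(template: str, context: Dict[str, Any]) -> str:
    """Replace each {{a.b.c}} placeholder with the value at context["a"]["b"]["c"].

    Hypothetical helper: it only illustrates the idea of variable substitution.
    """

    def lookup(match: "re.Match") -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the path does not exist
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


# Example: a webhook trigger whose request body carries the page URL.
context = {"trigger": {"body": {"url": "https://example.com/article"}}}
url = resolve_template("{{trigger.body.url}}", context)
```

A missing path raises `KeyError` in this sketch, so a misspelled variable fails loudly instead of producing an empty URL.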

Output variables

| Variable | When | What it contains |
| --- | --- | --- |
| {{scrape.markdown}} | Formats includes "markdown" | Clean text in markdown format |
| {{scrape.html}} | Formats includes "html" | Raw HTML of the page |
| {{scrape.links}} | Formats includes "links" | Array of all links found on the page |
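Which output variables are populated depends on the selected formats. A small sketch of that mapping, using the variable names from the table above; the function name `available_variables` is hypothetical, and treating an empty selection as markdown-only is an assumption based on markdown being the documented default.

```python
# Output variable produced by each format, per the table above.
FORMAT_TO_VARIABLE = {
    "markdown": "{{scrape.markdown}}",
    "html": "{{scrape.html}}",
    "links": "{{scrape.links}}",
}

# Assumption: leaving Formats empty behaves like selecting markdown only.
DEFAULT_FORMATS = ["markdown"]


def available_variables(formats=None):
    """Return the output variables populated for the selected formats."""
    selected = formats or DEFAULT_FORMATS
    unknown = [f for f in selected if f not in FORMAT_TO_VARIABLE]
    if unknown:
        raise ValueError(f"Unsupported formats: {unknown}")
    return [FORMAT_TO_VARIABLE[f] for f in selected]
```

For example, `available_variables(["html", "links"])` lists only the HTML and links variables, which is a reminder that a downstream node referencing {{scrape.markdown}} needs markdown in the selection.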

Common patterns

Content extraction

Webhook (URL) → Scrape Page (markdown) → TinyGPT (summarize) → Send Email

Price monitoring

Schedule (daily) → Scrape Page (product page) → TinyGPT (extract price) → 
  If-Else (price < threshold) → Send Alert
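The price-monitoring flow ends in a threshold comparison. Below is a minimal sketch of that last stage, assuming the scraped markdown contains a dollar amount such as `$1,299.00`. A regular expression stands in for the TinyGPT extraction step, and `extract_price` and `should_alert` are hypothetical names.

```python
import re
from typing import Optional

# First dollar amount in the text, e.g. "$1,299.00" or "$45".
PRICE_PATTERN = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)")


def extract_price(markdown: str) -> Optional[float]:
    """Return the first dollar amount found in the text, or None if there is none."""
    match = PRICE_PATTERN.search(markdown)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def should_alert(markdown: str, threshold: float) -> bool:
    """Mirror the If-Else node: alert only when a price is found and is below threshold."""
    price = extract_price(markdown)
    return price is not None and price < threshold
```

Returning `None` when no price is found keeps a layout change on the product page from being mistaken for a price drop.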

Multi-page scraping

Scrape Page (links format) → For Each (links) → Scrape Page (each link) →
  Array Aggregator → Process all content
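The link-following flow can be read as a plain loop: scrape one page for its links, scrape each link, and collect the results. The sketch below mirrors that shape. The `scrape` callable is a stand-in for the Scrape Page node (its signature is an assumption), and `fake_scrape` exists only so the example runs without network access.

```python
import time
from typing import Any, Callable, Dict, List


def scrape_linked_pages(
    start_url: str,
    scrape: Callable[[str, List[str]], Dict[str, Any]],
    delay_seconds: float = 1.0,
) -> List[str]:
    """Scrape start_url for links, then scrape each link and collect the markdown."""
    index = scrape(start_url, ["links"])      # Scrape Page (links format)
    collected: List[str] = []
    for link in index["links"]:               # For Each (links)
        page = scrape(link, ["markdown"])     # Scrape Page (each link)
        collected.append(page["markdown"])    # Array Aggregator
        time.sleep(delay_seconds)             # Delay between scrape operations
    return collected


# Hypothetical in-memory site used in place of real HTTP requests.
FAKE_SITE = {
    "https://example.com": {"links": ["https://example.com/a", "https://example.com/b"]},
    "https://example.com/a": {"markdown": "# Page A"},
    "https://example.com/b": {"markdown": "# Page B"},
}


def fake_scrape(url: str, formats: List[str]) -> Dict[str, Any]:
    """Return only the requested formats for a known URL."""
    return {fmt: FAKE_SITE[url][fmt] for fmt in formats}


pages = scrape_linked_pages("https://example.com", fake_scrape, delay_seconds=0)
```

Each iteration is one Scrape Page run, so a page with many links costs 2 credits per link on top of the initial scrape.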

Tip

Keep Only main content enabled (the default) to strip navigation, sidebars, footers, and ads. This gives cleaner text for AI processing and reduces token consumption.

Warning

Respect robots.txt and terms of service. Don't scrape sites that prohibit automated access. Rate-limit your requests with Delay nodes between scrape operations.
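One way to check robots.txt rules before adding a URL to a workflow is Python's standard `urllib.robotparser` module. This is a sketch under stated assumptions: the rules are supplied inline instead of being fetched, and the user agent string `my-workflow-bot` is a placeholder, since the user agent the Scrape Page node actually sends is not documented here.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules; in practice, fetch the site's own /robots.txt first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

allowed = is_allowed(ROBOTS_TXT, "my-workflow-bot", "https://example.com/blog/post")
blocked = is_allowed(ROBOTS_TXT, "my-workflow-bot", "https://example.com/private/data")
```

Here `allowed` is `True` and `blocked` is `False`. A passing robots.txt check does not replace reading the site's terms of service.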