Action · ElevenLabs · Updated May 2026

How do I generate speech with ElevenLabs?

Short answer: Drop the "ElevenLabs Text to Speech" action anywhere in your workflow, map the inputs from upstream nodes, and publish.

Inputs

The fields this action accepts.

Every field can be mapped from an upstream trigger, AI step, table row, or hard-coded literal.

| Field | Key | Type | Required | Description |
|---|---|---|---|---|
| Voice ID | voice_id | string | Required | The voice to use. Find voice IDs via the List Voices operation or your ElevenLabs dashboard. |
| Text | text | string | Required | The text to convert to speech (max 5000 chars for standard plan). |
| Model | model_id | options | Optional | Options: Multilingual v2 (highest quality), Turbo v2.5 (low latency), Monolingual v1 (English only). |
| Stability | stability | string | Optional | Voice stability (0.0 to 1.0). Lower = more expressive, higher = more consistent. |
| Similarity Boost | similarity_boost | string | Optional | Voice clarity and similarity (0.0 to 1.0). Higher = closer to original voice. |
| Output Format | output_format | options | Optional | Options: MP3 (44.1kHz, 128kbps), MP3 (44.1kHz, 192kbps), PCM (16kHz), PCM (44.1kHz). |
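The constraints in the field list above (two required fields, a 5000-character text limit, and 0.0 to 1.0 ranges) can be checked before the action runs. The following is a minimal sketch, not part of the product: `validate_inputs` and `MAX_TEXT_CHARS` are hypothetical names, and the 5000 limit assumes the standard plan.

```python
from typing import Any

MAX_TEXT_CHARS = 5000  # standard-plan limit quoted in the field list (assumed plan)


def validate_inputs(fields: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []

    # voice_id and text are the only required fields.
    for key in ("voice_id", "text"):
        value = fields.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} is required")

    text = fields.get("text")
    if isinstance(text, str) and len(text) > MAX_TEXT_CHARS:
        problems.append(f"text exceeds {MAX_TEXT_CHARS} characters")

    # stability and similarity_boost are typed as strings holding a number in [0.0, 1.0].
    for key in ("stability", "similarity_boost"):
        if key not in fields:
            continue
        try:
            number = float(fields[key])
        except (TypeError, ValueError):
            problems.append(f"{key} must be a number")
            continue
        if not 0.0 <= number <= 1.0:
            problems.append(f"{key} must be between 0.0 and 1.0")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a downstream node log every problem in one pass.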
Sample request
{
  "voice_id": "21m00Tcm4TlvDq8ikWAM",
  "text": "Hello, welcome to our platform. We're glad to have you here.",
  "model_id": "{{trigger.model_id}}",
  "stability": "0.5",
  "similarity_boost": "0.75"
}
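For readers curious how the fields in the sample request might translate to a direct call, here is a hedged sketch that builds (but does not send) a request in the shape of the public ElevenLabs REST API. How this action calls ElevenLabs internally is not documented on this page, so the mapping is an assumption: the endpoint path, the `xi-api-key` header, and the `voice_settings` nesting follow the public API as I understand it, and identifiers such as `eleven_multilingual_v2` and `mp3_44100_128` are assumed API equivalents of the option labels. `build_request` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode

# Assumed public ElevenLabs endpoint; not confirmed by this page.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_request(fields: dict, api_key: str) -> dict:
    """Translate action fields into a URL, headers, and JSON body (nothing is sent)."""
    url = f"{API_BASE}/{quote(fields['voice_id'], safe='')}"
    if "output_format" in fields:
        url += "?" + urlencode({"output_format": fields["output_format"]})

    body = {"text": fields["text"]}
    if "model_id" in fields:
        body["model_id"] = fields["model_id"]

    # The action types these as strings; the REST API is assumed to expect numbers.
    settings = {
        key: float(fields[key])
        for key in ("stability", "similarity_boost")
        if key in fields
    }
    if settings:
        body["voice_settings"] = settings

    headers = {"xi-api-key": api_key, "Content-Type": "application/json"}
    return {"url": url, "headers": headers, "json": body}
```

The returned dictionary can be handed to any HTTP client; keeping construction separate from sending makes the mapping easy to inspect and test.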
Returns
{
  "note": "Binary audio data; pipe to a file or downstream service",
  "content_type": "audio/mpeg"
}

Use these fields in downstream nodes for routing, logging, or error handling.
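Because the action returns binary audio plus a `content_type`, a downstream step typically has to write the bytes somewhere. The sketch below shows one way to do that in plain Python. `save_audio` and the `EXTENSIONS` mapping are hypothetical; only `audio/mpeg` appears in the Returns sample, so the other entry and the `.bin` fallback are assumptions.

```python
from pathlib import Path

# Assumed mapping; only audio/mpeg appears in the Returns sample.
EXTENSIONS = {"audio/mpeg": ".mp3", "audio/wav": ".wav"}


def save_audio(data: bytes, content_type: str, stem: str, directory: str = ".") -> Path:
    """Write binary audio to disk, picking the file extension from content_type."""
    if not data:
        raise ValueError("no audio data received")
    # Unknown types (for example raw PCM) fall back to a generic extension.
    extension = EXTENSIONS.get(content_type, ".bin")
    path = Path(directory) / f"{stem}{extension}"
    path.write_bytes(data)
    return path
```

For example, `save_audio(audio_bytes, "audio/mpeg", "welcome")` writes `welcome.mp3` to the current directory and returns its path.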

FAQ

Questions about Text to Speech.

What does the Text to Speech action do in ElevenLabs?
Generates high-quality audio from text using ElevenLabs voices (stock or cloned). It is the premium-quality TTS option, as opposed to faster or cheaper alternatives, and suits broadcast-quality narration, audiobook production, or branded voice content.
What inputs does Text to Speech require?
Required: Voice ID, Text. Every input accepts a static value or a variable from any upstream node in your workflow.
Can I use dynamic inputs from earlier workflow nodes?
Yes. Any field on this action can pull values from upstream nodes, whether that's a form response, a trigger payload, an AI output, or a lookup result.
What happens if ElevenLabs returns an error?
The workflow pauses on the failed node, the error message is captured in the run log, and you can retry the run with one click. Auto-retry policies are configurable per workflow with exponential backoff up to 5 attempts.
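To make the retry policy concrete, here is a minimal sketch of exponential backoff capped at 5 attempts. It illustrates the general technique only; the platform's actual delays and retry conditions are not documented here, so `retry_with_backoff`, the 1-second base delay, and the doubling factor are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), waiting base_delay * 2**n seconds after the n-th failed attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

With the defaults, a call that keeps failing waits 1, 2, 4, and 8 seconds between its five attempts before the final error propagates. The `sleep` parameter is injectable so the schedule can be tested without waiting.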
Does Text to Speech support batch operations?
Yes. Run Text to Speech inside a Loop node to process arrays. Tiny Command handles ElevenLabs's rate limits automatically so you don't have to throttle manually.
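A Loop node needs an array to iterate over. One common way to get one from a long document is to split the text into chunks that each stay under the 5000-character limit. The sketch below does this in plain Python, preferring sentence boundaries. `split_text` is a hypothetical helper, the sentence-splitting regex is a simple heuristic, and the 5000 default assumes the standard plan.

```python
import re


def split_text(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the current chunk
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each element of the returned list can then be mapped to the Text field on one loop iteration, and the resulting audio segments joined downstream.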