ElevenLabs action. Updated May 2026.

How do I transcribe audio with ElevenLabs?

Short answer: Drop the "ElevenLabs Speech to Text" action anywhere in your workflow, map the inputs from upstream nodes, and publish.

Inputs

The fields this action accepts.

Every field can be mapped from an upstream trigger, AI step, table row, or hard-coded literal.

| Field | Type | Required | Description |
|---|---|---|---|
| Audio File URL (file_url) | string | Required | URL of the audio file to transcribe |
| Model (model_id) | options | Optional | Options: Scribe v1 |
| Language Code (language_code) | string | Optional | Language code, e.g. en (auto-detect if blank) |
| Speaker Diarization (diarize) | options | Optional | Options: No, Yes |
| Timestamp Granularity (timestamps_granularity) | options | Optional | Options: None, Word, Character |

Sample request
{
  "file_url": "https://example.com/path",
  "model_id": "{{trigger.model_id}}",
  "language_code": "en",
  "diarize": "{{trigger.diarize}}",
  "timestamps_granularity": "{{trigger.timestamps_granularity}}"
}
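
As a rough illustration of the field rules listed under Inputs, the sketch below assembles a request body in Python and rejects values outside the documented options. The helper name build_request is hypothetical, and the sketch assumes the option labels (such as Scribe v1 or Word) are passed through unchanged; the values the action actually sends to ElevenLabs may differ.

```python
# Hypothetical helper for illustration; not part of any published SDK.
# Option sets are copied from the Inputs table; wire values are an assumption.
ALLOWED_OPTIONS = {
    "model_id": {"Scribe v1"},
    "diarize": {"No", "Yes"},
    "timestamps_granularity": {"None", "Word", "Character"},
}


def build_request(file_url, model_id=None, language_code=None,
                  diarize=None, timestamps_granularity=None):
    """Assemble a request body, checking the documented field rules."""
    if not file_url:
        raise ValueError("file_url is required")

    payload = {"file_url": file_url}
    optional = {
        "model_id": model_id,
        "language_code": language_code,
        "diarize": diarize,
        "timestamps_granularity": timestamps_granularity,
    }
    for name, value in optional.items():
        if value is None or value == "":
            continue  # blank optional fields are simply left out
        allowed = ALLOWED_OPTIONS.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
        payload[name] = value
    return payload


request = build_request("https://example.com/path", diarize="Yes",
                        timestamps_granularity="Word")
print(request)
```

Leaving language_code out mirrors the "auto-detect if blank" behaviour described for that field.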
Returns
{
  "text": "Hello world",
  "words": [
    {
      "end": 0.4,
      "text": "Hello",
      "type": "word",
      "start": 0
    }
  ],
  "language_code": "en",
  "language_probability": 0.99
}

Use these fields in downstream nodes for routing, logging, or error handling.
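
One way a downstream step might use these fields for routing and logging is sketched below in Python. The function route_transcript, the route names, and the 0.8 probability threshold are all assumptions made for illustration; only the response shape comes from the Returns example.

```python
import json

# Sample response copied from the Returns example.
SAMPLE_RESPONSE = json.loads("""
{
  "text": "Hello world",
  "words": [{"end": 0.4, "text": "Hello", "type": "word", "start": 0}],
  "language_code": "en",
  "language_probability": 0.99
}
""")

MIN_LANGUAGE_PROBABILITY = 0.8  # assumed threshold; tune for your workflow


def route_transcript(response, min_probability=MIN_LANGUAGE_PROBABILITY):
    """Return a routing decision plus a compact record for logging."""
    probability = response.get("language_probability", 0.0)
    words = [w for w in response.get("words", []) if w.get("type") == "word"]
    record = {
        "language": response.get("language_code"),
        "word_count": len(words),
        # End time of the latest word, in seconds (0.0 if no words returned).
        "last_word_end_s": max((w["end"] for w in words), default=0.0),
    }
    if not response.get("text"):
        return "error", record   # empty transcript: send to error handling
    if probability < min_probability:
        return "review", record  # uncertain language: flag for a human
    return "accept", record


decision, record = route_transcript(SAMPLE_RESPONSE)
print(decision, record)
```

For the sample response this takes the "accept" route, since the text is non-empty and the language probability is above the assumed threshold.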

Triggered by

Apps that pair well as the trigger for Speech to Text.

Any of these apps can fire this action as part of a workflow.

FAQ

Questions about Speech to Text.

What does the Speech to Text action do in ElevenLabs?
Transcribes audio using ElevenLabs' speech recognition. ElevenLabs is better known for text-to-speech (TTS), but its speech-to-text (STT) is competitive with Deepgram and AssemblyAI for some use cases. It is most useful when you want a voice-agent workflow that relies on ElevenLabs alone.
What inputs does Speech to Text require?
Required: Audio File URL. Every input accepts a static value or a variable from any upstream node in your workflow.
Can I use dynamic inputs from earlier workflow nodes?
Yes. Any field on this action can pull values from upstream nodes, whether that's a form response, a trigger payload, an AI output, or a lookup result.
What happens if ElevenLabs returns an error?
The workflow pauses on the failed node, the error message is captured in the run log, and you can retry the run with one click. Auto-retry policies are configurable per workflow with exponential backoff up to 5 attempts.
Does Speech to Text support batch operations?
Yes. Run Speech to Text inside a Loop node to process arrays. Tiny Command handles ElevenLabs's rate limits automatically so you don't have to throttle manually.
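
The auto-retry behaviour mentioned in the error-handling answer above (exponential backoff, up to 5 attempts) can be pictured with a small Python sketch. The function run_with_backoff, the 1-second base delay, and the doubling factor are assumptions for illustration; the platform's actual retry schedule is not documented here.

```python
import time


def run_with_backoff(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `max_attempts` is reached.

    The wait doubles after each failure: base_delay, 2x, 4x, and so on.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: a stand-in action that fails twice, then succeeds.
calls = {"count": 0}


def flaky_transcribe():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated ElevenLabs error")
    return {"text": "Hello world"}


delays = []  # record the waits instead of actually sleeping
result = run_with_backoff(flaky_transcribe, sleep=delays.append)
print(result, delays)
```

Passing a custom sleep function keeps the example fast to run while still showing which delays would be applied between attempts.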