Text to Speech

VoiceRun supports voices from multiple TTS providers, including OpenAI, Azure, Google, Cartesia, ElevenLabs, Fish Audio, Gradium, Inworld, MiniMax, Qwen3, and xAI Grok.

Explore Available Voices

Use TTS Lab to browse, preview, and test provider-native voice IDs before adding them to an agent. The Explorer tab lets you enter custom text, choose a provider, generate previews, and compare time-to-first-audio across multiple voices.

The Benchmarks tab shows production TTS latency by provider and model. It reports p50/p95 time-to-first-audio, p50/p95 total generation duration, and sample counts for selectable time ranges. Benchmark data is global across all agents because the underlying TTS histograms do not include organization or agent labels.
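As a rough illustration of what the p50 and p95 figures mean, the sketch below computes nearest-rank percentiles over a list of time-to-first-audio samples. The sample values and the `percentile` helper are hypothetical. The Benchmarks tab derives its numbers from histograms, so its exact method may differ from this sample-based calculation.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical time-to-first-audio samples in milliseconds (not real benchmark data).
ttfa_ms = [180, 210, 195, 240, 950, 205, 220, 190, 230, 200]

summary = {
    "p50": percentile(ttfa_ms, 50),
    "p95": percentile(ttfa_ms, 95),
    "samples": len(ttfa_ms),
}
print(summary)
```

With these made-up samples, the single slow request barely moves p50 but dominates p95, which is why the two figures are reported side by side.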

Voice Usage

To use a voice in your agent, specify the voice name in a TextToSpeechEvent:

yield TextToSpeechEvent(
    text="Hello, this is a sample message!",
    voice="nova"
)

To make audio uninterruptible, set the interruptible flag to False:

yield TextToSpeechEvent(
    text="This message cannot be interrupted.",
    voice="nova",
    interruptible=False
)
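The two snippets above can be combined in one generator that speaks an uninterruptible notice followed by a normal prompt. This is a minimal sketch: the `TextToSpeechEvent` dataclass is a stand-in defined here so the example runs on its own, the default of `interruptible=True` is an assumption inferred from the text above, and `speak_notice_then_prompt` is a hypothetical name. In an agent you would use the real class from the VoiceRun SDK.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass
class TextToSpeechEvent:
    """Stand-in for the SDK class so this sketch is self-contained (assumption)."""
    text: str
    voice: Union[str, dict]
    interruptible: bool = True  # assumed default; check the SDK for the real one


def speak_notice_then_prompt(voice: str = "nova") -> Iterator[TextToSpeechEvent]:
    # The notice must play in full, so it is marked uninterruptible.
    yield TextToSpeechEvent(
        text="This call may be recorded.",
        voice=voice,
        interruptible=False,
    )
    # The prompt keeps the (assumed) default and can be interrupted.
    yield TextToSpeechEvent(text="How can I help you today?", voice=voice)


events = list(speak_notice_then_prompt())
for event in events:
    print(event.interruptible, event.text)
```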

Using TextToSpeechIdentifier

To use a provider-native voice, pass a TextToSpeechIdentifier instead of a voice name string. It pairs a provider key with that provider's own voice ID, which gives you direct access to any voice from a supported provider.

yield TextToSpeechEvent(
    text="Hello from Azure!",
    voice={"provider": "azure", "identifier": "en-AU-WilliamNeural"}
)

Supported Providers

Provider	Example Identifier	Voice Reference
`azure`	`en-AU-WilliamNeural`	Azure Voice Gallery
`cartesia`	`6f84f4b8-58a2-430c-8c79-688dad597532`	Cartesia Voices
`custom`	`my_custom_voice`	Custom Voices
`elevenlabs`	`21m00Tcm4TlvDq8ikWAM`	ElevenLabs Voice Library
`fish_audio`	`d13f84b987ad4f22b56d2b47f4eb838e`	Fish Audio Discovery
`google_chirp`	`laomedeia`	Google Chirp HD
`gradium`	`YTpq7expH9539ERJ`	Gradium Voice Library
`inworld`	`Alex`	Inworld Platform
`minimax`	`English_Aussie_Bloke`	MiniMax Voices
`openai`	`nova`	OpenAI TTS
`prim_voices`	`lyric`	TTS Lab
`qwen3`	`Serena`	Qwen TTS
`xai`	`eve`	xAI Grok TTS
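One way to catch a mistyped provider key before it reaches an event is to build the identifier through a small helper. The sketch below is an illustration only: `voice_identifier` and `SUPPORTED_PROVIDERS` are hypothetical names, not part of the VoiceRun SDK, and the provider keys are copied from the table above.

```python
# Provider keys copied from the Supported Providers table.
SUPPORTED_PROVIDERS = frozenset({
    "azure", "cartesia", "custom", "elevenlabs", "fish_audio",
    "google_chirp", "gradium", "inworld", "minimax", "openai",
    "prim_voices", "qwen3", "xai",
})


def voice_identifier(provider: str, identifier: str) -> dict:
    """Return a {"provider", "identifier"} dict, rejecting unknown provider keys."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(
            f"unknown provider {provider!r}; expected one of {sorted(SUPPORTED_PROVIDERS)}"
        )
    if not identifier:
        raise ValueError("identifier must be a non-empty string")
    return {"provider": provider, "identifier": identifier}


voice = voice_identifier("azure", "en-AU-WilliamNeural")
print(voice)
```

The returned dict has the same shape as the `voice` argument used in the examples, so it can be passed directly as `voice=voice`.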

Examples

# Azure Neural Voice
yield TextToSpeechEvent(
    text="G'day mate!",
    voice={"provider": "azure", "identifier": "en-AU-WilliamNeural"}
)

# Cartesia Voice (using voice ID)
yield TextToSpeechEvent(
    text="Hello from Cartesia!",
    voice={"provider": "cartesia", "identifier": "6f84f4b8-58a2-430c-8c79-688dad597532"}
)

# Google Chirp Voice
yield TextToSpeechEvent(
    text="Hello from Google!",
    voice={"provider": "google_chirp", "identifier": "laomedeia"}
)

# OpenAI Voice
yield TextToSpeechEvent(
    text="Hello from OpenAI!",
    voice={"provider": "openai", "identifier": "nova"}
)

# Custom Voice
yield TextToSpeechEvent(
    text="Hello from my custom voice!",
    voice={"provider": "custom", "identifier": "my_custom_voice"}
)

# Fish Audio Voice
yield TextToSpeechEvent(
    text="Hello from Fish Audio!",
    voice={"provider": "fish_audio", "identifier": "d13f84b987ad4f22b56d2b47f4eb838e"}
)

# Gradium Voice
yield TextToSpeechEvent(
    text="Hello from Gradium!",
    voice={"provider": "gradium", "identifier": "YTpq7expH9539ERJ"}
)

# Inworld Voice
yield TextToSpeechEvent(
    text="Hello from Inworld!",
    voice={"provider": "inworld", "identifier": "Alex"}
)

# ElevenLabs Voice
yield TextToSpeechEvent(
    text="Hello from ElevenLabs!",
    voice={"provider": "elevenlabs", "identifier": "21m00Tcm4TlvDq8ikWAM"}
)

# xAI Grok Voice
yield TextToSpeechEvent(
    text="Hello from xAI Grok!",
    voice={"provider": "xai", "identifier": "eve"}
)

Provider Features

Feature	OpenAI	Azure	Google	Cartesia	ElevenLabs	Fish Audio	Gradium	Inworld	MiniMax	Qwen3	xAI Grok
Streaming	Yes	Yes	No	Yes	Yes	Yes	Yes	No	Yes	Yes	No
Voice Instructions	Yes	No	No	No	No	No	No	No	No	No	Inline tags
Speed Control	Yes (0.25-4x)	Yes (0.5-3x)	No	No	Yes	No	Yes	No	Yes	Yes	No
Caching	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Interruptible Control	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
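When choosing a provider programmatically, the table can be encoded as data and filtered by required features. The sketch below transcribes only the Streaming and Speed Control rows. `FEATURES` and `providers_with` are hypothetical names, and the values are a snapshot of the table above rather than a live capability query.

```python
PROVIDERS = [
    "OpenAI", "Azure", "Google", "Cartesia", "ElevenLabs", "Fish Audio",
    "Gradium", "Inworld", "MiniMax", "Qwen3", "xAI Grok",
]
# Rows transcribed from the Provider Features table, in column order.
STREAMING = [True, True, False, True, True, True, True, False, True, True, False]
SPEED_CONTROL = [True, True, False, False, True, False, True, False, True, True, False]

FEATURES = {
    name: {"streaming": streaming, "speed_control": speed}
    for name, streaming, speed in zip(PROVIDERS, STREAMING, SPEED_CONTROL)
}


def providers_with(*required: str) -> list:
    """Return providers, in table order, that support every required feature."""
    return [
        name for name, flags in FEATURES.items()
        if all(flags[feature] for feature in required)
    ]


print(providers_with("streaming", "speed_control"))
```

Caching and Interruptible Control are omitted because every provider in the table supports them, so they never narrow the choice.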