Text to Speech

VoiceRun supports a wide variety of voices across multiple TTS providers including OpenAI, Azure, Google, Cartesia, ElevenLabs, Fish Audio, Gradium, Inworld, MiniMax, Qwen3, xAI Grok, and more.

Explore Available Voices

Use TTS Lab to browse, preview, and test provider-native voice IDs before adding them to an agent. The Explorer tab lets you enter custom text, choose a provider, generate previews, and compare time-to-first-audio across multiple voices.

The Benchmarks tab shows production TTS latency by provider and model. It reports p50/p95 time-to-first-audio, p50/p95 total generation duration, and sample counts for selectable time ranges. Benchmark data is global across all agents because the underlying TTS histograms do not include organization or agent labels.
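To make the p50/p95 figures concrete, here is a minimal sketch of how such percentiles can be computed from raw latency samples using the nearest-rank method. It is not how TTS Lab computes its numbers: the Benchmarks tab reads from histograms, where percentiles are typically estimated from bucket counts. The function names and the sample values are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(ttfa_ms):
    """Return the three figures a benchmark row reports for one metric."""
    return {
        "p50": percentile(ttfa_ms, 50),
        "p95": percentile(ttfa_ms, 95),
        "samples": len(ttfa_ms),
    }


# Hypothetical time-to-first-audio measurements, in milliseconds.
ttfa_ms = [180, 210, 195, 240, 900, 205, 220, 190, 230, 215]
summary = summarize(ttfa_ms)
print(summary)
```

With these samples the single slow request (900 ms) leaves p50 unchanged but determines p95, which is why the two are worth reading together when comparing providers.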

Voice Usage

To use a voice in your agent, specify the voice name in a TextToSpeechEvent:

    yield TextToSpeechEvent(
        text="Hello, this is a sample message!",
        voice="nova",
    )

Audio can be made uninterruptible by setting the interruptible flag to False:

    yield TextToSpeechEvent(
        text="This message cannot be interrupted.",
        voice="nova",
        interruptible=False,
    )
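A common pattern is to mix both kinds of audio in one turn: a notice the caller must hear in full, followed by a prompt they may talk over. The sketch below illustrates the idea under stated assumptions. The VoiceRun import path is not shown on this page, so the code defines a stand-in `TextToSpeechEvent` dataclass with the same three fields used above. The `greeting` generator is hypothetical, and the default of `interruptible=True` is an assumption inferred from the wording above.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class TextToSpeechEvent:
    """Stand-in for the real VoiceRun event (assumption); the fields
    mirror the keyword arguments used in the examples above."""
    text: str
    voice: str
    interruptible: bool = True  # assumed default


def greeting(voice: str = "nova") -> Iterator[TextToSpeechEvent]:
    """Yield a notice that must play in full, then a normal prompt."""
    yield TextToSpeechEvent(
        text="This call may be recorded.",
        voice=voice,
        interruptible=False,
    )
    yield TextToSpeechEvent(
        text="How can I help you today?",
        voice=voice,
    )


events = list(greeting())
for event in events:
    print(event.interruptible, event.text)
```

In a real agent the stand-in class would be replaced by the SDK's own `TextToSpeechEvent`; only the order of the events and the `interruptible` flag matter here.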

Using TextToSpeechIdentifier

To use a specific provider-native voice, pass a TextToSpeechIdentifier instead of a voice name string. As the example below shows, it names the provider and the provider's own voice identifier, which gives you direct access to any voice from a supported provider.

    yield TextToSpeechEvent(
        text="Hello from Azure!",
        voice={"provider": "azure", "identifier": "en-AU-WilliamNeural"},
    )

Supported Providers

| Provider | Example Identifier | Voice Reference |
|---|---|---|
| azure | en-AU-WilliamNeural | Azure Voice Gallery |
| cartesia | 6f84f4b8-58a2-430c-8c79-688dad597532 | Cartesia Voices |
| custom | my_custom_voice | Custom Voices |
| elevenlabs | 21m00Tcm4TlvDq8ikWAM | ElevenLabs Voice Library |
| fish_audio | d13f84b987ad4f22b56d2b47f4eb838e | Fish Audio Discovery |
| google_chirp | laomedeia | Google Chirp HD |
| gradium | YTpq7expH9539ERJ | Gradium Voice Library |
| inworld | Alex | Inworld Platform |
| minimax | English_Aussie_Bloke | MiniMax Voices |
| openai | nova | OpenAI TTS |
| prim_voices | lyric | TTS Lab |
| qwen3 | Serena | Qwen TTS |
| xai | eve | xAI Grok TTS |
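Because provider keys are plain strings, a typo such as `google` instead of `google_chirp` is easy to make. The following is a minimal sketch of a helper that builds the voice mapping and rejects provider keys that are not in the table above. `make_voice` and `SUPPORTED_PROVIDERS` are hypothetical names, not part of the VoiceRun API, and the set simply copies the provider column as listed here.

```python
# Provider keys copied from the Supported Providers table.
SUPPORTED_PROVIDERS = frozenset({
    "azure", "cartesia", "custom", "elevenlabs", "fish_audio",
    "google_chirp", "gradium", "inworld", "minimax", "openai",
    "prim_voices", "qwen3", "xai",
})


def make_voice(provider: str, identifier: str) -> dict:
    """Build a {"provider", "identifier"} mapping, failing early on
    an unknown provider key or an empty identifier."""
    if provider not in SUPPORTED_PROVIDERS:
        known = ", ".join(sorted(SUPPORTED_PROVIDERS))
        raise ValueError(f"unknown provider {provider!r}; expected one of: {known}")
    if not identifier:
        raise ValueError("identifier must not be empty")
    return {"provider": provider, "identifier": identifier}


voice = make_voice("azure", "en-AU-WilliamNeural")
print(voice)
```

The helper only checks the provider key. Whether an identifier exists for that provider can only be confirmed against the provider's own voice reference or in TTS Lab.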

Examples

    # Azure Neural Voice
    yield TextToSpeechEvent(
        text="G'day mate!",
        voice={"provider": "azure", "identifier": "en-AU-WilliamNeural"},
    )

    # Cartesia Voice (using voice ID)
    yield TextToSpeechEvent(
        text="Hello from Cartesia!",
        voice={"provider": "cartesia", "identifier": "6f84f4b8-58a2-430c-8c79-688dad597532"},
    )

    # Google Chirp Voice
    yield TextToSpeechEvent(
        text="Hello from Google!",
        voice={"provider": "google_chirp", "identifier": "laomedeia"},
    )

    # OpenAI Voice
    yield TextToSpeechEvent(
        text="Hello from OpenAI!",
        voice={"provider": "openai", "identifier": "nova"},
    )

    # Custom Voice
    yield TextToSpeechEvent(
        text="Hello from my custom voice!",
        voice={"provider": "custom", "identifier": "my_voicerun_custom_voice"},
    )

    # Fish Audio Voice
    yield TextToSpeechEvent(
        text="Hello from Fish Audio!",
        voice={"provider": "fish_audio", "identifier": "d13f84b987ad4f22b56d2b47f4eb838e"},
    )

    # Gradium Voice
    yield TextToSpeechEvent(
        text="Hello from Gradium!",
        voice={"provider": "gradium", "identifier": "YTpq7expH9539ERJ"},
    )

    # Inworld Voice
    yield TextToSpeechEvent(
        text="Hello from Inworld!",
        voice={"provider": "inworld", "identifier": "Alex"},
    )

    # ElevenLabs Voice
    yield TextToSpeechEvent(
        text="Hello from ElevenLabs!",
        voice={"provider": "elevenlabs", "identifier": "21m00Tcm4TlvDq8ikWAM"},
    )

    # xAI Grok Voice
    yield TextToSpeechEvent(
        text="Hello from xAI Grok!",
        voice={"provider": "xai", "identifier": "eve"},
    )

Provider Features

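| Feature | OpenAI | Azure | Google | Cartesia | ElevenLabs | Fish Audio | Gradium | Inworld | MiniMax | Qwen3 | xAI Grok |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Streaming | Yes | Yes | No | Yes | Yes | Yes | Yes | No | Yes | Yes | No |
| Voice Instructions | Yes | No | No | No | No | No | No | No | No | No | Inline tags |
| Speed Control | Yes (0.25-4x) | Yes (0.5-3x) | No | No | Yes | No | Yes | No | Yes | Yes | No |
| Caching | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Interruptible Control | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |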
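When choosing a provider for a latency-sensitive agent, it can help to filter this matrix in code. The sketch below hand-copies the Streaming and Speed Control rows into Python sets and adds a clamp for the two speed ranges the table states. All names (`providers_with`, `clamp_speed`, and the constants) are hypothetical and only illustrate the lookup. They are not part of the VoiceRun API, and this page does not show how a speed value is passed to a provider.

```python
# Column order of the Provider Features table.
PROVIDERS = (
    "OpenAI", "Azure", "Google", "Cartesia", "ElevenLabs", "Fish Audio",
    "Gradium", "Inworld", "MiniMax", "Qwen3", "xAI Grok",
)

# Providers marked "Yes" in the Streaming row.
STREAMING = {
    "OpenAI", "Azure", "Cartesia", "ElevenLabs", "Fish Audio",
    "Gradium", "MiniMax", "Qwen3",
}

# Providers marked "Yes" in the Speed Control row.
SPEED_CONTROL = {"OpenAI", "Azure", "ElevenLabs", "Gradium", "MiniMax", "Qwen3"}

# Only these two ranges are stated in the table.
SPEED_RANGE = {"OpenAI": (0.25, 4.0), "Azure": (0.5, 3.0)}


def providers_with(*feature_sets):
    """Return providers, in table order, that appear in every given set."""
    return [p for p in PROVIDERS if all(p in s for s in feature_sets)]


def clamp_speed(provider, speed):
    """Clamp a speed multiplier to the provider's documented range."""
    if provider not in SPEED_RANGE:
        raise ValueError(f"no documented speed range for {provider!r}")
    low, high = SPEED_RANGE[provider]
    return max(low, min(high, speed))


candidates = providers_with(STREAMING, SPEED_CONTROL)
print(candidates)
print(clamp_speed("Azure", 5.0))
```

Keeping the matrix as data rather than scattered `if` statements means a change to the table needs only a one-line update to the matching set.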