Reliability: Retries & Fallbacks
The library includes built-in retry logic with exponential backoff and automatic fallback to alternative providers, so transient failures and provider outages can be handled without extra application code.
Retries#
Basic Retry#
```python
from primfunctions.events import Event, StartEvent, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion, RetryConfiguration

async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="I'll automatically retry if there are any temporary failures.",
            voice="kore"
        )

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "retry": RetryConfiguration(
                enabled=True,
                max_retries=3,
                retry_delay=1.0,        # Initial delay: 1 second
                backoff_multiplier=2.0  # Exponential: 1s, 2s, 4s
            )
        })

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )
```
Retry with Dictionary#
```python
from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "retry": {
                "max_retries": 5,
                "retry_delay": 0.5,
                "backoff_multiplier": 1.5
            }
        })

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )
```
Retry Configuration Options#
- `enabled` (bool): Enable retry (default: `True`)
- `max_retries` (int): Maximum retry attempts (default: `3`)
- `retry_delay` (float): Initial delay in seconds (default: `1.0`)
- `backoff_multiplier` (float): Exponential backoff multiplier (default: `2.0`)
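The two examples above pass the same kind of settings in different forms. As a quick reference, and assuming (as those examples suggest) that the library treats the `RetryConfiguration` object and the plain dictionary interchangeably, the defaults written out explicitly look like this:

```python
from voicerun_completions import RetryConfiguration

# Explicit defaults, written both ways (assumed to be interchangeable).
retry_as_object = RetryConfiguration(
    enabled=True,             # retry on failures
    max_retries=3,            # up to 3 additional attempts
    retry_delay=1.0,          # first wait: 1 second
    backoff_multiplier=2.0    # each wait is double the previous one
)

retry_as_dict = {
    "enabled": True,
    "max_retries": 3,
    "retry_delay": 1.0,
    "backoff_multiplier": 2.0
}

# Either value can be supplied as the "retry" key of the request dictionary.
```

Fields omitted from the dictionary form presumably fall back to these defaults, which is why a later example can pass only `{"max_retries": 3}`.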
Retry Behavior#
The retry logic follows exponential backoff:
- Attempt 1: Immediate
- Attempt 2: Wait `retry_delay` seconds
- Attempt 3: Wait `retry_delay * backoff_multiplier` seconds
- Attempt 4: Wait `retry_delay * backoff_multiplier^2` seconds
- And so on...
Example with default settings (`retry_delay=1.0`, `backoff_multiplier=2.0`); the sketch after this list computes the same schedule:
- Attempt 1: Immediate
- Attempt 2: Wait 1 second
- Attempt 3: Wait 2 seconds
- Attempt 4: Wait 4 seconds
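Put another way, the wait before the i-th retry (i starting at 1) is `retry_delay * backoff_multiplier ** (i - 1)`. A small standalone sketch of that arithmetic, for illustration only and not the library's internal implementation:

```python
def backoff_delays(max_retries: int = 3,
                   retry_delay: float = 1.0,
                   backoff_multiplier: float = 2.0) -> list[float]:
    """Seconds waited before each retry: retry i waits retry_delay * multiplier**(i - 1)."""
    return [retry_delay * backoff_multiplier ** i for i in range(max_retries)]

print(backoff_delays())             # defaults            -> [1.0, 2.0, 4.0]
print(backoff_delays(5, 0.5, 1.5))  # dictionary example  -> [0.5, 0.75, 1.125, 1.6875, 2.53125]
```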
Streaming Retry Behavior#
Important: For streaming requests, retries only occur for initial connection failures. Once streaming begins, failures are not retried to prevent duplicate content:
```python
# RETRIES (Before streaming starts):
# - Connection failures
# - Network errors
# - Rate limits (429)
# - Server errors (500, 502, 503)
# - Authentication errors (401)
# - Timeout errors

# NO RETRY (After streaming starts):
# - Mid-stream failures raise exceptions immediately
# - This prevents duplicate content in real-time applications
```
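Because mid-stream failures surface as exceptions rather than being retried, a handler that needs to degrade gracefully can wrap the streaming loop itself. A minimal sketch, using only the API shown elsewhere on this page; the specific exception classes the library raises are not documented here, so a broad `except` is used purely for illustration:

```python
from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion_stream

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        # Retries apply only while the connection is being established.
        stream = await generate_chat_completion_stream(
            request={
                "provider": "anthropic",
                "api_key": context.variables.get("ANTHROPIC_API_KEY"),
                "model": "claude-haiku-4-5",
                "messages": [{"role": "user", "content": user_message}],
                "retry": {"max_retries": 3}
            },
            stream_options={"stream_sentences": True}
        )

        try:
            async for chunk in stream:
                if chunk.type == "content_sentence":
                    yield TextToSpeechEvent(text=chunk.sentence, voice="kore")
        except Exception:
            # Mid-stream failures are not retried by the library; handle them here,
            # e.g. by apologizing to the caller instead of replaying content.
            yield TextToSpeechEvent(
                text="Sorry, I lost my train of thought. Could you repeat that?",
                voice="kore"
            )
```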
Fallbacks#
Basic Fallback#
```python
from primfunctions.events import Event, StartEvent, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="I'll try Anthropic first, but I have OpenAI as a backup.",
            voice="kore"
        )

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "fallbacks": [
                {
                    "provider": "openai",
                    "api_key": context.variables.get("OPENAI_API_KEY"),
                    "model": "gpt-4.1-mini"
                }
            ]
        })
        # If Anthropic fails, automatically tries OpenAI

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )
```
Fallback Chain#
You can specify multiple fallbacks that are tried in order:
```python
from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "fallbacks": [
                {
                    "provider": "openai",
                    "api_key": context.variables.get("OPENAI_API_KEY"),
                    "model": "gpt-4.1-mini"
                },
                {
                    "provider": "google",
                    "api_key": context.variables.get("GEMINI_API_KEY"),
                    "model": "gemini-2.5-flash"
                }
            ]
        })
        # Tries: Anthropic -> OpenAI -> Google

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )
```
Partial Fallback Overrides#
Fallbacks only override specified fields. Unspecified fields inherit from the original request:
```python
response = await generate_chat_completion({
    "provider": "anthropic",
    "api_key": context.variables.get("ANTHROPIC_API_KEY"),
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": user_message}],
    "temperature": 0.7,
    "max_tokens": 1000,
    "fallbacks": [
        {
            "provider": "openai",
            "api_key": context.variables.get("OPENAI_API_KEY"),
            # model, messages, temperature, max_tokens inherited from original
        }
    ]
})
```
Streaming Fallbacks#
Fallbacks work with streaming too:
```python
from primfunctions.events import Event, StartEvent, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion_stream

async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="I'll stream responses with automatic fallback.",
            voice="kore"
        )

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        stream = await generate_chat_completion_stream(
            request={
                "provider": "anthropic",
                "api_key": context.variables.get("ANTHROPIC_API_KEY"),
                "model": "claude-haiku-4-5",
                "messages": [{"role": "user", "content": user_message}],
                "fallbacks": [
                    {
                        "provider": "openai",
                        "api_key": context.variables.get("OPENAI_API_KEY"),
                        "model": "gpt-4.1-mini"
                    }
                ]
            },
            stream_options={"stream_sentences": True, "clean_sentences": True}
        )

        # If Anthropic connection fails, automatically tries OpenAI
        async for chunk in stream:
            if chunk.type == "content_sentence":
                yield TextToSpeechEvent(
                    text=chunk.sentence,
                    voice="kore"
                )
            elif chunk.type == "response":
                complete_response = chunk.response
```
How Fallbacks Work#
- The library attempts the primary provider first
- If the primary provider fails (after retries), it tries the first fallback
- If that fails, it tries the next fallback, and so on
- If all providers fail, an exception is raised (see the sketch below)
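Because an exhausted fallback chain surfaces as an exception, a handler that needs a graceful degradation path can simply wrap the call. A minimal sketch; the specific exception class raised by the library isn't documented here, so a broad `except` is used for illustration:

```python
from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        try:
            response = await generate_chat_completion({
                "provider": "anthropic",
                "api_key": context.variables.get("ANTHROPIC_API_KEY"),
                "model": "claude-haiku-4-5",
                "messages": [{"role": "user", "content": user_message}],
                "fallbacks": [
                    {
                        "provider": "openai",
                        "api_key": context.variables.get("OPENAI_API_KEY"),
                        "model": "gpt-4.1-mini"
                    }
                ]
            })
        except Exception:
            # Reached only after the primary provider and every fallback have failed.
            yield TextToSpeechEvent(
                text="I'm having trouble reaching my language models right now.",
                voice="kore"
            )
            return

        if response.message.content:
            yield TextToSpeechEvent(text=response.message.content, voice="kore")
```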
Retries + Fallbacks#
Retries and fallbacks work together. Each provider (including fallbacks) will retry according to its retry configuration:
```python
from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        # First retries Anthropic 3 times, then tries OpenAI (also with retries)
        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "retry": {
                "max_retries": 3
            },
            "fallbacks": [
                {
                    "provider": "openai",
                    "api_key": context.variables.get("OPENAI_API_KEY"),
                    "model": "gpt-4.1-mini"
                    # Inherits retry config from original request
                }
            ]
        })

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )
```
Next Steps#
- Check out Examples for complete working examples
