Reliability: Retries & Fallbacks

The library includes built-in retry logic with exponential backoff and automatic fallback to alternative providers for handling failures.

Retries#

Basic Retry#

from primfunctions.events import Event, StartEvent, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion, RetryConfiguration

async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="I'll automatically retry if there are any temporary failures.",
            voice="kore"
        )

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "retry": RetryConfiguration(
                enabled=True,
                max_retries=3,
                retry_delay=1.0,        # Initial delay: 1 second
                backoff_multiplier=2.0  # Exponential: 1s, 2s, 4s
            )
        })

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )

Retry with Dictionary#

from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "retry": {
                "max_retries": 5,
                "retry_delay": 0.5,
                "backoff_multiplier": 1.5
            }
        })

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )

Retry Configuration Options#

  • enabled (bool): Enable retry (default: True)
  • max_retries (int): Maximum retry attempts (default: 3)
  • retry_delay (float): Initial delay in seconds (default: 1.0)
  • backoff_multiplier (float): Exponential backoff multiplier (default: 2.0)

Retry Behavior#

The retry logic follows exponential backoff:

  • Attempt 1: Immediate
  • Attempt 2: Wait retry_delay seconds
  • Attempt 3: Wait retry_delay * backoff_multiplier seconds
  • Attempt 4: Wait retry_delay * backoff_multiplier^2 seconds
  • And so on...

Example with default settings (retry_delay=1.0, backoff_multiplier=2.0):

  • Attempt 1: Immediate
  • Attempt 2: Wait 1 second
  • Attempt 3: Wait 2 seconds
  • Attempt 4: Wait 4 seconds
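
The wait before attempt n (for n >= 2) is therefore retry_delay * backoff_multiplier ** (n - 2). As a quick illustration, here is a minimal sketch of that calculation in plain Python; it is not part of the voicerun_completions API, just a restatement of the schedule above:

# Illustrative only: reproduces the backoff schedule described above.
def backoff_delay(attempt: int, retry_delay: float = 1.0, backoff_multiplier: float = 2.0) -> float:
    """Seconds to wait before the given attempt (attempt 1 is immediate)."""
    if attempt <= 1:
        return 0.0
    return retry_delay * backoff_multiplier ** (attempt - 2)

# With the defaults this prints:
# 1 0.0
# 2 1.0
# 3 2.0
# 4 4.0
for attempt in range(1, 5):
    print(attempt, backoff_delay(attempt))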

Streaming Retry Behavior#

Important: For streaming requests, retries only occur for initial connection failures. Once streaming begins, failures are not retried to prevent duplicate content:

# RETRIES (before streaming starts):
# - Connection failures
# - Network errors
# - Rate limits (429)
# - Server errors (500, 502, 503)
# - Authentication errors (401)
# - Timeout errors
#
# NO RETRY (after streaming starts):
# - Mid-stream failures raise exceptions immediately
# - This prevents duplicate content in real-time applications
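
If you need graceful degradation for mid-stream failures, you can wrap the consumption loop yourself. This is a minimal sketch, not behavior prescribed by the library; the broad except Exception is an assumption, so substitute the library's specific exception types if you know them:

from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion_stream

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        stream = await generate_chat_completion_stream(
            request={
                "provider": "anthropic",
                "api_key": context.variables.get("ANTHROPIC_API_KEY"),
                "model": "claude-haiku-4-5",
                "messages": [{"role": "user", "content": event.data.get("text", "N/A")}],
            },
            stream_options={"stream_sentences": True, "clean_sentences": True}
        )

        try:
            async for chunk in stream:
                if chunk.type == "content_sentence":
                    yield TextToSpeechEvent(text=chunk.sentence, voice="kore")
        except Exception:
            # Sentences already yielded have been spoken; don't replay them.
            yield TextToSpeechEvent(
                text="Sorry, I lost my train of thought. Could you say that again?",
                voice="kore"
            )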

Fallbacks#

Basic Fallback#

from primfunctions.events import Event, StartEvent, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="I'll try Anthropic first, but I have OpenAI as a backup.",
            voice="kore"
        )

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "fallbacks": [
                {
                    "provider": "openai",
                    "api_key": context.variables.get("OPENAI_API_KEY"),
                    "model": "gpt-4.1-mini"
                }
            ]
        })  # If Anthropic fails, automatically tries OpenAI

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )

Fallback Chain#

You can specify multiple fallbacks that are tried in order:

from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "fallbacks": [
                {
                    "provider": "openai",
                    "api_key": context.variables.get("OPENAI_API_KEY"),
                    "model": "gpt-4.1-mini"
                },
                {
                    "provider": "google",
                    "api_key": context.variables.get("GEMINI_API_KEY"),
                    "model": "gemini-2.5-flash"
                }
            ]
        })  # Tries: Anthropic -> OpenAI -> Google

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )

Partial Fallback Overrides#

Fallbacks only override specified fields. Unspecified fields inherit from the original request:

response = await generate_chat_completion({
    "provider": "anthropic",
    "api_key": context.variables.get("ANTHROPIC_API_KEY"),
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": user_message}],
    "temperature": 0.7,
    "max_tokens": 1000,
    "fallbacks": [
        {
            "provider": "openai",
            "api_key": context.variables.get("OPENAI_API_KEY"),
            # model, messages, temperature, max_tokens inherited from original
        }
    ]
})

Streaming Fallbacks#

Fallbacks work with streaming too:

from primfunctions.events import Event, StartEvent, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion_stream

async def handler(event: Event, context: Context):
    if isinstance(event, StartEvent):
        yield TextToSpeechEvent(
            text="I'll stream responses with automatic fallback.",
            voice="kore"
        )

    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        stream = await generate_chat_completion_stream(
            request={
                "provider": "anthropic",
                "api_key": context.variables.get("ANTHROPIC_API_KEY"),
                "model": "claude-haiku-4-5",
                "messages": [{"role": "user", "content": user_message}],
                "fallbacks": [
                    {
                        "provider": "openai",
                        "api_key": context.variables.get("OPENAI_API_KEY"),
                        "model": "gpt-4.1-mini"
                    }
                ]
            },
            stream_options={"stream_sentences": True, "clean_sentences": True}
        )  # If the Anthropic connection fails, automatically tries OpenAI

        async for chunk in stream:
            if chunk.type == "content_sentence":
                yield TextToSpeechEvent(
                    text=chunk.sentence,
                    voice="kore"
                )
            elif chunk.type == "response":
                complete_response = chunk.response

How Fallbacks Work#

  1. The library attempts the primary provider first
  2. If the primary provider fails (after retries), it tries the first fallback
  3. If that fails, it tries the next fallback, and so on
  4. If all providers fail, an exception is raised
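
If the entire chain fails, the exception from the final attempt propagates to your handler, where you can catch it and respond gracefully. The following is a minimal sketch, written as a fragment inside a handler like the examples above; the broad except Exception is an assumption, so use the library's specific exception types if it exposes them:

# Sketch: degrade gracefully when the primary provider and all fallbacks fail.
try:
    response = await generate_chat_completion({
        "provider": "anthropic",
        "api_key": context.variables.get("ANTHROPIC_API_KEY"),
        "model": "claude-haiku-4-5",
        "messages": [{"role": "user", "content": user_message}],
        "fallbacks": [
            {
                "provider": "openai",
                "api_key": context.variables.get("OPENAI_API_KEY"),
                "model": "gpt-4.1-mini"
            }
        ]
    })
    if response.message.content:
        yield TextToSpeechEvent(text=response.message.content, voice="kore")
except Exception:
    # All providers failed; fall back to a canned response.
    yield TextToSpeechEvent(
        text="Sorry, I'm having trouble right now. Please try again in a moment.",
        voice="kore"
    )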

Retries + Fallbacks#

Retries and fallbacks work together. Each provider (including fallbacks) will retry according to its retry configuration:

from primfunctions.events import Event, TextEvent, TextToSpeechEvent
from primfunctions.context import Context
from voicerun_completions import generate_chat_completion

async def handler(event: Event, context: Context):
    if isinstance(event, TextEvent):
        user_message = event.data.get("text", "N/A")

        # First retries Anthropic 3 times, then tries OpenAI (also with retries)
        response = await generate_chat_completion({
            "provider": "anthropic",
            "api_key": context.variables.get("ANTHROPIC_API_KEY"),
            "model": "claude-haiku-4-5",
            "messages": [{"role": "user", "content": user_message}],
            "retry": {
                "max_retries": 3
            },
            "fallbacks": [
                {
                    "provider": "openai",
                    "api_key": context.variables.get("OPENAI_API_KEY"),
                    "model": "gpt-4.1-mini"
                    # Inherits retry config from original request
                }
            ]
        })

        if response.message.content:
            yield TextToSpeechEvent(
                text=response.message.content,
                voice="kore"
            )
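
Since fallbacks only override the fields you specify (see Partial Fallback Overrides), a fallback entry should also be able to carry its own retry settings. The snippet below is a sketch based on that inheritance rule rather than a separately documented feature, so verify the per-fallback retry field against the library before relying on it:

# Sketch (assumed behavior): give the fallback its own retry settings by
# specifying "retry" in the fallback entry, per the partial-override rule.
response = await generate_chat_completion({
    "provider": "anthropic",
    "api_key": context.variables.get("ANTHROPIC_API_KEY"),
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": user_message}],
    "retry": {"max_retries": 3},
    "fallbacks": [
        {
            "provider": "openai",
            "api_key": context.variables.get("OPENAI_API_KEY"),
            "model": "gpt-4.1-mini",
            "retry": {"max_retries": 1}  # assumed override; other fields still inherit
        }
    ]
})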

Next Steps#

  • Check out Examples for complete working code