Message Reference

Complete reference for all client and server messages in the VoiceRun Transcribe WebSocket API.

Client Messages

session.update

Configure the transcription session. Send this after receiving session.created.

Field	Type	Required	Description
`session.model`	string	Yes	STT model (e.g., "nova-3", "gpt-4o-transcribe")
`session.provider`	string	Yes	Provider: "DEEPGRAM", "OPENAI", "CARTESIA", "QWEN3"
`session.language`	string	No	BCP-47 language code. Default: "en". Use "multi" for automatic language detection
`session.prompt`	string	No	Transcription hint (keywords for Deepgram, context prompt for OpenAI)
`session.input_audio_format`	string	No	Audio encoding: "pcm16" (default) or "mulaw"
`session.sample_rate`	integer	No	Sample rate in Hz. Default: 16000

{
  "type": "session.update",
  "session": {
    "model": "nova-3",
    "provider": "DEEPGRAM",
    "language": "en",
    "prompt": "insurance, policy, premium",
    "input_audio_format": "pcm16",
    "sample_rate": 16000
  }
}
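
As an illustration, the message above could be assembled in Python as follows. This is a minimal sketch that assumes only the fields listed in the table; the helper name `build_session_update` is hypothetical and not part of any VoiceRun SDK. It checks that the two required fields are present and leaves out unset optional fields so that the documented defaults apply.

```python
import json

# Optional fields from the table above; names mirror the wire format.
OPTIONAL_FIELDS = ("language", "prompt", "input_audio_format", "sample_rate")


def build_session_update(model: str, provider: str, **options) -> str:
    """Serialize a session.update message (hypothetical helper)."""
    if not model or not provider:
        raise ValueError("session.model and session.provider are required")
    unknown = set(options) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown session fields: {sorted(unknown)}")
    session = {"model": model, "provider": provider}
    # Include only the options the caller set, so server defaults apply otherwise.
    session.update({k: v for k, v in options.items() if v is not None})
    return json.dumps({"type": "session.update", "session": session})


message = build_session_update("nova-3", "DEEPGRAM", language="en", sample_rate=16000)
```

The resulting string can be sent as a text frame with whichever WebSocket client is in use.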

audio.append

Send an audio chunk for transcription. Audio must be base64-encoded.

Field	Type	Required	Description
`audio`	string	Yes	Base64-encoded audio bytes. Recommended chunk size: 20 ms of audio

{
  "type": "audio.append",
  "audio": "AAAA//8AAAEAAAD/////AAAB..."
}
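
The chunk size in bytes follows from the sample rate and encoding. The sketch below works through that arithmetic and wraps each chunk in an `audio.append` message. It assumes mono audio, which this reference does not state; the function names are hypothetical. Under that assumption, 20 ms of 16 kHz pcm16 audio is 320 samples, or 640 bytes.

```python
import base64
import json

# Bytes per sample for each documented encoding (pcm16 is 16-bit, mu-law is 8-bit).
BYTES_PER_SAMPLE = {"pcm16": 2, "mulaw": 1}


def chunk_size_bytes(sample_rate: int = 16000, audio_format: str = "pcm16",
                     chunk_ms: int = 20) -> int:
    """Bytes in one chunk of the given duration, assuming mono audio."""
    return sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE[audio_format]


def audio_append_messages(raw: bytes, size: int):
    """Yield one audio.append JSON message per chunk of raw audio bytes."""
    for start in range(0, len(raw), size):
        chunk = raw[start:start + size]
        yield json.dumps({
            "type": "audio.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        })


size = chunk_size_bytes()      # 16 kHz pcm16 mono, 20 ms
silence = bytes(size * 3)      # 60 ms of silence as stand-in audio
messages = list(audio_append_messages(silence, size))
```

The final chunk may be shorter than `size` when the input length is not a multiple of it; the sketch sends it as-is.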

session.close

Gracefully close the transcription session.

{
  "type": "session.close"
}

Server Messages

session.created

Sent immediately after the WebSocket connection is established.

{
  "type": "session.created",
  "session": {
    "id": "ts_a1b2c3d4e5f6"
  }
}

session.updated

Confirms that the session configuration has been applied.

{
  "type": "session.updated",
  "session": {
    "model": "nova-3",
    "provider": "DEEPGRAM",
    "language": "en",
    "sample_rate": 16000,
    "input_audio_format": "pcm16"
  }
}

speech.started

Sent when voice activity detection (VAD) detects the start of speech.

{
  "type": "speech.started"
}

speech.stopped

Sent when VAD detects the end of speech.

{
  "type": "speech.stopped"
}
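
A client might use the two VAD events to drive a "speaking" indicator. The sketch below is one way to track that state; the class name `SpeechActivity` is hypothetical, and the sketch assumes the two events arrive in started/stopped pairs, which this reference does not guarantee.

```python
import json


class SpeechActivity:
    """Track whether the server currently reports speech in progress."""

    def __init__(self):
        self.speaking = False
        self.segments_seen = 0  # number of completed started/stopped pairs

    def handle(self, raw: str) -> None:
        msg_type = json.loads(raw).get("type")
        if msg_type == "speech.started":
            self.speaking = True
        elif msg_type == "speech.stopped" and self.speaking:
            # Ignore a stop without a preceding start rather than miscount.
            self.speaking = False
            self.segments_seen += 1


activity = SpeechActivity()
activity.handle('{"type": "speech.started"}')
activity.handle('{"type": "speech.stopped"}')
```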

transcription.partial

Interim transcription result for speech still in progress. Sent repeatedly as more audio is received.

{
  "type": "transcription.partial",
  "text": "hello how are"
}

transcription.completed

Final transcription result for a complete speech segment.

{
  "type": "transcription.completed",
  "text": "hello how are you doing today",
  "language": "en"
}
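
Partial and completed results are typically handled differently on the client. The sketch below keeps finalized segments in a list and holds the latest partial separately. It assumes each `transcription.partial` carries the full interim text for the current segment and therefore replaces the previous partial; the examples here are consistent with that reading, but the reference does not state it. The class name `TranscriptBuffer` is hypothetical.

```python
import json


class TranscriptBuffer:
    """Collect finalized segments and keep the latest interim text separately."""

    def __init__(self):
        self.segments = []   # text from transcription.completed messages
        self.partial = ""    # latest transcription.partial text

    def handle(self, raw: str) -> None:
        msg = json.loads(raw)
        if msg["type"] == "transcription.partial":
            # Assumption: each partial supersedes the previous one.
            self.partial = msg["text"]
        elif msg["type"] == "transcription.completed":
            self.segments.append(msg["text"])
            self.partial = ""

    def display_text(self) -> str:
        """Finalized text followed by the in-progress partial, if any."""
        return " ".join(self.segments + ([self.partial] if self.partial else []))


buf = TranscriptBuffer()
buf.handle('{"type": "transcription.partial", "text": "hello how are"}')
buf.handle('{"type": "transcription.completed", '
           '"text": "hello how are you doing today", "language": "en"}')
```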

session.closed

Confirms the session has been closed.

{
  "type": "session.closed"
}
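
The session messages in this reference imply an order: `session.created`, then `session.update` and `session.updated`, then audio, then `session.close` and `session.closed`. A rough sketch of tracking that lifecycle on the client follows. The names `Phase` and `SessionLifecycle` are hypothetical, and waiting for `session.updated` before sending audio is a conservative assumption; the reference does not say whether earlier audio is accepted.

```python
import json
from enum import Enum


class Phase(Enum):
    CONNECTING = "connecting"   # socket open, waiting for session.created
    CREATED = "created"         # session.created received; send session.update
    READY = "ready"             # session.updated received; audio may be sent
    CLOSED = "closed"           # session.closed received


# Server message type -> phase the client enters on receipt.
TRANSITIONS = {
    "session.created": Phase.CREATED,
    "session.updated": Phase.READY,
    "session.closed": Phase.CLOSED,
}


class SessionLifecycle:
    def __init__(self):
        self.phase = Phase.CONNECTING
        self.session_id = None

    def on_server_message(self, raw: str) -> Phase:
        msg = json.loads(raw)
        if msg["type"] == "session.created":
            self.session_id = msg["session"]["id"]
        # Other message types (speech.*, transcription.*, error) keep the phase.
        self.phase = TRANSITIONS.get(msg["type"], self.phase)
        return self.phase

    def can_send_audio(self) -> bool:
        return self.phase is Phase.READY


life = SessionLifecycle()
life.on_server_message('{"type": "session.created", "session": {"id": "ts_a1b2c3d4e5f6"}}')
```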

error

Sent when an error occurs during the session.

{
  "type": "error",
  "error": {
    "code": "invalid_model",
    "message": "Unknown STT model: foo-bar"
  }
}

Error codes:

Code	Description
`invalid_message`	Malformed or unparseable message
`invalid_config`	Invalid session configuration
`invalid_model`	Unknown STT model
`invalid_provider`	Unknown or unsupported provider
`provider_error`	Error from the STT provider
`internal_error`	Internal server error
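
One way to surface these errors in client code is to convert the `error` message into an exception. The sketch below does that; `TranscribeError` and `raise_for_error` are hypothetical names. Treating the `invalid_*` codes as problems with the client's request is an interpretation of the descriptions above, not documented behavior.

```python
import json

# Codes listed in the table above.
KNOWN_ERROR_CODES = {
    "invalid_message", "invalid_config", "invalid_model",
    "invalid_provider", "provider_error", "internal_error",
}


class TranscribeError(Exception):
    """Raised for a server message of type "error" (hypothetical wrapper)."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message

    @property
    def is_client_fault(self) -> bool:
        # Assumption: invalid_* codes mean the request itself must change.
        return self.code.startswith("invalid_")

    @property
    def is_known_code(self) -> bool:
        return self.code in KNOWN_ERROR_CODES


def raise_for_error(raw: str) -> None:
    """Raise TranscribeError if the raw server message is an error; else return."""
    msg = json.loads(raw)
    if msg.get("type") == "error":
        err = msg["error"]
        raise TranscribeError(err["code"], err["message"])
```

Calling `raise_for_error` on every incoming message before dispatching it keeps error handling in one place.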