Declarative Resources

Anything under .voicerun/templates/ is rendered with Helm at vr release time and snapshotted onto the release manifest. Each YAML document declares one resource:

    apiVersion: voicerun/v1
    kind: Deployment | Simulation | Webhook | Evaluator
    metadata:
      name: <unique within the manifest>
    spec:
      ...

Preview rendered output with vr render; validate spec shape with vr validate. Values referenced as {{ .Values.foo }} come from .voicerun/values.yaml (and overlays); {{ .Agent.Name }} and friends come from .voicerun/agent.yaml. Secrets are referenced as {{ Secrets.organization.NAME }} — Helm leaves the placeholder intact and the API resolves it at session start.

metadata.name must be unique within a manifest for each kind. It's the handle used by other commands (e.g. vr simulate --name <…>, vr evaluation list --type <…>).
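To illustrate how values, agent metadata, and secret placeholders can sit together in one template, here is a minimal sketch. The file name, the values keys (region, stt.model), and the secret name API_TOKEN are assumptions made for illustration; only the {{ .Values.* }}, {{ .Agent.Name }}, and {{ Secrets.organization.NAME }} forms come from the text above.

```yaml
# .voicerun/templates/deployment.yaml (hypothetical file name)
apiVersion: voicerun/v1
kind: Deployment
metadata:
  # .Agent.Name is read from .voicerun/agent.yaml
  name: {{ .Agent.Name }}-deployment
spec:
  # Filled in from .voicerun/values.yaml plus any overlay
  region: {{ .Values.region }}
  stt:
    model: {{ .Values.stt.model }}
  variables:
    # Assumed secret name; the placeholder is left as written by rendering
    API_TOKEN: "{{ Secrets.organization.API_TOKEN }}"
```

Running vr render on a template like this should show the values substituted while the Secrets placeholder stays as written.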

Deployment

Runtime configuration for the agent in an environment. Every field is optional except kind/metadata.name — omitted fields take platform defaults. The mode field decides whether handler.py is required at the project root.

    apiVersion: voicerun/v1
    kind: Deployment
    metadata:
      name: my-agent-deployment
    spec:
      mode: coderunner  # 'coderunner' (handler.py sandbox) or 'relay'
      region: us-central1-a
      variables:
        LOG_LEVEL: info
        FEATURE_FLAG: "true"
      stt:
        model: flux-general-en
        language: en
        failover:
          model: nova-3
      turnTaking:
        mode: smart_turn
        externalEndpointing: 300
        smartTurnVadStopSecs: 0.4
        smartTurnStopSecs: 3.0
        smartTurnTimeout: 5.0
      tts:
        provider: cartesia
        model: sonic-2
        voice: lyric
        language: en
        speed: 1.0
      relay:
        url: wss://my-relay.example.com/ws/agent
      recording:
        enabled: false
        location: gs://my-bucket/recordings/
      redaction:
        enabled: false
      tracing:
        enabled: true

Top-level fields

| Field | Type | Description |
| --- | --- | --- |
| mode | coderunner or relay | Runtime mode. coderunner (default) runs your handler.py in the sandbox. relay runs in voicerun-relay; no handler.py is required and vr validate skips the handler check automatically. |
| region | string | Cluster region (e.g. us-central1-a). |
| variables | map<string, string> | Values injected into context.variables at session start. Merged with the org/agent-environment variable scopes. |
| relay | object | Relay endpoint config. Only used when mode: relay. |
| stt | object | Speech-to-text config. |
| turnTaking | object | Turn-taking strategy. Sibling of stt/tts because the signal can come from STT (provider EoT), raw audio (Silero VAD, Smart Turn V3), or, eventually, semantic analyzers. |
| tts | object | Text-to-speech config. |
| recording | object | Call recording config. |
| redaction | object | PII redaction applied to traces and session events. |
| tracing | object | Distributed tracing for the call pipeline. |

spec.relay

Only honored when mode: relay. Optional failover swaps to a backup relay endpoint on connection errors.

| Field | Type | Description |
| --- | --- | --- |
| url | string | Primary relay WebSocket URL (e.g. wss://relay.example.com/ws/agent). |
| failover.url | string | Backup relay WebSocket URL. |
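As a sketch of a relay-mode Deployment with a backup endpoint: the resource name and the backup URL are hypothetical, and the nesting of failover under relay is assumed from the dotted field name failover.url.

```yaml
apiVersion: voicerun/v1
kind: Deployment
metadata:
  name: my-agent-relay   # hypothetical name
spec:
  mode: relay            # no handler.py is required in this mode
  relay:
    url: wss://relay.example.com/ws/agent
    failover:
      # Hypothetical backup endpoint, used on connection errors
      url: wss://relay-backup.example.com/ws/agent
```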

spec.stt

| Field | Type | Description |
| --- | --- | --- |
| model | string | STT model identifier (e.g. flux-general-en, nova-3). |
| language | string | BCP-47 language code (e.g. en). |
| prompt | string | Prompt biasing for providers that support it. |
| filter | string | Provider-specific filter string. |
| endpointing | number | Provider-side endpointing silence threshold (ms). |
| audioInputDelay | number | Audio input delay (ms) before transcription starts. |
| noiseReductionType | string | Provider noise-reduction profile. |
| eot.threshold | number | Deepgram-style end-of-turn confidence threshold. |
| eot.timeoutMs | number | Hard cap on EoT detection (ms). |
| eot.eagerThreshold | number | Eager-EoT pre-confirmation threshold. |
| vad.mode | server_vad or semantic_vad | OpenAI Realtime / Qwen3 VAD mode. |
| vad.eagerness | auto, low, medium, or high | OpenAI semantic-VAD eagerness. |
| failover | object | Same shape as stt (minus failover). Used on provider errors. |
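The dotted names in the table (eot.threshold, eot.timeoutMs) presumably map to nested keys. The following sketch assumes that nesting, and the numeric values are illustrative only, not documented defaults.

```yaml
spec:
  stt:
    model: flux-general-en
    language: en
    eot:
      threshold: 0.7     # illustrative value; the expected scale is an assumption
      timeoutMs: 5000    # illustrative value (ms)
    failover:            # same shape as stt, without a nested failover
      model: nova-3
      language: en
```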

spec.turnTaking

| Field | Type | Description |
| --- | --- | --- |
| mode | provider, silero, or smart_turn | How turn boundaries are decided. provider uses the STT provider's built-in endpointing. silero runs local Silero VAD on the relay/agent. smart_turn runs Silero VAD + Smart Turn V3 ML. |
| externalEndpointing | number | Silence-stop threshold (ms) for silero / smart_turn. |
| smartTurnVadStopSecs | number | smart_turn only. VAD silence-stop window before the ML gates. Default 0.4. |
| smartTurnStopSecs | number | smart_turn only. ML's per-window timeout. Default 3.0. |
| smartTurnTimeout | number | smart_turn only. Hard cap on the ML's running window. Default 5.0. |
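For comparison with the smart_turn settings in the Deployment example, a silero-only configuration might look like the fragment below. The threshold value is illustrative, and the smartTurn* fields are left out because the table marks them as smart_turn only.

```yaml
spec:
  turnTaking:
    mode: silero
    externalEndpointing: 500   # ms of silence before the turn ends; illustrative value
```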

spec.tts

| Field | Type | Description |
| --- | --- | --- |
| provider | string | TTS provider (e.g. cartesia, elevenlabs). |
| model | string | Provider-specific model id (e.g. sonic-2). |
| voice | string | Provider-specific voice id (e.g. lyric). |
| language | string | BCP-47 language code. |
| speed | number | Speech rate (provider-specific scale). |
| failover | object | Same shape as tts (minus failover). Used on provider errors. |
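A TTS failover block could be sketched as follows. The angle-bracket entries are placeholders rather than real model or voice ids, since the source does not list identifiers for the second provider.

```yaml
spec:
  tts:
    provider: cartesia
    model: sonic-2
    voice: lyric
    language: en
    failover:                         # used on provider errors
      provider: elevenlabs
      model: <elevenlabs-model-id>    # placeholder, not a real id
      voice: <elevenlabs-voice-id>    # placeholder, not a real id
```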

spec.recording

| Field | Type | Description |
| --- | --- | --- |
| enabled | boolean | Record the call audio. Default false. |
| location | string | Storage URI override (e.g. gs://my-bucket/recordings/). Leave unset to use the platform default. |

spec.redaction

| Field | Type | Description |
| --- | --- | --- |
| enabled | boolean | Apply PII redaction to traces and session events. Default false. |

spec.tracing

| Field | Type | Description |
| --- | --- | --- |
| enabled | boolean | Emit distributed traces for the call pipeline. Default true. |

Simulation

A simulated caller used by vr simulate. The CLI submits the simulation's metadata.name; the API resolves spec from the active release's manifest, so the version that runs is always the released one, not whatever is on disk.

    apiVersion: voicerun/v1
    kind: Simulation
    metadata:
      name: happy-path
    spec:
      systemPrompt: |
        You are a customer calling this agent.
        Keep turns short and realistic.
      numberOfSimulations: 5
      provider: gemini_live
      model: gemini-3.1-flash-live-preview
      voice: Aoede
      phases:
        - type: ring
          durationSecs: 8
        - type: message
          text: "Thank you for calling. All representatives are busy. Please hold."
        - type: holdMusic
          durationSecs: 30
      loopPhases: true
      humanPickupAfterSecs: 90

Top-level fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| systemPrompt | string | yes | Prompt driving the simulated user's behavior. |
| numberOfSimulations | integer (1-100) | no | Number of simulated sessions spawned per vr simulate invocation. |
| provider | gemini_live or openai_realtime | no | Persona engine vendor. Defaults to gemini_live. |
| model | string | no | Provider-specific model id. For gemini_live: e.g. gemini-3.1-flash-live-preview (default). For openai_realtime: e.g. gpt-realtime (default), gpt-realtime-mini. |
| voice | string | no | Provider-specific voice id. For gemini_live: Aoede, Puck, Charon, Kore, Fenrir, etc. For openai_realtime: alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, cedar. |
| phases | PhaseSpec[] | no | Pre-pickup phase script; see spec.phases. |
| loopPhases | boolean | no | When phases is non-empty, restart the phase list when it ends. Default true. |
| humanPickupAfterSecs | integer (0-600) | no | Seconds of phase playback before the persona takes over. Omit to never auto-pickup (tests the agent's give-up logic). |
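A sketch of a Simulation that exercises the agent's give-up logic by never picking up, using the openai_realtime provider. The resource name and prompt text are hypothetical; the model and voice ids are taken from the table above.

```yaml
apiVersion: voicerun/v1
kind: Simulation
metadata:
  name: never-answered   # hypothetical name
spec:
  systemPrompt: |
    You are a customer receiving a call from this agent.
  provider: openai_realtime
  model: gpt-realtime-mini
  voice: marin
  phases:
    - type: ring
      durationSecs: 30
  loopPhases: true       # the ring phase restarts each time it ends
  # humanPickupAfterSecs is omitted on purpose, so the persona never takes over
```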

spec.phases

Phase entries play before the simulated persona starts speaking. This is useful for warm-transfer testing, where the outbound leg waits through ringing/queue/IVR before someone "answers".

| type | Fields | Description |
| --- | --- | --- |
| ring | durationSecs (int, 1-600) | North-American ringback tone (440+480 Hz, 2s on / 4s off). |
| holdMusic | durationSecs (int, 1-600) | Looping arpeggio that reads as hold music to a VAD. |
| message | text (non-empty string) | Pre-rendered automated-IVR speech (Gemini TTS). |
| ivrMenu | prompt, options, optional timeoutSecs / maxRepeats / onNoInput / onInvalid | Recursive IVR menu node. See below. |

ivrMenu phase

| Field | Type | Description |
| --- | --- | --- |
| prompt | string | The IVR prompt the simulator plays. |
| options | map<DTMF digit, PhaseSpec[]> | Single-character keys (0-9, *, #) mapped to the sub-phases that fire when the agent sends that digit. Phases can themselves be ivrMenu entries for multi-level trees. |
| timeoutSecs | integer (1-120) | Seconds to wait for a digit after the prompt finishes. Default 8. |
| maxRepeats | integer (0-10) | Additional re-prompts when no digit arrives. Default 2. |
| onNoInput | PhaseSpec[] | Phases to run after maxRepeats re-prompts produce no input. |
| onInvalid | PhaseSpec[] | Phases to run when the agent sends a digit not in options. |
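A two-level menu built from these fields might look like the sketch below. The prompts and messages are invented for illustration, and quoting the digit keys as strings is an assumption about how the options map is parsed.

```yaml
spec:
  systemPrompt: |
    You are a support representative answering a transferred call.
  phases:
    - type: ivrMenu
      prompt: "For sales, press 1. For support, press 2."
      timeoutSecs: 8          # matches the documented default
      maxRepeats: 2           # matches the documented default
      options:
        "1":
          - type: message
            text: "Connecting you to sales."
        "2":
          - type: ivrMenu     # nested menu forms the second level
            prompt: "For billing, press 1. For technical help, press 2."
            options:
              "1":
                - type: holdMusic
                  durationSecs: 20
              "2":
                - type: message
                  text: "Technical support is currently closed."
      onNoInput:
        - type: message
          text: "No input received. Goodbye."
      onInvalid:
        - type: message
          text: "That is not a valid option."
  humanPickupAfterSecs: 60
```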

Webhook

Session-end webhook delivery configuration. The destination URL must be http(s).

    apiVersion: voicerun/v1
    kind: Webhook
    metadata:
      name: my-agent-webhook
    spec:
      url: https://example.com/voicerun/session-webhook
      events:
        - session.ended
      signingToken: "{{ Secrets.organization.WEBHOOK_SIGNING_TOKEN }}"

Top-level fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string (http/https URL) | yes | Destination URL. |
| events | string[] | yes | Event triggers this webhook listens for. Only session.ended is supported today; the list is intentionally small so adding a new event requires explicit code review. |
| signingToken | string | no | HMAC-SHA256 signing token for outgoing deliveries. Typically supplied via {{ Secrets.organization.NAME }} and resolved at consume time. When absent, the worker sends an unsigned request. |
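Because templates are rendered with Helm, the Webhook resource can plausibly be made conditional on a value such as webhook.url. The sketch below assumes standard Helm flow control and the quote function are available in this rendering step; the file name is hypothetical.

```yaml
# .voicerun/templates/webhook.yaml (hypothetical file name)
{{- if .Values.webhook.url }}
apiVersion: voicerun/v1
kind: Webhook
metadata:
  name: {{ .Agent.Name }}-webhook
spec:
  # Rendered only when webhook.url is set to a non-empty value
  url: {{ .Values.webhook.url | quote }}
  events:
    - session.ended
  signingToken: "{{ Secrets.organization.WEBHOOK_SIGNING_TOKEN }}"
{{- end }}
```

With webhook.url left null, the if block evaluates to false and no Webhook document is emitted.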

Evaluator

Scores or extracts data from a session after it completes. Results are surfaced through vr evaluation list and vr evaluation info.

    apiVersion: voicerun/v1
    kind: Evaluator
    metadata:
      name: resolution-judge
    spec:
      title: Resolution Judge
      evalType: judge
      targetFormat: transcript
      systemPrompt: |
        Score the session 1-5 on whether the agent resolved the caller's request.
        Respond with a JSON object matching the response schema.
      responseSchema:
        type: object
        properties:
          score: { type: integer, minimum: 1, maximum: 5 }
          reasoning: { type: string }
        required: [score, reasoning]
      successCriteria:
        score: { ">=": 4 }
      apiProvider: openai
      model: gpt-4o

Common fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | yes | Human-readable title (shown in evaluation listings). |
| evalType | judge or extraction | yes | Whether this evaluator scores a session against criteria (judge) or extracts structured data from it (extraction). |
| targetFormat | events or transcript | no | What the evaluator sees. events passes the raw session-event stream; transcript passes a flattened user/agent transcript. |
| apiProvider | string | no | LLM provider (e.g. openai, anthropic). |
| model | string | no | Model id within the provider (e.g. gpt-4o). |

judge evaluators

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| systemPrompt | string | yes | Instructions for the judge model. |
| responseSchema | object | yes | JSON schema describing the judge's structured response. |
| successCriteria | object | yes | Criteria evaluated against the judge's response; drives the success flag on the resulting Evaluation. |

extraction evaluators

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| systemPrompt | string | yes | Instructions describing what to extract. |
| responseSchema | object | no | Optional JSON schema constraining the extracted payload. |
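For contrast with the judge example, an extraction evaluator might be declared as in the sketch below. The name, title, prompt, and schema properties are hypothetical; only the field names come from the tables above.

```yaml
apiVersion: voicerun/v1
kind: Evaluator
metadata:
  name: call-summary   # hypothetical name
spec:
  title: Call Summary
  evalType: extraction
  targetFormat: transcript
  systemPrompt: |
    Extract the caller's stated reason for calling and any order number mentioned.
  responseSchema:      # optional for extraction evaluators
    type: object
    properties:
      reason: { type: string }
      orderNumber: { type: string }
    required: [reason]
  apiProvider: openai
  model: gpt-4o
```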

Values and Secrets

Values files

.voicerun/values.yaml is the base values file used by Helm. Per-environment overlays (e.g. prod.yaml, staging.yaml) live alongside it and are pulled in with --values prod.yaml on vr release, vr render, or vr simulate.

    # .voicerun/values.yaml
    variables:
      LOG_LEVEL: info
    region: us-central1-a
    stt:
      model: flux-general-en
      language: en
    tts:
      provider: cartesia
      model: sonic-2
      voice: lyric
    recording:
      enabled: false
    tracing:
      enabled: true
    webhook:
      url: null  # leave null to skip the Webhook resource entirely
    simulation:
      numberOfSimulations: 1
    evaluator:
      apiProvider: openai
      model: gpt-4o
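An overlay typically needs only the keys it changes. The following prod.yaml is a sketch that assumes the usual Helm behavior of overlay values overriding the base file key by key; the specific overrides are illustrative.

```yaml
# .voicerun/prod.yaml (overlay, applied with --values prod.yaml)
variables:
  LOG_LEVEL: warning
recording:
  enabled: true
webhook:
  url: https://example.com/voicerun/session-webhook   # non-null, so the Webhook is rendered
simulation:
  numberOfSimulations: 5
```

Keys not mentioned here (stt, tts, tracing, evaluator) would keep their values from .voicerun/values.yaml under that assumption.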

Secret placeholders

Anywhere a string value lands in the rendered manifest, you can reference an organization secret as:

signingToken: "{{ Secrets.organization.WEBHOOK_SIGNING_TOKEN }}"

Helm leaves the placeholder intact through rendering. The API resolves it against organization secrets at session start, so secrets never round-trip through the release record itself. Create secrets with vr create secret.

Validation

Both vr validate and vr render run shape-only validation against rendered manifests. The validator checks that:

  • Each document has a known kind (Deployment, Simulation, Webhook, Evaluator).
  • spec contains only the allowed top-level keys for that kind.
  • Required fields are present (e.g. Webhook.spec.url, Evaluator.spec.systemPrompt for judge).
  • Bounded fields are in range (e.g. numberOfSimulations is 1-100).
  • metadata.name is unique per kind within a manifest.

Validator-level checks don't hit the database, so they don't catch missing organization secrets or unknown providers — those are surfaced at vr release time when the API processes the manifest.
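To make the shape checks concrete, each document in the sketch below violates one of the rules listed above. The resource names and the stray key are invented, and the exact error text produced by vr validate is not shown because it is not documented here.

```yaml
# Out-of-range bounded field
apiVersion: voicerun/v1
kind: Simulation
metadata:
  name: too-many
spec:
  systemPrompt: "You are a caller."
  numberOfSimulations: 500   # allowed range is 1-100
---
# Missing required field
apiVersion: voicerun/v1
kind: Webhook
metadata:
  name: missing-url
spec:                        # spec.url is required but absent
  events:
    - session.ended
---
# Unknown top-level spec key
apiVersion: voicerun/v1
kind: Deployment
metadata:
  name: extra-key
spec:
  mode: relay
  colour: blue               # not an allowed top-level key for Deployment
```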
