The two transport paths
The transport path is the layer that owns the carrier trunk, the telephony codec negotiation, and the audio media path. Two transports are in production:
Vendor-orchestrated transport
The original transport. A telephony-orchestration vendor owns the carrier trunk and the
speech-vendor integrations, and forwards each turn to Vorel’s agent over an LLM-proxy endpoint.
This is the transport Vorel shipped first; it remains in use for tenants that have not migrated
to the direct transport.
Direct transport (twilio-direct)
The direct-control transport. The carrier provides the media stream; Vorel’s voice-ws service
receives the raw audio frames, runs speech recognition and text-to-speech inside Vorel
infrastructure, and ships the synthesized audio back to the carrier. Vorel owns more of the
latency budget and more of the failure modes. This is the transport Skyline and other production
tenants run on today.
Direct transport architecture
Vendor-orchestrated transport architecture
Why chained, not speech-to-speech
Vorel deliberately runs a chained pipeline rather than a single speech-to-speech model. Chaining keeps each stage discrete and swappable, which is what makes the agent reliably call tools, ground its answers against your catalog and knowledge base, and enforce per-tenant guardrails on every turn. A single end-to-end speech model would couple recognition, reasoning, and synthesis to one vendor on the most caller-facing surface, a lock-in tradeoff Vorel has chosen not to take. The chained path is the load-bearing default and the architecture the latency work targets (the per-turn sub-2-second mandate is met on the chained stack).
Per-tenant voice_provider setting
Each tenant carries a voice_provider setting that selects the transport path:
| voice_provider value | Transport path |
|---|---|
| vendor-orchestrated | Vendor-orchestrated transport (legacy) |
| twilio-direct | Direct transport (chained) |
The setting is stored in tenants.voice_provider and is operator-flippable. Changing it requires a tenant-scoped cutover protocol (next section); operators do not flip it without running the protocol first.
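As a rough illustration of the mapping in the table, the sketch below resolves a tenant's voice_provider value to a transport path. Only the two setting values come from this page; the `Transport` enum and the `resolve_transport` helper are hypothetical names, and rejecting unknown values is an assumption rather than documented behavior.

```python
from enum import Enum


class Transport(Enum):
    """Transport paths keyed by their voice_provider value (hypothetical type)."""

    VENDOR_ORCHESTRATED = "vendor-orchestrated"  # legacy path
    TWILIO_DIRECT = "twilio-direct"              # direct, chained path


def resolve_transport(voice_provider: str) -> Transport:
    """Map a tenant's voice_provider value to a transport path.

    Hypothetical helper: it raises on an unknown value instead of silently
    falling back to a default transport (an assumption, not documented).
    """
    try:
        return Transport(voice_provider)
    except ValueError:
        raise ValueError(f"unknown voice_provider: {voice_provider!r}") from None
```

Failing loudly on an unrecognized value keeps a mistyped setting from routing a tenant onto the wrong path.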
Eval-gate at swap
Any transport swap on a production tenant runs the eval-gate before commit. The eval-gate asserts the new configuration does not regress against the tenant’s existing quality bar. The gate runs three checks against a 30-conversation eval set drawn from the tenant’s historical traffic:
- Outcome correctness regression. The classifier outcomes on the eval set must match the existing pipeline’s outcomes within a tolerance threshold.
- First-token-to-speech p95 regression. The new pipeline’s p95 first-token-to-speech must be within 200ms of the existing pipeline’s p95.
- Barge-in success rate regression. The new pipeline’s barge-in handling must not regress more than 5% from the existing pipeline.
Each eval-gate run is recorded in audit_log so the procurement signal “pipeline cutovers run quality regression checks” is auditable.
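The three checks can be sketched as a single comparison over precomputed metrics. This is a minimal illustration, not the production gate: `PipelineMetrics` and `run_eval_gate` are hypothetical names, the outcome tolerance is left as a required parameter because this page does not state its value, the 200ms bound is read as "no more than 200ms slower", and the 5% barge-in bound is read as absolute percentage points. All of these readings are assumptions.

```python
from dataclasses import dataclass

FTTS_P95_BUDGET_MS = 200.0      # from the gate definition above
BARGE_IN_MAX_REGRESSION = 0.05  # 5%, read as absolute percentage points (assumption)


@dataclass(frozen=True)
class PipelineMetrics:
    """Metrics for one pipeline over the eval set (hypothetical shape)."""

    outcomes: list[str]           # classifier outcome per eval conversation
    ftts_p95_ms: float            # first-token-to-speech p95 in milliseconds
    barge_in_success_rate: float  # fraction in [0, 1]


def run_eval_gate(
    existing: PipelineMetrics,
    candidate: PipelineMetrics,
    outcome_tolerance: float,  # maximum allowed mismatch fraction; value not documented here
) -> dict[str, bool]:
    """Return a pass/fail flag for each of the three regression checks."""
    if not existing.outcomes or len(existing.outcomes) != len(candidate.outcomes):
        raise ValueError("both pipelines must be scored on the same non-empty eval set")

    mismatches = sum(a != b for a, b in zip(existing.outcomes, candidate.outcomes))
    mismatch_rate = mismatches / len(existing.outcomes)

    return {
        "outcome_correctness": mismatch_rate <= outcome_tolerance,
        # Only a slowdown counts as a regression; a faster candidate passes.
        "ftts_p95": candidate.ftts_p95_ms - existing.ftts_p95_ms <= FTTS_P95_BUDGET_MS,
        # Only a drop in success rate counts as a regression.
        "barge_in": existing.barge_in_success_rate - candidate.barge_in_success_rate
        <= BARGE_IN_MAX_REGRESSION,
    }


def gate_passes(checks: dict[str, bool]) -> bool:
    """The swap may commit only if every check passed."""
    return all(checks.values())
```

Returning the per-check flags rather than a single boolean leaves room to record which check failed alongside the overall result.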
Shadow-mode-before-cutover
The voice pipeline cutover runbook requires every direct-transport tenant to run in shadow mode against the existing transport for 7 days before live traffic is flipped. Shadow mode means:
Live traffic continues on the existing transport
The tenant’s customers continue to receive responses from the existing transport. Customer-facing
behavior is unchanged.
Each turn is also dispatched to the candidate transport
For every live turn, voice-ws also dispatches the same input through the candidate transport
(twilio-direct). The candidate transport’s reply is captured but not returned to the customer.
Shadow comparisons land in `shadow_dispatch_comparisons`
The two replies are recorded side-by-side in a class-(b) telemetry table (retention 30 days per
ADR 0019). The candidate’s reply text, latency, and tool-call sequence are compared against the
live transport’s.
Operator reviews after 7 days
The operator inspects the shadow comparison data at the operator-side surface. If the candidate
transport matches the live transport within the acceptance threshold, the eval-gate is run for
final confirmation, and the cutover proceeds.
The shadow-comparison and migration scaffolding runs offline against recorded fixtures today. It
is the discipline that gates a cutover, not a claim that live shadow traffic is being compared in
production right now.
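The shadow steps above can be sketched as a single dispatch function that works equally well over recorded fixtures. Everything named here (`Reply`, `ShadowComparison`, `shadow_dispatch`, the `record` callback) is hypothetical and does not reflect the actual voice-ws code or the real columns of `shadow_dispatch_comparisons`. Swallowing candidate failures so they cannot affect the caller is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Reply:
    text: str
    latency_ms: float
    tool_calls: tuple[str, ...]  # ordered tool-call sequence for the turn


@dataclass(frozen=True)
class ShadowComparison:
    """One side-by-side row (hypothetical shape, not the real table schema)."""

    live: Reply
    candidate: Reply
    text_matches: bool
    tool_calls_match: bool
    latency_delta_ms: float  # candidate minus live; negative means candidate was faster


def shadow_dispatch(
    turn_input: str,
    live: Callable[[str], Reply],
    candidate: Callable[[str], Reply],
    record: Callable[[ShadowComparison], None],
) -> Reply:
    """Serve the turn from the live transport and record the candidate's reply."""
    live_reply = live(turn_input)
    try:
        candidate_reply = candidate(turn_input)
    except Exception:
        # Assumption: a failing candidate must never change customer-facing behavior.
        return live_reply

    record(
        ShadowComparison(
            live=live_reply,
            candidate=candidate_reply,
            text_matches=live_reply.text == candidate_reply.text,
            tool_calls_match=live_reply.tool_calls == candidate_reply.tool_calls,
            latency_delta_ms=candidate_reply.latency_ms - live_reply.latency_ms,
        )
    )
    # Only the live reply is ever returned to the customer.
    return live_reply
```

Exact text equality is the simplest possible comparison; a real acceptance threshold would likely need a looser notion of matching replies.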
Tenant-side visibility
Tenants do not see the underlying transport directly; the dashboard surfaces conversation outcomes, not transport-layer detail. However, the eval-gate audit rows are visible on the tenant-side audit surface, so a tenant admin can see that on a given date their voice transport was cut over and the cutover passed the configured quality threshold.
What does NOT change across transports
- The reasoning loop. Both transports run the same chained router → sub-agent → tool loop. The agent-level behavior is transport-independent.
- The tool layer. The same JWT-authed /api/tools/* routes serve both transports. A tenant’s tool calls, CRM writes, and guardrails are transport-independent.
- The CRM-as-SoR architectural commitment. Class-(c) writes go through the same CRM-write wrapper regardless of transport; retention windows are the same.
- Audit logging. Every turn, every tool call, every cutover, every override lands in audit_log regardless of transport.
- Per-tenant guardrails. Forbidden phrases, hallucination thresholds, handoff rules apply identically.
- Per-vertical packs. The qualification slots, the persona, the FAQ retrieval, and the booking handlers are all transport-agnostic.
Related docs
- Product → Voice: voice agent capabilities, per-vertical qualification rules, billing model.
- Getting started → How it works: full architecture, including the dispatch pipeline and tool layer.
- Security → Data retention: shadow_dispatch_comparisons 30-day window, voice_turn_latency 90-day window.