Realtime TTS 1.5 Max
Realtime voice-agent candidate; add API metadata, pricing, and CodeSOTA hard-text run.
These models are tracked for upcoming CodeSOTA runs. They are not ranked against measured rows, and they do not inherit vendor MOS until a source and verification tier are attached.
Realtime voice-agent candidate from the same Inworld model family; verify latency and pricing.
Google frontier TTS model with audio tags and broad language coverage; add exact API model ID.
ElevenLabs latest expressive synthesis model; needs shared-prompt samples and control-following run.
Stable May 2026 Cartesia snapshot; verify transcript following, language quality, and p95 latency.
OpenAI current TTS model for prompt-controlled realtime speech; add voices, latency, and cost.
Contextual TTS candidate; verify public access, cloning claims, and language coverage.
High-fidelity TTS candidate; verify emotion control, languages, and commercial terms.
Candidate with explicit controls for pitch, whisper, speed, duration, and tags.
Open multilingual speech model; test on hard text and code-switching.
Qwen speech generation candidate; verify language breadth and licensing.
Open-weight TTS candidate; verify artifacts and serving path.
Small-model edge candidate for browser/mobile deployment.
Control stress test for exaggeration, CFG, temperature, and reference audio.
Expressive long-form/dialogue candidate; watch for drift and repetitions.
Narration and character-voice candidate.
Scene/dialogue TTS candidate; needs artifact-backed eval.
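Several of the notes above call for latency verification, including p95 latency for the Cartesia snapshot. As a minimal sketch only, assuming per-request latencies have already been collected in milliseconds, a p95 figure could be computed with the nearest-rank method as below. The function name `p95_latency_ms` and the sample values are hypothetical and are not part of any CodeSOTA harness or vendor API.

```python
def p95_latency_ms(samples_ms):
    """Return the 95th-percentile latency using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical latency measurements in milliseconds (illustrative, not measured data).
latencies = [120, 135, 128, 410, 131, 126, 140, 133, 129, 137]
print(p95_latency_ms(latencies))
```

Nearest-rank always returns an observed sample, so with small sample counts a single slow request dominates the result; a real run would likely need many more requests per model before the p95 is stable.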