| ElevenLabs Turbo v2.5 | ElevenLabs | Cloud API | vendor reported | Proprietary (diffusion-based) | — | 4.8 | within MOS noise | 2024 |
| Sesame CSM | Sesame | Open Source | community reported | Conversational Speech Model | 1B+ | 4.7 | within MOS noise | 2025 |
| OpenAI TTS HD | OpenAI | Cloud API | vendor reported | Proprietary | — | 4.7 | within MOS noise | 2023 |
| Gemini 2.5 Pro TTS | Google | Cloud API | vendor reported | Multimodal LLM (native audio) | — | 4.7 | within MOS noise | 2025 |
| Cartesia Sonic 2 | Cartesia | Cloud API | vendor reported | State-space model | — | 4.7 | within MOS noise | 2025 |
| ElevenLabs Flash v2.5 | ElevenLabs | Cloud API | vendor reported | Proprietary (optimized) | — | 4.6 | reported MOS; no CodeSOTA CI | 2025 |
| PlayHT 3.0 | PlayHT | Cloud API | vendor reported | Proprietary | — | 4.6 | reported MOS; no CodeSOTA CI | 2025 |
| Fish Audio S2 Pro | Fish Audio | Open Source | paper reported | Dual-autoregressive transformer + RVQ audio codec | 5B | 4.6 | reported MOS; no CodeSOTA CI | 2026 |
| Orpheus TTS | Canopy Labs | Open Source | community reported | LLM-based (Llama backbone) | 3B | 4.6 | reported MOS; no CodeSOTA CI | 2025 |
| Gemini 2.5 Flash TTS | Google | Cloud API | vendor reported | Multimodal LLM (native audio) | — | 4.5 | reported MOS; no CodeSOTA CI | 2025 |
| Kokoro v1.0 | Hexgrad | Open Source | codesota measured | StyleTTS 2 + ISTFTNet (lightweight, non-autoregressive) | 82M | 4.5 | no CI yet; measured run exposes sample count and artifacts | 2025 |
| XTTS v2 | Coqui | Open Source | paper reported | GPT-like + VITS decoder | 467M | 4.5 | reported MOS; no CodeSOTA CI | 2024 |
| Google Chirp 3 HD | Google | Cloud API | vendor reported | Generative (USM-based) | — | 4.4 | reported MOS; no CodeSOTA CI | 2025 |
| Gradium TTS | Gradium | Cloud API | codesota measured | Proprietary neural TTS | — | 4.4 | no CI yet; measured run exposes sample count and artifacts | 2026 |
| Fish Speech 1.5 | Fish Audio | Open Source | community reported | VQGAN + Transformer | 500M | 4.4 | reported MOS; no CodeSOTA CI | 2025 |
| F5-TTS | Shanghai Jiao Tong University | Open Source | paper reported | Flow-matching (non-autoregressive) | 335M | 4.4 | reported MOS; no CodeSOTA CI | 2024 |
| Dia 1.6B | Nari Labs | Open Source | community reported | Transformer + non-verbal tokens | 1.6B | 4.3 | reported MOS; no CodeSOTA CI | 2025 |
| Spark-TTS | SparkAudio | Open Source | community reported | Controllable Transformer | 500M | 4.3 | reported MOS; no CodeSOTA CI | 2025 |
| Supertonic 3 | Supertone | Open Source | community reported | ONNX Runtime local inference | 99M | 4.2 | reported MOS; no CodeSOTA CI | 2026 |
| Parler-TTS | Hugging Face | Open Source | paper reported | Prompt-controlled Transformer | 880M | 4.1 | reported MOS; no CodeSOTA CI | 2025 |
| Piper | Rhasspy | Open Source | community reported | VITS (lightweight) | ~20M | 3.6 | reported MOS; no CodeSOTA CI | 2023 |