MuST-C is a multilingual speech translation corpus built from TED talks. Its English-German tst-COMMON split is the de facto benchmark for end-to-end speech translation, and BLEU on tst-COMMON is the primary metric.
BLEU is the reported evaluation metric for MuST-C En-De tst-COMMON. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | SeamlessM4T v2 Large | paper | 37.1 | 2026 | Source ↗ |
| 02 | Whisper Large v2 | paper | 29 | 2026 | Source ↗ |
| 03 | Fairseq S2T (MuST-C) | paper | 22.7 | 2026 | Source ↗ |
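For readers unfamiliar with how a BLEU score like those above is computed, here is a minimal sketch of corpus-level BLEU: the geometric mean of clipped 1- to 4-gram precisions, multiplied by a brevity penalty. It assumes plain whitespace tokenization, a single reference per segment, and no smoothing. Published results are normally produced with a standard tool and its own tokenizer, so the numbers from this sketch are not comparable to the scores in the table. The function names `ngrams` and `corpus_bleu` are illustrative, not part of any library.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Assumptions: whitespace tokenization, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each hypothesis n-gram count
            # at its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, a zero precision at any order gives BLEU 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Calling `corpus_bleu(system_outputs, reference_translations)` with two equal-length lists of strings returns a single score for the whole test set. Because the match counts are pooled across all segments before the precisions are taken, this is not the same as averaging per-sentence scores.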