Speech Translation · 2019 · en

MuST-C English-German tst-COMMON

MuST-C is a multilingual speech translation corpus built from TED talks. Its English-German tst-COMMON split is the de facto benchmark for end-to-end speech translation, and BLEU on tst-COMMON is the primary metric.

Current State of the Art

SeamlessM4T v2 Large

Meta AI

37.1

bleu

MuST-C En-De tst-COMMON — bleu

3 results · 1 SOTA advance · higher is better


bleu Progress Over Time

Showing 3 breakthroughs from Oct 2020 to Dec 2023

[Chart: bleu (y-axis, 21.3 to 38.5) against Date (x-axis, Oct 2020 to Dec 2023)]

Key Milestones

Oct 2020
Fairseq S2T (MuST-C)

Fairseq S2T Transformer baseline on MuST-C En-De tst-COMMON. Seed value, to be verified.

22.7
Dec 2022
Whisper Large-v2

Whisper Large-v2 zero-shot speech translation, MuST-C En-De. Seed value, to be verified.

29.0
+27.8%
Dec 2023
SeamlessM4T v2 Large (Current SOTA)

SeamlessM4T v2 Large, MuST-C En-De tst-COMMON BLEU. Seed value, to be verified.

37.1
+27.9%
Total Improvement: 63.4%
Time Span: 3y 3m
Breakthroughs: 3
Current SOTA: 37.1
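The step-by-step and total improvement percentages above appear to be relative gains over the previous (or first) milestone score. A minimal sketch of that arithmetic follows, assuming the formula (new - old) / old and rounding to one decimal place; the helper name relative_gain is illustrative and not taken from the page.

```python
# Milestone BLEU scores as listed on this page (MuST-C En-De tst-COMMON).
milestones = [
    ("Fairseq S2T (MuST-C)", 22.7),
    ("Whisper Large-v2", 29.0),
    ("SeamlessM4T v2 Large", 37.1),
]


def relative_gain(old: float, new: float) -> float:
    """Percentage increase from old to new (assumed formula)."""
    return (new - old) / old * 100.0


# Gain of each milestone over the one before it.
step_gains = [
    round(relative_gain(prev_score, cur_score), 1)
    for (_, prev_score), (_, cur_score) in zip(milestones, milestones[1:])
]

# Gain of the latest milestone over the first one.
total_gain = round(relative_gain(milestones[0][1], milestones[-1][1]), 1)

print(step_gains, total_gain)
```

Under these assumptions the computed values match the percentages shown next to each milestone and the total improvement figure.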

Top Models Performance Comparison

Top 3 models ranked by bleu

Rank | Model | bleu | % of best
1 | SeamlessM4T v2 Large | 37.1 | 100.0%
2 | Whisper Large-v2 | 29.0 | 78.2%
3 | Fairseq S2T (MuST-C) | 22.7 | 61.2%
Best Score: 37.1
Top Model: SeamlessM4T v2 Large
Models Compared: 3
Score Range: 14.4
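The "% of best" and "Score Range" figures can be reproduced from the three scores alone. The sketch below assumes "% of best" means each score divided by the top score, and "Score Range" means the top score minus the lowest score; both readings are inferred from the numbers, not stated on the page.

```python
# BLEU scores as listed in the comparison above.
scores = {
    "SeamlessM4T v2 Large": 37.1,
    "Whisper Large-v2": 29.0,
    "Fairseq S2T (MuST-C)": 22.7,
}

best = max(scores.values())
worst = min(scores.values())

# Each model's score as a percentage of the best score (assumed definition).
percent_of_best = {
    name: round(score / best * 100.0, 1) for name, score in scores.items()
}

# Spread between the best and worst score (assumed definition).
score_range = round(best - worst, 1)

print(percent_of_best, score_range)
```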

bleu (Primary)

# | Model | Organization | Availability | Score | Date
1 | SeamlessM4T v2 Large | Meta AI | Open Source | 37.1 | Apr 2026
2 | Whisper Large-v2 | OpenAI | Open Source | 29.0 | Apr 2026
3 | Fairseq S2T (MuST-C) | Meta AI | Open Source | 22.7 | Apr 2026