Machine Translation
Machine translation is the oldest AI grand challenge, from rule-based systems in the 1950s to the transformer revolution sparked by "Attention Is All You Need" (2017) — literally the architecture that now powers all of AI. Google's multilingual T5 and Meta's NLLB-200 pushed translation to 200+ languages, but the real disruption came from GPT-4 and Claude matching or beating specialized MT systems on WMT benchmarks for high-resource pairs like English-German and English-Chinese. The unsolved frontier is low-resource languages (under 1M parallel sentences), where dedicated models like NLLB still dominate, and literary translation where preserving style, humor, and cultural nuance remains beyond any system. BLEU scores are increasingly seen as unreliable — human evaluation and newer metrics like COMET and BLEURT are becoming the standard.
WMT'23
State-of-the-art machine translation evaluation from WMT 2023 shared task
Top 10
Leading models on WMT'23.
What were you looking for on Machine Translation?
Didn't find the model, metric, or dataset you needed? Tell us in one line. We read every message and reply within 48 hours.
All datasets
2 datasets tracked for this task.
Related tasks
Other tasks in Natural Language Processing.
Didn't find what you came for?
Still looking for something on Machine Translation? A missing model, a stale score, a benchmark we should cover — drop it here and we'll handle it.
Real humans read every message. We track what people are asking for and prioritize accordingly.