Codesota · Natural Language Processing · Machine Translation · WMT 2014 English→German (newstest2014)
Machine Translation · benchmark dataset · EN

WMT 2014 English–German (WMT14 En→De, newstest2014).

WMT 2014 English–German (WMT14 En–De) is the English⇄German parallel data collection used in the shared translation task of the Ninth Workshop on Statistical Machine Translation (WMT 2014). The corpus combines several parallel sources commonly used in MT research, including Europarl, Common Crawl, and News Commentary, and is distributed with standard training, validation, and test splits.

For the English→German task, the training set contains roughly 4.5 million sentence pairs, the size reported in many papers, including “Attention Is All You Need”. The conventional validation set is newstest2013 and the test set is newstest2014. Typical preprocessing in the literature includes tokenization and byte-pair encoding (BPE) with a shared source–target vocabulary of about 37,000 tokens, as used in the Transformer paper.

The Hugging Face dataset card (wmt/wmt14) provides per-language-pair configs (e.g., de-en), lists the splits and their sizes, and warns about issues in the Common Crawl portion (misaligned and non-English files). Primary references: the WMT14 workshop pages (statmt.org/wmt14) and the Hugging Face dataset card (https://huggingface.co/datasets/wmt/wmt14).
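A minimal sketch of the record schema exposed by the wmt/wmt14 dataset card: each example is a `{"translation": {"de": ..., "en": ...}}` record. The sample sentence pair and the `to_pair` helper below are illustrative assumptions, not part of the dataset; the commented `load_dataset` call shows how the real test split would be fetched.

```python
# Loading the real data (downloads several GB on first run) would look like:
#   from datasets import load_dataset
#   wmt14 = load_dataset("wmt/wmt14", "de-en", split="test")  # newstest2014

# Hypothetical record in the wmt/wmt14 schema; the sentence pair is made up.
example = {"translation": {"de": "Guten Morgen.", "en": "Good morning."}}

def to_pair(record, src="en", tgt="de"):
    """Extract a (source, target) sentence pair from a wmt14-style record."""
    t = record["translation"]
    return t[src], t[tgt]

src_sent, tgt_sent = to_pair(example)
print(src_sent)  # Good morning.
print(tgt_sent)  # Guten Morgen.
```

The same helper works in the other direction by swapping `src` and `tgt`, which is how the de-en config serves both En→De and De→En evaluation.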

§ 01 · Leaderboard

Best published scores.

No results indexed yet — be the first to submit a score.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

Submit a result · Read submission guide
What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with a frozen commit and seed
  • 03 A declared evaluation environment (Python version, dependencies)
  • 04 One row per metric declared by this dataset
  • 05 A contact address so we can follow up on discrepancies
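The checklist above can be sketched as a reproduction-script skeleton: a frozen seed, an environment declaration written next to the results, and the translation and scoring steps left as commented placeholders. All names here (`environment.json`, the sacrebleu command) are assumptions for illustration, not requirements of this site.

```python
# Hypothetical reproduction-script skeleton for a submission:
# frozen seed, declared evaluation environment, scoring step as a comment.
import json
import platform
import random
import sys

SEED = 1234  # frozen seed, declared in the submission

def declare_environment():
    """Record the evaluation environment alongside the results."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "seed": SEED,
        "argv": sys.argv,
    }

def main():
    random.seed(SEED)
    env = declare_environment()
    with open("environment.json", "w") as f:
        json.dump(env, f, indent=2)
    # Translation and scoring would follow, e.g. (assumed commands):
    #   translate newstest2014.en > hypotheses.de
    #   sacrebleu newstest2014.de -i hypotheses.de -m bleu

if __name__ == "__main__":
    main()
```

Writing the environment file from inside the script, rather than by hand, keeps the declared environment in sync with what actually ran.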