Who leads the LibriSpeech benchmark?

Universal-1 currently leads LibriSpeech with a score of 1.60 on wer-test-clean (lower is better).

What is the state-of-the-art score on LibriSpeech?

The state-of-the-art result on LibriSpeech is 1.60 (wer-test-clean), achieved by Universal-1 as of 2026.

How many models are tracked on LibriSpeech?

Codesota tracks 98 models on LibriSpeech across 3 metrics.

When was the LibriSpeech leaderboard last updated?

The LibriSpeech leaderboard on Codesota includes results through 2026, with the earliest tracked result from 2020.

Speech Recognition · benchmark dataset · 2015 · EN

LibriSpeech ASR Corpus.

Name: LibriSpeech ASR Corpus Benchmark Results
Creator: Codesota
Published: 2020-01-01
License: https://creativecommons.org/licenses/by/4.0/

Roughly 1,000 hours of read English speech from public-domain audiobooks. Standard benchmark for automatic speech recognition.

§ 01 · Leaderboard

Best published scores.

111 results indexed across 3 metrics. Shaded row marks current SOTA; ties broken by submission date.

Primary: wer-test-clean · lower is better
All metrics: wer, wer-test-clean, wer-test-other
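All three metrics are word error rates. As a minimal sketch of how such a score is usually computed, the Python below counts word-level substitutions, deletions and insertions with an edit distance and divides by the number of reference words. It assumes scores are reported as percentages and uses lowercase whitespace splitting as a stand-in for text normalization; the function names are illustrative, and individual papers may normalize text differently before scoring.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            curr[j] = min(
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
            )
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """WER in percent over (reference, hypothesis) transcript pairs.

    Errors and reference words are summed over the whole set before
    dividing, rather than averaging per-utterance rates.
    """
    errors = 0
    ref_word_count = 0
    for reference, hypothesis in pairs:
        # Assumption: lowercase + whitespace split is the only normalization.
        ref_words = reference.lower().split()
        hyp_words = hypothesis.lower().split()
        errors += edit_distance(ref_words, hyp_words)
        ref_word_count += len(ref_words)
    if ref_word_count == 0:
        raise ValueError("references contain no words")
    return 100.0 * errors / ref_word_count


pairs = [
    ("the cat sat on the mat", "the cat sat on mat"),  # one deletion
    ("hello world", "hello world"),                    # exact match
]
print(corpus_wer(pairs))  # 1 error over 8 reference words
```

Because the rate is normalized by reference length, a hypothesis with many insertions can score above 100.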

wer

92 rows

#	Model	Org	Submitted	Paper / code	wer
01	Qwen3.5-Omni-Plus	—	Apr 2026	Qwen3.5-Omni Technical Report	1.11
02	Granite Speech 4.1 2B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	1.33
03	Audio Flamingo 3	—	Jul 2025	Audio Flamingo 3: Advancing Audio Intelligence with Full… · code	1.57
04	LongCat-Flash-Omni	—	Oct 2025	LongCat-Flash-Omni Technical Report · code	1.57
05	Parakeet-rnnt-0.6b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	1.62
06	Qwen3-ASR-1.7B · Open	Alibaba	Jan 2026	Qwen3-ASR Technical Report · code	1.63
07	Stt-2.6b-en	—	Sep 2024	Moshi: a speech-text foundation model for real-time dial… · code	1.70
08	CrisperWhisper · Open	nyrahealth	Aug 2024	CrisperWhisper: Accurate Timestamps on Verbatim Speech T… · code	1.82
09	Voxtral-Mini-3B-2507	—	Jul 2025	Voxtral	1.88
10	SYMPHONY-ASR	—	Jan 2026	pwc-dump	1.91
11	Wav2Vec 2.0 Large (LS-960)	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	2.00
12	Wav2Vec 2.0 Base	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	2.10
13	GLM-ASR-Nano-2512 · Open	Zhipu AI	Dec 2025	pwc-dump · code	2.15
14	VibeVoice-ASR-HF	—	Jan 2026	VIBEVOICE-ASR Technical Report	2.20
15	Distil-Whisper Large v3.5	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	2.37
16	Cohere Transcribe (Mar 2026) · Open	Cohere	Mar 2026	pwc-dump	2.37
17	Parakeet-rnnt-1.1b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	2.50
18	Distil-Whisper Large v3	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	2.54
19	Parakeet-TDT-1.1B · Open	NVIDIA	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	2.60
20	Granite 4.0 1B Speech · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	2.85
21	Granite Speech 3.3 8B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	2.86
22	Canary-1B-Flash · Open	NVIDIA	Mar 2025	Training and Inference Efficiency of Encoder-Decoder Spe…	2.87
23	Canary-1B · Open	NVIDIA	Feb 2024	pwc-dump	2.93
24	Distil-Whisper Large v2	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	2.94
25	Whisper Medium (English)	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	3.02
26	Whisper-small.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	3.05
27	Whisper Small (English)	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	3.05
28	Llama 3 Speech (70B)	—	Jul 2024	The Llama 3 Herd of Models · code	3.10
29	Llama 3 (405B, Instruct)	Meta	Jul 2024	The Llama 3 Herd of Models · code	3.10
30	Canary-Qwen-2.5B · Open	NVIDIA	Mar 2025	Training and Inference Efficiency of Encoder-Decoder Spe…	3.10
31	Parakeet-tdt-0.6b-v2	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	3.19
32	Granite Speech 3.3 2B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	3.26
33	Voxtral-Small-24B-2507 · Open	Mistral AI	Jul 2025	Voxtral	3.26
34	Moonshine-base	—	Oct 2024	Moonshine: Speech Recognition for Live Transcription and… · code	3.38
35	Distil-Whisper Small (English)	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	3.48
36	Parakeet-ctc-1.1b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	3.51
37	Canary-1b-v2	—	Aug 2025	pwc-dump	3.56
38	Parakeet-tdt-0.6b-v3	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	3.59
39	Distil-Whisper Medium (English)	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	3.69
40	Parakeet-ctc-0.6b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	3.80
41	Phi-4 Multimodal Instruct · Open	Microsoft	Mar 2025	Phi-4-Mini Technical Report: Compact yet Powerful Multim…	3.82
42	Asr-wav2vec2-librispeech	—	Jun 2021	SpeechBrain: A General-Purpose Speech Toolkit · code	3.83
43	Lite-whisper-large-v3-acc	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	3.91
44	Whisper Large v3 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	3.91
45	Stt_en_fastconformer_transducer_large	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	3.97
46	Stt_en_fastconformer_ctc_large	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	4.04
47	Stt_en_conformer_ctc_large	—	May 2020	Conformer: Convolution-augmented Transformer for Speech … · code	4.15
48	Asr-conformer-loquacious	—	Feb 2025	pwc-dump	4.24
49	Whisper Large v3 Turbo · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	4.24
50	Whisper base · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	4.25
51	Canary-180M-Flash · Open	NVIDIA	Mar 2025	Training and Inference Efficiency of Encoder-Decoder Spe…	4.35
52	Lite-whisper-large-v3	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	4.40
53	Qwen3-ASR-0.6B · Open	Alibaba	Jan 2026	Qwen3-ASR Technical Report · code	4.45
54	SYMPHONY	—	Oct 2025	pwc-dump	4.48
55	Moonshine-streaming-tiny	—	Jan 2026	pwc-dump	4.50
56	Moonshine-tiny	—	Oct 2024	Moonshine: Speech Recognition for Live Transcription and… · code	4.55
57	Lite-whisper-large-v3-turbo-acc	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	4.60
58	Owsm_ctc_v4_1B	—	May 2025	OWSM v4: Improving Open Whisper-Style Speech Models via … · code	4.89
59	Moonshine Streaming Medium · Open	Useful Sensors	Jan 2026	pwc-dump	5.00
60	Distil-large-v3.5	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	5.04
61	Zipformer-transducer-XL-290M	—	Oct 2023	Zipformer: A faster and better encoder for automatic spe… · code	5.04
62	Whisper Large v2 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	5.14
63	Owsm_ctc_v3.1_1B	—	Jan 2024	OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code	5.15
64	Distil-large-v3	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	5.19
65	Lite-whisper-large-v3-fast	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	5.19
66	Parakeet-tdt_ctc-110m	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	5.22
67	Voxtral-Mini-4B-Realtime-2602 · Open	Mistral AI	Feb 2026	Voxtral Realtime	5.52
68	Whisper Large	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	5.54
69	Whisper Tiny (English)	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	5.66
70	Moshi ASR	—	Sep 2024	Moshi: a speech-text foundation model for real-time dial… · code	5.70
71	Whisper-medium.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	5.85
72	Moonshine-streaming-small	—	Jan 2026	pwc-dump	6.78
73	Distil-large-v2	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	6.84
74	Distil-small.en	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	7.73
75	Stt_en_conformer_ctc_small	—	May 2020	Conformer: Convolution-augmented Transformer for Speech … · code	7.92
76	Distil-medium.en	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	8.35
77	Niagara-38m-batch.en	—	Feb 2026	pwc-dump	9.35
78	Whisper-base.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	10.35
79	Niagara-19m-batch.en	—	Feb 2026	pwc-dump	11.20
80	Hubert-xlarge-ls960-ft	—	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b… · code	12.22
81	Wav2vec2-large-960h-lv60-self	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	12.42
82	Wav2vec2-conformer-rel-pos-large-960h-ft	—	Oct 2020	fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code	12.44
83	Wav2vec2-base-960h	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	12.53
84	Wav2vec2-conformer-rope-large-960h-ft	—	Oct 2020	fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code	12.54
85	Mms-1b-all	—	May 2023	Scaling Speech Technology to 1,000+ Languages · code	12.63
86	Hubert-large-ls960-ft	—	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b… · code	12.75
87	Data2vec-audio-large-960h	—	Feb 2022	data2vec: A General Framework for Self-supervised Learni… · code	12.94
88	Wav2vec2-large-robust-ft-libri-960h	—	Apr 2021	Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code	13.76
89	Whisper-tiny.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	15.45
90	wav2vec 2.0 Large (960h) · Open	Meta AI	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	15.46
91	Data2vec-audio-base-960h	—	Feb 2022	data2vec: A General Framework for Self-supervised Learni… · code	15.48
92	Mms-1b-fl102	—	May 2023	Scaling Speech Technology to 1,000+ Languages · code	28.70

wer-test-clean · primary

10 rows

#	Model	Org	Submitted	Paper / code	wer-test-clean
01	Universal-1	AssemblyAI	Apr 2024	official	1.60
02	Parakeet-CTC-1.1B · Open	NVIDIA / Suno	Nov 2023	Parakeet: Efficient, Accurate Speech Recognition Adapted…	1.70
03	Conformer-CTC Large · Open	NVIDIA / NeMo	Jan 2023	VALL-E: Neural Codec Language Models are Zero-Shot Text …	1.70
04	Canary-1B · Open	NVIDIA	Oct 2023	Canary: A Multilingual Speech Recognition Model	1.70
05	Whisper Large v3 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi…	1.80
06	wav2vec 2.0 Large (960h) · Open	Meta AI	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of…	1.80
07	HuBERT Large (LS-960) · Open	Meta AI	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b…	1.90
08	Google USM	Google	Mar 2023	Google USM: Scaling Automatic Speech Recognition Beyond …	2.00
09	Whisper Large v2 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi…	2.70
10	Pulse STT	Smallest AI	May 2026	official	3.22

wer-test-other

9 rows

#	Model	Org	Submitted	Paper / code	wer-test-other
01	Universal-1	AssemblyAI	Apr 2024	official	3.10
02	wav2vec 2.0 Large (960h) · Open	Meta AI	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of…	3.30
03	Whisper Large v3 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi…	3.60
04	HuBERT Large (LS-960) · Open	Meta AI	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b…	3.60
05	Canary-1B · Open	NVIDIA	Oct 2023	Canary: A Multilingual Speech Recognition Model	3.80
06	Google USM	Google	Mar 2023	Google USM: Scaling Automatic Speech Recognition Beyond …	4.10
07	Parakeet-CTC-1.1B · Open	NVIDIA / Suno	Nov 2023	Parakeet: Efficient, Accurate Speech Recognition Adapted…	4.20
08	Whisper Large v2 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi…	5.20
09	Pulse STT	Smallest AI	May 2026	official	5.83

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.

§ 03 · Progress

3 steps of state of the art.

Each row below marks a model that broke the previous record on wer-test-clean. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Lower scores win. Each subsequent entry improved upon the previous best.

SOTA line · wer-test-clean

Date	Model	Org	wer-test-clean
Jun 20, 2020	wav2vec 2.0 Large (960h)	Meta AI	1.80
Jan 5, 2023	Conformer-CTC Large	NVIDIA / NeMo	1.70
Apr 3, 2024	Universal-1	AssemblyAI	1.60

Fig 3 · SOTA-setting models only. 3 entries span Jun 2020 → Apr 2024.
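The record-breaking rule described in this section can be sketched as a running minimum over the wer-test-clean rows in date order. The snippet below is an illustration, not Codesota's actual pipeline: the rows are copied from the wer-test-clean table, dates are reduced to (year, month) tuples, and a strict "lower than the previous best" comparison is assumed, so a model that only ties the record does not count as a new step.

```python
# (year, month), model, wer-test-clean score, copied from the leaderboard table.
ROWS = [
    ((2024, 4), "Universal-1", 1.60),
    ((2023, 11), "Parakeet-CTC-1.1B", 1.70),
    ((2023, 1), "Conformer-CTC Large", 1.70),
    ((2023, 10), "Canary-1B", 1.70),
    ((2022, 12), "Whisper Large v3", 1.80),
    ((2020, 6), "wav2vec 2.0 Large (960h)", 1.80),
    ((2021, 6), "HuBERT Large (LS-960)", 1.90),
    ((2023, 3), "Google USM", 2.00),
    ((2022, 12), "Whisper Large v2", 2.70),
    ((2026, 5), "Pulse STT", 3.22),
]


def sota_steps(rows):
    """Return the rows that strictly improved on the best earlier score."""
    steps = []
    best = float("inf")
    # Walk the submissions oldest first; lower WER is better.
    for submitted, model, score in sorted(rows, key=lambda row: row[0]):
        if score < best:  # assumption: ties do not set a new record
            best = score
            steps.append((submitted, model, score))
    return steps


for submitted, model, score in sota_steps(ROWS):
    print(submitted, model, score)
```

On these rows the sketch yields three steps (wav2vec 2.0 Large, Conformer-CTC Large, Universal-1), matching the SOTA line above.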

§ 04 · Literature

34 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

Paper	Published	Models	Links
Qwen3.5-Omni Technical Report	Apr 2026	Qwen3.5-Omni-Plus	arXiv
Voxtral Realtime	Feb 2026	Voxtral-Mini-4B-Realtime-2602	arXiv
Qwen3-ASR Technical Report	Jan 2026	Qwen3-ASR-0.6B, Qwen3-ASR-1.7B	arXiv · code
VIBEVOICE-ASR Technical Report	Jan 2026	VibeVoice-ASR-HF	arXiv
LongCat-Flash-Omni Technical Report	Oct 2025	LongCat-Flash-Omni	arXiv · code
Voxtral	Jul 2025	Voxtral-Small-24B-2507, Voxtral-Mini-3B-2507	arXiv
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models	Jul 2025	Audio Flamingo 3	arXiv · code
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning	May 2025	Owsm_ctc_v4_1B	arXiv · code
Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities	May 2025	Granite Speech 3.3 2B, Granite Speech 3.3 8B, Granite 4.0 1B Speech +1	arXiv
Training and Inference Efficiency of Encoder-Decoder Speech Models	Mar 2025	Canary-180M-Flash, Canary-Qwen-2.5B, Canary-1B-Flash	arXiv
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs	Mar 2025	Phi-4 Multimodal Instruct	arXiv
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation	Feb 2025	Lite-whisper-large-v3-fast, Lite-whisper-large-v3-turbo-acc, Lite-whisper-large-v3 +1	arXiv · code
Moonshine: Speech Recognition for Live Transcription and Voice Commands	Oct 2024	Moonshine-tiny, Moonshine-base	arXiv · code
Moshi: a speech-text foundation model for real-time dialogue	Sep 2024	Moshi ASR, Stt-2.6b-en	arXiv · code
CrisperWhisper: Accurate Timestamps on Verbatim Speech Transcriptions	Aug 2024	CrisperWhisper	arXiv · code
The Llama 3 Herd of Models	Jul 2024	Llama 3 Speech (70B), Llama 3 (405B, Instruct)	arXiv · code
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer	Jan 2024	Owsm_ctc_v3.1_1B	arXiv · code
Parakeet: Efficient, Accurate Speech Recognition Adapted for NVIDIA GPUs	Nov 2023	Parakeet-CTC-1.1B	arXiv
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling	Nov 2023	Distil-medium.en, Distil-small.en, Distil-large-v2 +7	arXiv · code
Zipformer: A faster and better encoder for automatic speech recognition	Oct 2023	Zipformer-transducer-XL-290M	arXiv · code
Canary: A Multilingual Speech Recognition Model	Oct 2023	Canary-1B	arXiv
Scaling Speech Technology to 1,000+ Languages	May 2023	Mms-1b-fl102, Mms-1b-all	arXiv · code
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition	May 2023	Stt_en_fastconformer_ctc_large, Stt_en_fastconformer_transducer_large, Parakeet-ctc-0.6b +3	arXiv
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations	Apr 2023	Parakeet-tdt_ctc-110m, Parakeet-tdt-0.6b-v3, Parakeet-tdt-0.6b-v2 +1	arXiv · code
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages	Mar 2023	Google USM	arXiv
VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers	Jan 2023	Conformer-CTC Large	arXiv
Robust Speech Recognition via Large-Scale Weak Supervision	Dec 2022	Whisper-tiny.en, Whisper-base.en, Whisper-medium.en +9	arXiv · code
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language	Feb 2022	Data2vec-audio-base-960h, Data2vec-audio-large-960h	arXiv · code
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units	Jun 2021	Hubert-large-ls960-ft, Hubert-xlarge-ls960-ft, HuBERT Large (LS-960)	arXiv · code
SpeechBrain: A General-Purpose Speech Toolkit	Jun 2021	Asr-wav2vec2-librispeech	arXiv · code
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training	Apr 2021	Wav2vec2-large-robust-ft-libri-960h	arXiv · code
fairseq S2T: Fast Speech-to-Text Modeling with fairseq	Oct 2020	Wav2vec2-conformer-rope-large-960h-ft, Wav2vec2-conformer-rel-pos-large-960h-ft	arXiv · code
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations	Jun 2020	wav2vec 2.0 Large (960h), Wav2vec2-base-960h, Wav2vec2-large-960h-lv60-self +2	arXiv · code
Conformer: Convolution-augmented Transformer for Speech Recognition	May 2020	Stt_en_conformer_ctc_small, Stt_en_conformer_ctc_large	arXiv · code

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs

01 · A public checkpoint or API endpoint
02 · A reproduction script with frozen commit + seed
03 · Declared evaluation environment (Python, deps)
04 · One row per metric declared by this dataset
05 · A contact so we can follow up on discrepancies
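One way to check a submission against the five items above before sending it is a small self-check like the sketch below. Everything in it is hypothetical: the `Submission` fields, the `missing_items` helper and the example values are illustrative names, not Codesota's submission schema, and item 04 is read here as "one score for each of the three metrics listed on this page".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Metric names as listed under "All metrics" on this page.
DECLARED_METRICS = ("wer", "wer-test-clean", "wer-test-other")


@dataclass
class Submission:
    """Hypothetical container for the five required items."""
    checkpoint_or_endpoint: str   # 01: public checkpoint or API endpoint
    repro_commit: str             # 02: frozen commit of the reproduction script
    seed: Optional[int]           # 02: random seed used for the run
    python_version: str           # 03: evaluation environment
    dependencies: List[str]       # 03: pinned dependencies
    scores: Dict[str, float]      # 04: one score per declared metric
    contact: str                  # 05: contact for follow-up


def missing_items(sub: Submission) -> List[str]:
    """List the checklist items this submission does not yet satisfy."""
    problems = []
    if not sub.checkpoint_or_endpoint:
        problems.append("01 checkpoint or API endpoint")
    if not sub.repro_commit or sub.seed is None:
        problems.append("02 frozen commit + seed")
    if not sub.python_version or not sub.dependencies:
        problems.append("03 evaluation environment")
    for metric in DECLARED_METRICS:
        if metric not in sub.scores:
            problems.append(f"04 score for {metric}")
    if not sub.contact:
        problems.append("05 contact")
    return problems


draft = Submission(
    checkpoint_or_endpoint="https://example.com/model.ckpt",  # placeholder URL
    repro_commit="0123abc",                                   # placeholder commit
    seed=0,
    python_version="3.11",
    dependencies=["torch==2.3.0"],                            # placeholder pin
    scores={"wer": 2.0},                                      # made-up number
    contact="",
)
print(missing_items(draft))
```

For the draft above, the check reports the two missing metric scores and the missing contact.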
