Speech Recognition · benchmark dataset · 2015 · EN

LibriSpeech ASR Corpus.

About 1,000 hours of read English speech from audiobooks. The standard benchmark for automatic speech recognition.

§ 01 · Leaderboard

Best published scores.

111 results indexed across 3 metrics. Shaded row marks current SOTA; ties broken by submission date.


Primary metric: wer-test-clean · lower is better
All metrics: wer, wer-test-clean, wer-test-other
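All three metrics are word error rates, so a small sketch of the computation may help when reading the scores. It assumes the usual definition (word-level edit distance divided by the number of reference words, reported as a percentage) and whitespace tokenisation with no text normalisation; the function names are illustrative, and the leaderboard's own scoring pipeline may normalise text differently.

```python
from typing import Iterable, Tuple


def word_errors(reference: str, hypothesis: str) -> Tuple[int, int]:
    """Return (edit distance in words, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Corpus-level WER in percent: total word errors / total reference words."""
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        words += n
    if words == 0:
        raise ValueError("need at least one reference word")
    return 100.0 * errors / words
```

Pooling errors over the whole test set, rather than averaging per-utterance rates, keeps long utterances from being under-weighted; whether a given leaderboard entry was scored this way is an assumption.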
wer · 92 rows

| # | Model | Org | Submitted | Paper / code | wer |
|---|---|---|---|---|---|
| 01 | Qwen3.5-Omni-Plus | | Apr 2026 | Qwen3.5-Omni Technical Report | 1.11 |
| 02 | Granite Speech 4.1 2B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 1.33 |
| 03 | Audio Flamingo 3 | | Jul 2025 | Audio Flamingo 3: Advancing Audio Intelligence with Full… · code | 1.57 |
| 04 | LongCat-Flash-Omni | | Oct 2025 | LongCat-Flash-Omni Technical Report · code | 1.57 |
| 05 | Parakeet-rnnt-0.6b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 1.62 |
| 06 | Qwen3-ASR-1.7B | Open · Alibaba | Jan 2026 | Qwen3-ASR Technical Report · code | 1.63 |
| 07 | Stt-2.6b-en | | Sep 2024 | Moshi: a speech-text foundation model for real-time dial… · code | 1.70 |
| 08 | CrisperWhisper | Open · nyrahealth | Aug 2024 | CrisperWhisper: Accurate Timestamps on Verbatim Speech T… · code | 1.82 |
| 09 | Voxtral-Mini-3B-2507 | | Jul 2025 | Voxtral | 1.88 |
| 10 | SYMPHONY-ASR | | Jan 2026 | pwc-dump | 1.91 |
| 11 | Wav2Vec 2.0 Large (LS-960) | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 2.00 |
| 12 | Wav2Vec 2.0 Base | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 2.10 |
| 13 | GLM-ASR-Nano-2512 | Open · Zhipu AI | Dec 2025 | pwc-dump · code | 2.15 |
| 14 | VibeVoice-ASR-HF | | Jan 2026 | VIBEVOICE-ASR Technical Report | 2.20 |
| 15 | Distil-Whisper Large v3.5 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 2.37 |
| 16 | Cohere Transcribe (Mar 2026) | Open · Cohere | Mar 2026 | pwc-dump | 2.37 |
| 17 | Parakeet-rnnt-1.1b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 2.50 |
| 18 | Distil-Whisper Large v3 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 2.54 |
| 19 | Parakeet-TDT-1.1B | Open · NVIDIA | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 2.60 |
| 20 | Granite 4.0 1B Speech | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 2.85 |
| 21 | Granite Speech 3.3 8B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 2.86 |
| 22 | Canary-1B-Flash | Open · NVIDIA | Mar 2025 | Training and Inference Efficiency of Encoder-Decoder Spe… | 2.87 |
| 23 | Canary-1B | Open · NVIDIA | Feb 2024 | pwc-dump | 2.93 |
| 24 | Distil-Whisper Large v2 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 2.94 |
| 25 | Whisper Medium (English) | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 3.02 |
| 26 | Whisper-small.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 3.05 |
| 27 | Whisper Small (English) | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 3.05 |
| 28 | Llama 3 Speech (70B) | | Jul 2024 | The Llama 3 Herd of Models · code | 3.10 |
| 29 | Llama 3 (405B, Instruct) | Meta | Jul 2024 | The Llama 3 Herd of Models · code | 3.10 |
| 30 | Canary-Qwen-2.5B | Open · NVIDIA | Mar 2025 | Training and Inference Efficiency of Encoder-Decoder Spe… | 3.10 |
| 31 | Parakeet-tdt-0.6b-v2 | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 3.19 |
| 32 | Granite Speech 3.3 2B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 3.26 |
| 33 | Voxtral-Small-24B-2507 | Open · Mistral AI | Jul 2025 | Voxtral | 3.26 |
| 34 | Moonshine-base | | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 3.38 |
| 35 | Distil-Whisper Small (English) | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 3.48 |
| 36 | Parakeet-ctc-1.1b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 3.51 |
| 37 | Canary-1b-v2 | | Aug 2025 | pwc-dump | 3.56 |
| 38 | Parakeet-tdt-0.6b-v3 | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 3.59 |
| 39 | Distil-Whisper Medium (English) | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 3.69 |
| 40 | Parakeet-ctc-0.6b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 3.80 |
| 41 | Phi-4 Multimodal Instruct | Open · Microsoft | Mar 2025 | Phi-4-Mini Technical Report: Compact yet Powerful Multim… | 3.82 |
| 42 | Asr-wav2vec2-librispeech | | Jun 2021 | SpeechBrain: A General-Purpose Speech Toolkit · code | 3.83 |
| 43 | Lite-whisper-large-v3-acc | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 3.91 |
| 44 | Whisper Large v3 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 3.91 |
| 45 | Stt_en_fastconformer_transducer_large | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 3.97 |
| 46 | Stt_en_fastconformer_ctc_large | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 4.04 |
| 47 | Stt_en_conformer_ctc_large | | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 4.15 |
| 48 | Asr-conformer-loquacious | | Feb 2025 | pwc-dump | 4.24 |
| 49 | Whisper Large v3 Turbo | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 4.24 |
| 50 | Whisper base | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 4.25 |
| 51 | Canary-180M-Flash | Open · NVIDIA | Mar 2025 | Training and Inference Efficiency of Encoder-Decoder Spe… | 4.35 |
| 52 | Lite-whisper-large-v3 | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 4.40 |
| 53 | Qwen3-ASR-0.6B | Open · Alibaba | Jan 2026 | Qwen3-ASR Technical Report · code | 4.45 |
| 54 | SYMPHONY | | Oct 2025 | pwc-dump | 4.48 |
| 55 | Moonshine-streaming-tiny | | Jan 2026 | pwc-dump | 4.50 |
| 56 | Moonshine-tiny | | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 4.55 |
| 57 | Lite-whisper-large-v3-turbo-acc | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 4.60 |
| 58 | Owsm_ctc_v4_1B | | May 2025 | OWSM v4: Improving Open Whisper-Style Speech Models via … · code | 4.89 |
| 59 | Moonshine Streaming Medium | Open · Useful Sensors | Jan 2026 | pwc-dump | 5.00 |
| 60 | Distil-large-v3.5 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 5.04 |
| 61 | Zipformer-transducer-XL-290M | | Oct 2023 | Zipformer: A faster and better encoder for automatic spe… · code | 5.04 |
| 62 | Whisper Large v2 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 5.14 |
| 63 | Owsm_ctc_v3.1_1B | | Jan 2024 | OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code | 5.15 |
| 64 | Distil-large-v3 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 5.19 |
| 65 | Lite-whisper-large-v3-fast | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 5.19 |
| 66 | Parakeet-tdt_ctc-110m | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 5.22 |
| 67 | Voxtral-Mini-4B-Realtime-2602 | Open · Mistral AI | Feb 2026 | Voxtral Realtime | 5.52 |
| 68 | Whisper Large | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 5.54 |
| 69 | Whisper Tiny (English) | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 5.66 |
| 70 | Moshi ASR | | Sep 2024 | Moshi: a speech-text foundation model for real-time dial… · code | 5.70 |
| 71 | Whisper-medium.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 5.85 |
| 72 | Moonshine-streaming-small | | Jan 2026 | pwc-dump | 6.78 |
| 73 | Distil-large-v2 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 6.84 |
| 74 | Distil-small.en | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 7.73 |
| 75 | Stt_en_conformer_ctc_small | | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 7.92 |
| 76 | Distil-medium.en | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 8.35 |
| 77 | Niagara-38m-batch.en | | Feb 2026 | pwc-dump | 9.35 |
| 78 | Whisper-base.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 10.35 |
| 79 | Niagara-19m-batch.en | | Feb 2026 | pwc-dump | 11.20 |
| 80 | Hubert-xlarge-ls960-ft | | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 12.22 |
| 81 | Wav2vec2-large-960h-lv60-self | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 12.42 |
| 82 | Wav2vec2-conformer-rel-pos-large-960h-ft | | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 12.44 |
| 83 | Wav2vec2-base-960h | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 12.53 |
| 84 | Wav2vec2-conformer-rope-large-960h-ft | | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 12.54 |
| 85 | Mms-1b-all | | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 12.63 |
| 86 | Hubert-large-ls960-ft | | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 12.75 |
| 87 | Data2vec-audio-large-960h | | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 12.94 |
| 88 | Wav2vec2-large-robust-ft-libri-960h | | Apr 2021 | Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code | 13.76 |
| 89 | Whisper-tiny.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 15.45 |
| 90 | wav2vec 2.0 Large (960h) | Open · Meta AI | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 15.46 |
| 91 | Data2vec-audio-base-960h | | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 15.48 |
| 92 | Mms-1b-fl102 | | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 28.70 |
wer-test-clean · primary · 10 rows

| # | Model | Org | Submitted | Paper / code | wer-test-clean |
|---|---|---|---|---|---|
| 01 | Universal-1 | AssemblyAI | Apr 2024 | official | 1.60 |
| 02 | Parakeet-CTC-1.1B | Open · NVIDIA / Suno | Nov 2023 | Parakeet: Efficient, Accurate Speech Recognition Adapted… | 1.70 |
| 03 | Conformer-CTC Large | Open · NVIDIA / NeMo | Jan 2023 | VALL-E: Neural Codec Language Models are Zero-Shot Text … | 1.70 |
| 04 | Canary-1B | Open · NVIDIA | Oct 2023 | Canary: A Multilingual Speech Recognition Model | 1.70 |
| 05 | Whisper Large v3 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… | 1.80 |
| 06 | wav2vec 2.0 Large (960h) | Open · Meta AI | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… | 1.80 |
| 07 | HuBERT Large (LS-960) | Open · Meta AI | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… | 1.90 |
| 08 | Google USM | Google | Mar 2023 | Google USM: Scaling Automatic Speech Recognition Beyond … | 2.00 |
| 09 | Whisper Large v2 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… | 2.70 |
| 10 | Pulse STT | Smallest AI | May 2026 | official | 3.22 |
wer-test-other · 9 rows

| # | Model | Org | Submitted | Paper / code | wer-test-other |
|---|---|---|---|---|---|
| 01 | Universal-1 | AssemblyAI | Apr 2024 | official | 3.10 |
| 02 | wav2vec 2.0 Large (960h) | Open · Meta AI | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… | 3.30 |
| 03 | Whisper Large v3 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… | 3.60 |
| 04 | HuBERT Large (LS-960) | Open · Meta AI | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… | 3.60 |
| 05 | Canary-1B | Open · NVIDIA | Oct 2023 | Canary: A Multilingual Speech Recognition Model | 3.80 |
| 06 | Google USM | Google | Mar 2023 | Google USM: Scaling Automatic Speech Recognition Beyond … | 4.10 |
| 07 | Parakeet-CTC-1.1B | Open · NVIDIA / Suno | Nov 2023 | Parakeet: Efficient, Accurate Speech Recognition Adapted… | 4.20 |
| 08 | Whisper Large v2 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… | 5.20 |
| 09 | Pulse STT | Smallest AI | May 2026 | official | 5.83 |
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
§ 03 · Progress

3 steps of state of the art.

Each row below marks a model that broke the previous record on wer-test-clean. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Lower scores win. Each subsequent entry improved upon the previous best.

SOTA line · wer-test-clean
  1. Jun 20, 2020 · wav2vec 2.0 Large (960h) · Meta AI · 1.80
  2. Jan 5, 2023 · Conformer-CTC Large · NVIDIA / NeMo · 1.70
  3. Apr 3, 2024 · Universal-1 · AssemblyAI · 1.60
Fig 3 · SOTA-setting models only. 3 entries span Jun 2020 to Apr 2024.
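The SOTA line can be derived from the wer-test-clean rows by walking them in date order and keeping only strict improvements. The sketch below assumes that rule (a tie does not set a new record) and uses the month-level dates shown in the leaderboard; the `Row` type and `sota_steps` function are illustrative names, not part of the site.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    model: str
    submitted: Tuple[int, int]  # (year, month) as listed in the leaderboard
    score: float                # wer-test-clean, lower is better


ROWS = [
    Row("Universal-1", (2024, 4), 1.60),
    Row("Parakeet-CTC-1.1B", (2023, 11), 1.70),
    Row("Conformer-CTC Large", (2023, 1), 1.70),
    Row("Canary-1B", (2023, 10), 1.70),
    Row("Whisper Large v3", (2022, 12), 1.80),
    Row("wav2vec 2.0 Large (960h)", (2020, 6), 1.80),
    Row("HuBERT Large (LS-960)", (2021, 6), 1.90),
    Row("Google USM", (2023, 3), 2.00),
    Row("Whisper Large v2", (2022, 12), 2.70),
    Row("Pulse STT", (2026, 5), 3.22),
]


def sota_steps(rows: List[Row]) -> List[Row]:
    """Return the rows that strictly improved on the best score so far."""
    steps = []
    best = float("inf")
    for row in sorted(rows, key=lambda r: r.submitted):
        if row.score < best:  # ties do not count as a new record
            best = row.score
            steps.append(row)
    return steps


for step in sota_steps(ROWS):
    print(step.submitted, step.model, step.score)
```

Under these assumptions the walk yields the same three entries as the list above: wav2vec 2.0 Large (960h), Conformer-CTC Large, and Universal-1.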
§ 04 · Literature

34 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with frozen commit + seed
  • 03 · Declared evaluation environment (Python, deps)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
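As a rough illustration, the checklist above could be pre-checked locally before submitting. The record layout, field names, and `missing_items` function below are hypothetical and are not the site's submission schema; the metric names are the three listed for this dataset, and the values in the example are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List

# Metrics declared by this dataset, as listed on the leaderboard.
DATASET_METRICS = ("wer", "wer-test-clean", "wer-test-other")


@dataclass
class Submission:  # hypothetical record, for illustration only
    checkpoint_or_endpoint: str
    repro_commit: str
    seed: int
    environment: Dict[str, str]
    scores: Dict[str, float]
    contact: str


def missing_items(sub: Submission) -> List[str]:
    """List the checklist items that the submission does not yet satisfy."""
    problems = []
    if not sub.checkpoint_or_endpoint:
        problems.append("01 checkpoint or API endpoint")
    if not sub.repro_commit:
        problems.append("02 frozen commit")
    if "python" not in sub.environment:
        problems.append("03 declared Python version")
    for metric in DATASET_METRICS:
        if metric not in sub.scores:
            problems.append(f"04 score for {metric}")
    if not sub.contact:
        problems.append("05 contact")
    return problems


example = Submission(
    checkpoint_or_endpoint="example-org/example-asr",  # placeholder name
    repro_commit="0123abc",                            # placeholder commit
    seed=0,
    environment={"python": "3.11"},
    scores={"wer": 4.2, "wer-test-clean": 2.1},        # placeholder scores
    contact="someone@example.com",
)
print(missing_items(example))  # ['04 score for wer-test-other']
```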