Codesota · Speech · Speech Recognition · Open ASR Leaderboard
Speech Recognition · benchmark dataset · 2023 · EN

HF Open ASR Leaderboard (aggregate).

The Hugging Face Open ASR Leaderboard aggregates word error rate (WER) and inverse real-time factor (RTFx) across LibriSpeech, AMI, Earnings-22, GigaSpeech, SPGISpeech, TED-LIUM, and VoxPopuli, giving each English ASR system one composite score per metric. It is the de facto standard leaderboard for modern ASR.
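As a rough illustration of the two metrics, the sketch below computes WER as word-level edit distance divided by reference length, and RTFx as audio duration divided by processing time. The unweighted mean used for the composite is an assumption made for illustration, and so are all function names; the leaderboard's own text normalization and aggregation may differ.

```python
from statistics import mean
from typing import Dict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def rtfx(audio_seconds: float, compute_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio processed per second of compute."""
    return audio_seconds / compute_seconds


def composite_wer(per_dataset_wer: Dict[str, float]) -> float:
    """Assumed aggregation: unweighted mean of the per-dataset WER values."""
    return mean(list(per_dataset_wer.values()))
```

Real evaluations normally lowercase and strip punctuation before scoring, and compute WER over a whole test set (total errors over total reference words) rather than per utterance; both steps are omitted here for brevity.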

§ 01 · Leaderboard

Best published scores.

102 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.
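The ranking rule can be sketched as a sort on score with submission date as the secondary key. This is a minimal sketch, not the site's implementation: the `Row` type and `rank` function are hypothetical, and "earlier submission wins a tie" is an assumption, though it matches the order of the two entries tied at 7.77 in the wer table.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    model: str
    submitted: str  # "YYYY-MM", so string order equals chronological order
    score: float


def rank(rows: List[Row], lower_is_better: bool = True) -> List[Row]:
    """Sort by score, breaking ties by earlier submission date (assumed rule)."""
    sign = 1.0 if lower_is_better else -1.0
    return sorted(rows, key=lambda r: (sign * r.score, r.submitted))


# A few wer rows copied from the leaderboard, deliberately out of order.
wer_rows = [
    Row("VibeVoice-ASR-HF", "2026-01", 7.77),
    Row("Granite Speech 4.1 2B", "2025-05", 5.33),
    Row("Lite-whisper-large-v3-turbo-acc", "2025-02", 7.77),
    Row("Cohere Transcribe (Mar 2026)", "2026-03", 5.42),
]

ranked = rank(wer_rows)
sota = ranked[0]  # the row that would be shaded
```

For rtfx, where larger values rank first, the same function would be called with `lower_is_better=False`.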


Primary metric: wer · lower is better
All metrics: rtfx, wer

rtfx · 50 rows
| # | Model | Org | Submitted | Paper / code | rtfx |
|---|---|---|---|---|---|
| 01 | Stt_en_fastconformer_ctc_large | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 6399.25 |
| 02 | Stt_en_conformer_ctc_small | | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 5686.90 |
| 03 | Parakeet-tdt_ctc-110m | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 5345.14 |
| 04 | Stt_en_conformer_ctc_large | | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 4295.01 |
| 05 | Parakeet-ctc-0.6b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 4281.53 |
| 06 | Stt_en_fastconformer_transducer_large | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 4097.43 |
| 07 | Parakeet-tdt-0.6b-v2 | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 3386.02 |
| 08 | Parakeet-rnnt-0.6b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 2815.72 |
| 09 | Moonshine-streaming-tiny | | Jan 2026 | pwc-dump | 847.20 |
| 10 | Moonshine-tiny | | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 753.06 |
| 11 | Wav2vec2-base-960h | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 686.00 |
| 12 | Data2vec-audio-base-960h | | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 648.14 |
| 13 | Wav2vec2-conformer-rope-large-960h-ft | | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 607.87 |
| 14 | Moonshine-streaming-small | | Jan 2026 | pwc-dump | 566.33 |
| 15 | Moonshine-base | | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 565.97 |
| 16 | Cohere Transcribe (Mar 2026) | Open · Cohere | Mar 2026 | pwc-dump | 524.88 |
| 17 | Wav2vec2-conformer-rel-pos-large-960h-ft | | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 522.46 |
| 18 | wav2vec 2.0 Large (960h) | Open · Meta AI | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 516.58 |
| 19 | Wav2vec2-large-960h-lv60-self | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 509.32 |
| 20 | Wav2vec2-large-robust-ft-libri-960h | | Apr 2021 | Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code | 503.81 |
| 21 | Owsm_ctc_v3.1_1B | | Jan 2024 | OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code | 502.02 |
| 22 | Hubert-large-ls960-ft | | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 495.86 |
| 23 | Data2vec-audio-large-960h | | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 470.15 |
| 24 | Asr-wav2vec2-librispeech | | Jun 2021 | SpeechBrain: A General-Purpose Speech Toolkit · code | 451.18 |
| 25 | Moonshine Streaming Medium | Open · Useful Sensors | Jan 2026 | pwc-dump | 448.15 |
| 26 | Hubert-xlarge-ls960-ft | | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 361.32 |
| 27 | Whisper-tiny.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 348.12 |
| 28 | Distil-small.en | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 331.89 |
| 29 | Whisper-base.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 320.67 |
| 30 | Distil-medium.en | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 279.73 |
| 31 | Granite Speech 3.3 2B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 270.57 |
| 32 | Whisper-small.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 268.91 |
| 33 | Mms-1b-fl102 | | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 234.42 |
| 34 | Granite Speech 4.1 2B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 231.29 |
| 35 | Mms-1b-all | | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 230.79 |
| 36 | Distil-large-v3 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 214.42 |
| 37 | Distil-large-v2 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 202.95 |
| 38 | Whisper Large v3 Turbo | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 200.19 |
| 39 | Lite-whisper-large-v3-turbo-acc | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 191.71 |
| 40 | Whisper-medium.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 182.13 |
| 41 | Phi-4 Multimodal Instruct | Open · Microsoft | Mar 2025 | Phi-4-Mini Technical Report: Compact yet Powerful Multim… | 151.10 |
| 42 | Qwen3-ASR-1.7B | Open · Alibaba | Jan 2026 | Qwen3-ASR Technical Report · code | 147.93 |
| 43 | Whisper Large v3 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 145.51 |
| 44 | Whisper Large v2 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 144.45 |
| 45 | Whisper Large | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 143.76 |
| 46 | Lite-whisper-large-v3-fast | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 120.76 |
| 47 | Voxtral-Mini-4B-Realtime-2602 | Open · Mistral AI | Feb 2026 | Voxtral Realtime | 93.32 |
| 48 | SYMPHONY-ASR | | Jan 2026 | pwc-dump | 77.56 |
| 49 | VibeVoice-ASR-HF | | Jan 2026 | VIBEVOICE-ASR Technical Report | 51.80 |
| 50 | Asr-conformer-loquacious | | Feb 2025 | pwc-dump | 42.16 |

wer · primary · 52 rows
| # | Model | Org | Submitted | Paper / code | wer |
|---|---|---|---|---|---|
| 01 | Granite Speech 4.1 2B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 5.33 |
| 02 | Cohere Transcribe (Mar 2026) | Open · Cohere | Mar 2026 | pwc-dump | 5.42 |
| 03 | Qwen3-ASR-1.7B | Open · Alibaba | Jan 2026 | Qwen3-ASR Technical Report · code | 5.76 |
| 04 | SYMPHONY-ASR | | Jan 2026 | pwc-dump | 5.91 |
| 05 | Granite Speech 3.3 2B | Open · IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 6.00 |
| 06 | Phi-4 Multimodal Instruct | Open · Microsoft | Mar 2025 | Phi-4-Mini Technical Report: Compact yet Powerful Multim… | 6.02 |
| 07 | Parakeet-tdt-0.6b-v2 | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 6.05 |
| 08 | Moonshine Streaming Medium | Open · Useful Sensors | Jan 2026 | pwc-dump | 6.66 |
| 09 | Whisper Large v3 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 7.44 |
| 10 | Parakeet-tdt_ctc-110m | | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 7.49 |
| 11 | Parakeet-rnnt-0.6b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 7.50 |
| 12 | Distil-large-v3 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 7.52 |
| 13 | Voxtral-Mini-4B-Realtime-2602 | Open · Mistral AI | Feb 2026 | Voxtral Realtime | 7.68 |
| 14 | Parakeet-ctc-0.6b | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 7.69 |
| 15 | Lite-whisper-large-v3-turbo-acc | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 7.77 |
| 16 | VibeVoice-ASR-HF | | Jan 2026 | VIBEVOICE-ASR Technical Report | 7.77 |
| 17 | Whisper Large v3 Turbo | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 7.83 |
| 18 | Whisper Large v2 | Open · OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 7.83 |
| 19 | Moonshine-streaming-small | | Jan 2026 | pwc-dump | 7.84 |
| 20 | Distil-large-v2 | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 7.92 |
| 21 | Whisper Large | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 7.94 |
| 22 | Whisper-medium.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 8.09 |
| 23 | Owsm_ctc_v3.1_1B | | Jan 2024 | OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code | 8.12 |
| 24 | Lite-whisper-large-v3-fast | | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 8.16 |
| 25 | Stt_en_conformer_ctc_large | | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 8.32 |
| 26 | Asr-conformer-loquacious | | Feb 2025 | pwc-dump | 8.48 |
| 27 | Distil-small.en | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 8.57 |
| 28 | Whisper-small.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 8.59 |
| 29 | Distil-medium.en | | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 8.77 |
| 30 | Niagara-38m-batch.en | | Feb 2026 | pwc-dump | 8.91 |
| 31 | Stt_en_fastconformer_ctc_large | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 8.96 |
| 32 | Stt_en_fastconformer_transducer_large | | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 9.06 |
| 33 | Moonshine-base | | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 9.99 |
| 34 | Whisper-base.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 10.32 |
| 35 | Niagara-19m-batch.en | | Feb 2026 | pwc-dump | 10.47 |
| 36 | Stt_en_conformer_ctc_small | | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 11.16 |
| 37 | Moonshine-streaming-tiny | | Jan 2026 | pwc-dump | 12 |
| 38 | Moonshine-tiny | | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 12.65 |
| 39 | Whisper-tiny.en | | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 12.81 |
| 40 | Asr-wav2vec2-librispeech | | Jun 2021 | SpeechBrain: A General-Purpose Speech Toolkit · code | 14.35 |
| 41 | Wav2vec2-large-960h-lv60-self | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 21.27 |
| 42 | Mms-1b-all | | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 22.54 |
| 43 | Hubert-xlarge-ls960-ft | | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 22.55 |
| 44 | Hubert-large-ls960-ft | | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 22.69 |
| 45 | Wav2vec2-large-robust-ft-libri-960h | | Apr 2021 | Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code | 22.93 |
| 46 | Data2vec-audio-large-960h | | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 23.21 |
| 47 | Wav2vec2-conformer-rope-large-960h-ft | | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 23.28 |
| 48 | Wav2vec2-conformer-rel-pos-large-960h-ft | | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 23.29 |
| 49 | wav2vec 2.0 Large (960h) | Open · Meta AI | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 26.77 |
| 50 | Data2vec-audio-base-960h | | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 28.30 |
| 51 | Wav2vec2-base-960h | | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 29.40 |
| 52 | Mms-1b-fl102 | | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 39.80 |

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
§ 03 · Progress

5 steps of state of the art.

Each row below marks a model that broke the previous record on wer. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Lower scores win. Each subsequent entry improved upon the previous best.

SOTA line · wer
  1. May 16, 2020 · Stt_en_conformer_ctc_large · 8.32
  2. Dec 6, 2022 · Whisper Large v3 · OpenAI · 7.44
  3. Apr 13, 2023 · Parakeet-tdt-0.6b-v2 · 6.05
  4. Mar 3, 2025 · Phi-4 Multimodal Instruct · Microsoft · 6.02
  5. May 13, 2025 · Granite Speech 4.1 2B · IBM · 5.33
Fig 3 · SOTA-setting models only. 5 entries span May 2020 to May 2025.
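The SOTA line can be derived from the leaderboard as a running minimum over entries in date order. The sketch below is one way to do that, not necessarily how the page is generated; `sota_steps` is a hypothetical helper, and the sample rows are a subset of the wer table with dates reduced to year and month.

```python
from typing import List, Tuple

Entry = Tuple[str, str, float]  # (submitted "YYYY-MM", model, wer)


def sota_steps(entries: List[Entry]) -> List[Entry]:
    """Return the entries that strictly improved on the best wer seen so far."""
    steps: List[Entry] = []
    best = float("inf")
    # sorted() is stable, so entries sharing a month keep their input order.
    for entry in sorted(entries, key=lambda e: e[0]):
        if entry[2] < best:
            best = entry[2]
            steps.append(entry)
    return steps


sample = [
    ("2026-03", "Cohere Transcribe (Mar 2026)", 5.42),
    ("2020-05", "Stt_en_conformer_ctc_large", 8.32),
    ("2020-06", "Wav2vec2-base-960h", 29.40),
    ("2022-12", "Whisper Large v3", 7.44),
    ("2023-04", "Parakeet-tdt-0.6b-v2", 6.05),
    ("2023-11", "Distil-large-v3", 7.52),
    ("2025-03", "Phi-4 Multimodal Instruct", 6.02),
    ("2025-05", "Granite Speech 4.1 2B", 5.33),
]

steps = sota_steps(sample)
```

On this sample the function keeps the same five models as the list above and drops the three entries that never held the record. With month-level dates, the order of two entries submitted in the same month is ambiguous, so day-level dates would be needed to reproduce the line exactly from the full table.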
§ 04 · Literature

20 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with frozen commit + seed
  • 03 Declared evaluation environment (Python, deps)
  • 04 One row per metric declared by this dataset
  • 05 A contact so we can follow up on discrepancies
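The checklist above could be checked before submitting with something like the sketch below. Everything in it is hypothetical: the `Submission` fields and the `missing_items` helper are illustrative names, not a format the site defines, and the required metrics are simply the two listed on this page (wer and rtfx).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

REQUIRED_METRICS = ("wer", "rtfx")  # the two metrics indexed for this dataset


@dataclass
class Submission:
    checkpoint: str = ""                 # 01: public checkpoint or API endpoint
    script_commit: str = ""              # 02: frozen commit of the reproduction script
    seed: Optional[int] = None           # 02: random seed used for the run
    environment: Dict[str, str] = field(default_factory=dict)  # 03: Python, deps
    scores: Dict[str, float] = field(default_factory=dict)     # 04: one row per metric
    contact: str = ""                    # 05: where to send follow-up questions


def missing_items(sub: Submission) -> List[str]:
    """List the checklist items this submission does not yet satisfy."""
    problems: List[str] = []
    if not sub.checkpoint:
        problems.append("01 checkpoint or API endpoint")
    if not sub.script_commit or sub.seed is None:
        problems.append("02 frozen commit + seed")
    if not sub.environment:
        problems.append("03 evaluation environment")
    absent = [m for m in REQUIRED_METRICS if m not in sub.scores]
    if absent:
        problems.append("04 metric rows: " + ", ".join(absent))
    if not sub.contact:
        problems.append("05 contact")
    return problems
```

An empty list from `missing_items` means all five items are present; it says nothing about whether the reported scores will reproduce.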