Who leads the Open ASR Leaderboard benchmark?

Granite Speech 4.1 2B currently leads the Open ASR Leaderboard with a word error rate (wer) of 5.33; lower is better.

What is the state-of-the-art score on Open ASR Leaderboard?

The state-of-the-art result on Open ASR Leaderboard is 5.33 (wer), achieved by Granite Speech 4.1 2B as of 2026.

How many models are tracked on Open ASR Leaderboard?

Codesota tracks 52 models on Open ASR Leaderboard across 2 metrics.

When was the Open ASR Leaderboard last updated?

The Open ASR Leaderboard on Codesota includes results through 2026; the earliest tracked result dates from 2020.


Speech Recognition · benchmark dataset · 2023 · EN

HF Open ASR Leaderboard (aggregate).

Name: HF Open ASR Leaderboard (aggregate) Benchmark Results
Creator: Codesota
Published: 2020-01-01
License: https://creativecommons.org/licenses/by/4.0/

The Hugging Face Open ASR Leaderboard aggregates word error rate (WER) and inverse real-time factor (RTFx) across LibriSpeech, AMI, Earnings-22, GigaSpeech, SPGISpeech, TED-LIUM, and VoxPopuli to give a single composite score for English ASR systems. It is the de facto standard leaderboard for modern ASR.
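As a rough illustration of how a composite score like this can be produced, the sketch below computes a corpus-level word error rate with a word-level edit distance and then averages per-dataset values. Two things are assumptions rather than facts from this page: that the composite is an unweighted mean over datasets, and that transcripts are already normalised (the real evaluation applies text normalisation that is omitted here). The toy sentences and the per-dataset numbers are made up.

```python
from typing import Dict, List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """WER in percent: total word edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    return 100.0 * errors / words


def aggregate_wer(per_dataset: Dict[str, float]) -> float:
    """Assumption: the composite is an unweighted mean of per-dataset WERs."""
    return sum(per_dataset.values()) / len(per_dataset)


# Toy (reference, hypothesis) pairs: one deleted word, one substituted word.
toy = [("the cat sat on the mat", "the cat sat on mat"),
       ("hello world", "hello word")]
print(corpus_wer(toy))                                     # 25.0
print(aggregate_wer({"librispeech": 4.0, "ami": 8.0}))     # 6.0 (made-up inputs)
```

Counting errors over the whole corpus, rather than averaging per-utterance rates, keeps short utterances from dominating the result.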


§ 01 · Leaderboard

Best published scores.

102 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.

Primary: wer · lower is better
All metrics: rtfx, wer
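The ordering rule described above (lower wer first, ties broken by submission date) can be sketched as a two-key sort. The direction of the tie-break, earlier submission first, is an assumption; it is consistent with the two 7.77 rows in the wer table. The `Row` type is hypothetical, and because the table only gives month and year, the day of month is a placeholder.

```python
from datetime import date
from typing import List, NamedTuple


class Row(NamedTuple):
    model: str
    submitted: date
    wer: float


def rank(rows: List[Row]) -> List[Row]:
    # Lower wer ranks first; on equal wer, the earlier submission ranks first
    # (assumed tie-break direction).
    return sorted(rows, key=lambda r: (r.wer, r.submitted))


rows = [
    Row("VibeVoice-ASR-HF", date(2026, 1, 1), 7.77),
    Row("Lite-whisper-large-v3-turbo-acc", date(2025, 2, 1), 7.77),
    Row("Granite Speech 4.1 2B", date(2025, 5, 1), 5.33),
]
ranked = rank(rows)
for position, row in enumerate(ranked, start=1):
    print(f"{position:02d}", row.model, row.wer)
```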

rtfx

50 rows

#	Model	Org	Submitted	Paper / code	rtfx
01	Stt_en_fastconformer_ctc_large	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	6399.25
02	Stt_en_conformer_ctc_small	—	May 2020	Conformer: Convolution-augmented Transformer for Speech … · code	5686.90
03	Parakeet-tdt_ctc-110m	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	5345.14
04	Stt_en_conformer_ctc_large	—	May 2020	Conformer: Convolution-augmented Transformer for Speech … · code	4295.01
05	Parakeet-ctc-0.6b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	4281.53
06	Stt_en_fastconformer_transducer_large	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	4097.43
07	Parakeet-tdt-0.6b-v2	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	3386.02
08	Parakeet-rnnt-0.6b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	2815.72
09	Moonshine-streaming-tiny	—	Jan 2026	pwc-dump	847.20
10	Moonshine-tiny	—	Oct 2024	Moonshine: Speech Recognition for Live Transcription and… · code	753.06
11	Wav2vec2-base-960h	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	686.00
12	Data2vec-audio-base-960h	—	Feb 2022	data2vec: A General Framework for Self-supervised Learni… · code	648.14
13	Wav2vec2-conformer-rope-large-960h-ft	—	Oct 2020	fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code	607.87
14	Moonshine-streaming-small	—	Jan 2026	pwc-dump	566.33
15	Moonshine-base	—	Oct 2024	Moonshine: Speech Recognition for Live Transcription and… · code	565.97
16	Cohere Transcribe (Mar 2026) · Open	Cohere	Mar 2026	pwc-dump	524.88
17	Wav2vec2-conformer-rel-pos-large-960h-ft	—	Oct 2020	fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code	522.46
18	wav2vec 2.0 Large (960h) · Open	Meta AI	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	516.58
19	Wav2vec2-large-960h-lv60-self	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	509.32
20	Wav2vec2-large-robust-ft-libri-960h	—	Apr 2021	Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code	503.81
21	Owsm_ctc_v3.1_1B	—	Jan 2024	OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code	502.02
22	Hubert-large-ls960-ft	—	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b… · code	495.86
23	Data2vec-audio-large-960h	—	Feb 2022	data2vec: A General Framework for Self-supervised Learni… · code	470.15
24	Asr-wav2vec2-librispeech	—	Jun 2021	SpeechBrain: A General-Purpose Speech Toolkit · code	451.18
25	Moonshine Streaming Medium · Open	Useful Sensors	Jan 2026	pwc-dump	448.15
26	Hubert-xlarge-ls960-ft	—	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b… · code	361.32
27	Whisper-tiny.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	348.12
28	Distil-small.en	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	331.89
29	Whisper-base.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	320.67
30	Distil-medium.en	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	279.73
31	Granite Speech 3.3 2B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	270.57
32	Whisper-small.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	268.91
33	Mms-1b-fl102	—	May 2023	Scaling Speech Technology to 1,000+ Languages · code	234.42
34	Granite Speech 4.1 2B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	231.29
35	Mms-1b-all	—	May 2023	Scaling Speech Technology to 1,000+ Languages · code	230.79
36	Distil-large-v3	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	214.42
37	Distil-large-v2	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	202.95
38	Whisper Large v3 Turbo · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	200.19
39	Lite-whisper-large-v3-turbo-acc	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	191.71
40	Whisper-medium.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	182.13
41	Phi-4 Multimodal Instruct · Open	Microsoft	Mar 2025	Phi-4-Mini Technical Report: Compact yet Powerful Multim…	151.10
42	Qwen3-ASR-1.7B · Open	Alibaba	Jan 2026	Qwen3-ASR Technical Report · code	147.93
43	Whisper Large v3 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	145.51
44	Whisper Large v2 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	144.45
45	Whisper Large	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	143.76
46	Lite-whisper-large-v3-fast	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	120.76
47	Voxtral-Mini-4B-Realtime-2602 · Open	Mistral AI	Feb 2026	Voxtral Realtime	93.32
48	SYMPHONY-ASR	—	Jan 2026	pwc-dump	77.56
49	VibeVoice-ASR-HF	—	Jan 2026	VIBEVOICE-ASR Technical Report	51.80
50	Asr-conformer-loquacious	—	Feb 2025	pwc-dump	42.16
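For readers unfamiliar with the rtfx column, the sketch below assumes it is the inverse real-time factor as reported by the Hugging Face leaderboard: seconds of audio transcribed per second of compute, so higher means faster. The function name and the timings are hypothetical, and real measurements depend heavily on hardware and batch size.

```python
def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: audio duration divided by processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical timings: one hour of audio transcribed in 18 seconds.
print(rtfx(3600.0, 18.0))  # 200.0
```

An rtfx of 200 on this reading means the model processes audio 200 times faster than real time on the benchmark hardware.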

wer · primary

52 rows

#	Model	Org	Submitted	Paper / code	wer
01	Granite Speech 4.1 2B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	5.33
02	Cohere Transcribe (Mar 2026) · Open	Cohere	Mar 2026	pwc-dump	5.42
03	Qwen3-ASR-1.7B · Open	Alibaba	Jan 2026	Qwen3-ASR Technical Report · code	5.76
04	SYMPHONY-ASR	—	Jan 2026	pwc-dump	5.91
05	Granite Speech 3.3 2B · Open	IBM	May 2025	Granite-speech: open-source speech-aware LLMs with stron…	6.00
06	Phi-4 Multimodal Instruct · Open	Microsoft	Mar 2025	Phi-4-Mini Technical Report: Compact yet Powerful Multim…	6.02
07	Parakeet-tdt-0.6b-v2	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	6.05
08	Moonshine Streaming Medium · Open	Useful Sensors	Jan 2026	pwc-dump	6.66
09	Whisper Large v3 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	7.44
10	Parakeet-tdt_ctc-110m	—	Apr 2023	Efficient Sequence Transduction by Jointly Predicting To… · code	7.49
11	Parakeet-rnnt-0.6b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	7.50
12	Distil-large-v3	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	7.52
13	Voxtral-Mini-4B-Realtime-2602 · Open	Mistral AI	Feb 2026	Voxtral Realtime	7.68
14	Parakeet-ctc-0.6b	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	7.69
15	Lite-whisper-large-v3-turbo-acc	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	7.77
16	VibeVoice-ASR-HF	—	Jan 2026	VIBEVOICE-ASR Technical Report	7.77
17	Whisper Large v3 Turbo · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	7.83
18	Whisper Large v2 · Open	OpenAI	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	7.83
19	Moonshine-streaming-small	—	Jan 2026	pwc-dump	7.84
20	Distil-large-v2	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	7.92
21	Whisper Large	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	7.94
22	Whisper-medium.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	8.09
23	Owsm_ctc_v3.1_1B	—	Jan 2024	OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code	8.12
24	Lite-whisper-large-v3-fast	—	Feb 2025	LiteASR: Efficient Automatic Speech Recognition with Low… · code	8.16
25	Stt_en_conformer_ctc_large	—	May 2020	Conformer: Convolution-augmented Transformer for Speech … · code	8.32
26	Asr-conformer-loquacious	—	Feb 2025	pwc-dump	8.48
27	Distil-small.en	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	8.57
28	Whisper-small.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	8.59
29	Distil-medium.en	—	Nov 2023	Distil-Whisper: Robust Knowledge Distillation via Large-… · code	8.77
30	Niagara-38m-batch.en	—	Feb 2026	pwc-dump	8.91
31	Stt_en_fastconformer_ctc_large	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	8.96
32	Stt_en_fastconformer_transducer_large	—	May 2023	Fast Conformer with Linearly Scalable Attention for Effi…	9.06
33	Moonshine-base	—	Oct 2024	Moonshine: Speech Recognition for Live Transcription and… · code	9.99
34	Whisper-base.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	10.32
35	Niagara-19m-batch.en	—	Feb 2026	pwc-dump	10.47
36	Stt_en_conformer_ctc_small	—	May 2020	Conformer: Convolution-augmented Transformer for Speech … · code	11.16
37	Moonshine-streaming-tiny	—	Jan 2026	pwc-dump	12
38	Moonshine-tiny	—	Oct 2024	Moonshine: Speech Recognition for Live Transcription and… · code	12.65
39	Whisper-tiny.en	—	Dec 2022	Robust Speech Recognition via Large-Scale Weak Supervisi… · code	12.81
40	Asr-wav2vec2-librispeech	—	Jun 2021	SpeechBrain: A General-Purpose Speech Toolkit · code	14.35
41	Wav2vec2-large-960h-lv60-self	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	21.27
42	Mms-1b-all	—	May 2023	Scaling Speech Technology to 1,000+ Languages · code	22.54
43	Hubert-xlarge-ls960-ft	—	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b… · code	22.55
44	Hubert-large-ls960-ft	—	Jun 2021	HuBERT: Self-Supervised Speech Representation Learning b… · code	22.69
45	Wav2vec2-large-robust-ft-libri-960h	—	Apr 2021	Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code	22.93
46	Data2vec-audio-large-960h	—	Feb 2022	data2vec: A General Framework for Self-supervised Learni… · code	23.21
47	Wav2vec2-conformer-rope-large-960h-ft	—	Oct 2020	fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code	23.28
48	Wav2vec2-conformer-rel-pos-large-960h-ft	—	Oct 2020	fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code	23.29
49	wav2vec 2.0 Large (960h) · Open	Meta AI	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	26.77
50	Data2vec-audio-base-960h	—	Feb 2022	data2vec: A General Framework for Self-supervised Learni… · code	28.30
51	Wav2vec2-base-960h	—	Jun 2020	wav2vec 2.0: A Framework for Self-Supervised Learning of… · code	29.40
52	Mms-1b-fl102	—	May 2023	Scaling Speech Technology to 1,000+ Languages · code	39.80

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.

§ 03 · Progress

5 steps of state of the art.

Each row below marks a model that broke the previous record on wer. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Lower scores win. Each subsequent entry improved upon the previous best.

SOTA line · wer

Date	Model	Org	wer
May 16, 2020	Stt_en_conformer_ctc_large	—	8.32
Dec 6, 2022	Whisper Large v3	OpenAI	7.44
Apr 13, 2023	Parakeet-tdt-0.6b-v2	—	6.05
Mar 3, 2025	Phi-4 Multimodal Instruct	Microsoft	6.02
May 13, 2025	Granite Speech 4.1 2B	IBM	5.33

Fig 3 · SOTA-setting models only. 5 entries span May 2020 → May 2025.
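The selection of SOTA-setting entries can be reproduced as a running minimum over results ordered by date. The sketch below is one plausible reading of that rule, not Codesota's actual code. It uses the five dated records from the SOTA line plus three non-record rows from the leaderboard; for those three, the table gives only month and year, so the day of month is a placeholder.

```python
from datetime import date
from typing import List, Tuple

Result = Tuple[date, str, float]  # (submitted, model, wer)


def sota_steps(results: List[Result]) -> List[Result]:
    """Keep only results that strictly beat every earlier result (lower wins)."""
    steps: List[Result] = []
    best = float("inf")
    # Sort by date, then by wer, so the better of two same-day results is seen first.
    for submitted, model, wer in sorted(results, key=lambda r: (r[0], r[2])):
        if wer < best:
            best = wer
            steps.append((submitted, model, wer))
    return steps


results: List[Result] = [
    (date(2020, 5, 16), "Stt_en_conformer_ctc_large", 8.32),
    (date(2020, 6, 1), "Wav2vec2-base-960h", 29.40),      # placeholder day
    (date(2022, 12, 6), "Whisper Large v3", 7.44),
    (date(2023, 4, 13), "Parakeet-tdt-0.6b-v2", 6.05),
    (date(2023, 11, 1), "Distil-large-v3", 7.52),         # placeholder day
    (date(2025, 3, 3), "Phi-4 Multimodal Instruct", 6.02),
    (date(2025, 5, 13), "Granite Speech 4.1 2B", 5.33),
    (date(2026, 1, 1), "Qwen3-ASR-1.7B", 5.76),           # placeholder day
]
for submitted, model, wer in sota_steps(results):
    print(submitted.isoformat(), model, wer)
```

On this subset the function returns the same five models as the SOTA line; the three later, weaker results are filtered out.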

§ 04 · Literature

20 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

Voxtral Realtime
Feb 2026 · Voxtral-Mini-4B-Realtime-2602
arXiv ↗

Qwen3-ASR Technical Report
Jan 2026 · Qwen3-ASR-1.7B
arXiv ↗ · Code

VIBEVOICE-ASR Technical Report
Jan 2026 · VibeVoice-ASR-HF
arXiv ↗

Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities
May 2025 · Granite Speech 3.3 2B, Granite Speech 4.1 2B
arXiv ↗

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Mar 2025 · Phi-4 Multimodal Instruct
arXiv ↗

LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
Feb 2025 · Lite-whisper-large-v3-turbo-acc, Lite-whisper-large-v3-fast
arXiv ↗ · Code

Moonshine: Speech Recognition for Live Transcription and Voice Commands
Oct 2024 · Moonshine-tiny, Moonshine-base
arXiv ↗ · Code

OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Jan 2024 · Owsm_ctc_v3.1_1B
arXiv ↗ · Code

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Nov 2023 · Distil-small.en, Distil-medium.en, Distil-large-v3 +1
arXiv ↗ · Code

Scaling Speech Technology to 1,000+ Languages
May 2023 · Mms-1b-fl102, Mms-1b-all
arXiv ↗ · Code

Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
May 2023 · Stt_en_fastconformer_ctc_large, Parakeet-ctc-0.6b, Stt_en_fastconformer_transducer_large +1
arXiv ↗

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Apr 2023 · Parakeet-tdt_ctc-110m, Parakeet-tdt-0.6b-v2
arXiv ↗ · Code

Robust Speech Recognition via Large-Scale Weak Supervision
Dec 2022 · Whisper-tiny.en, Whisper-base.en, Whisper-small.en +5
arXiv ↗ · Code

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Feb 2022 · Data2vec-audio-base-960h, Data2vec-audio-large-960h
arXiv ↗ · Code

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Jun 2021 · Hubert-large-ls960-ft, Hubert-xlarge-ls960-ft
arXiv ↗ · Code

SpeechBrain: A General-Purpose Speech Toolkit
Jun 2021 · Asr-wav2vec2-librispeech
arXiv ↗ · Code

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
Apr 2021 · Wav2vec2-large-robust-ft-libri-960h
arXiv ↗ · Code

fairseq S2T: Fast Speech-to-Text Modeling with fairseq
Oct 2020 · Wav2vec2-conformer-rope-large-960h-ft, Wav2vec2-conformer-rel-pos-large-960h-ft
arXiv ↗ · Code

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Jun 2020 · Wav2vec2-base-960h, wav2vec 2.0 Large (960h), Wav2vec2-large-960h-lv60-self
arXiv ↗ · Code

Conformer: Convolution-augmented Transformer for Speech Recognition
May 2020 · Stt_en_conformer_ctc_small, Stt_en_conformer_ctc_large
arXiv ↗ · Code

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.


What a submission needs

01 A public checkpoint or API endpoint
02 A reproduction script with frozen commit + seed
03 Declared evaluation environment (Python, deps)
04 One row per metric declared by this dataset
05 A contact so we can follow up on discrepancies
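The checklist above can be pre-checked locally before submitting. The sketch below is purely illustrative: Codesota's actual submission format is not described on this page, so the `Submission` class, its field names, and the placeholder values are all hypothetical. Only the two metric names, wer and rtfx, come from the page.

```python
from dataclasses import dataclass
from typing import Dict, List

REQUIRED_METRICS = ("wer", "rtfx")  # the two metrics listed for this dataset


@dataclass
class Submission:
    checkpoint_or_endpoint: str   # item 01
    repro_commit: str             # item 02: frozen commit of the reproduction script
    seed: int                     # item 02
    environment: Dict[str, str]   # item 03: e.g. Python version and pinned deps
    scores: Dict[str, float]      # item 04: one entry per metric
    contact: str                  # item 05


def missing_items(sub: Submission) -> List[str]:
    """Return the checklist items this submission still lacks."""
    problems: List[str] = []
    if not sub.checkpoint_or_endpoint:
        problems.append("checkpoint or API endpoint")
    if not sub.repro_commit:
        problems.append("frozen commit")
    if not sub.environment:
        problems.append("evaluation environment")
    for metric in REQUIRED_METRICS:
        if metric not in sub.scores:
            problems.append(f"score for {metric}")
    if not sub.contact:
        problems.append("contact")
    return problems


example = Submission(
    checkpoint_or_endpoint="https://example.org/my-asr-checkpoint",  # placeholder
    repro_commit="0123abc",                                          # placeholder
    seed=0,
    environment={"python": "3.11"},
    scores={"wer": 6.5},              # made-up value; the rtfx row is missing
    contact="someone@example.org",
)
print(missing_items(example))  # ['score for rtfx']
```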
