| 01 | Stt_en_fastconformer_ctc_large | — | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 6399.25 |
| 02 | Stt_en_conformer_ctc_small | — | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 5686.90 |
| 03 | Parakeet-tdt_ctc-110m | — | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 5345.14 |
| 04 | Stt_en_conformer_ctc_large | — | May 2020 | Conformer: Convolution-augmented Transformer for Speech … · code | 4295.01 |
| 05 | Parakeet-ctc-0.6b | — | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 4281.53 |
| 06 | Stt_en_fastconformer_transducer_large | — | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 4097.43 |
| 07 | Parakeet-tdt-0.6b-v2 | — | Apr 2023 | Efficient Sequence Transduction by Jointly Predicting To… · code | 3386.02 |
| 08 | Parakeet-rnnt-0.6b | — | May 2023 | Fast Conformer with Linearly Scalable Attention for Effi… | 2815.72 |
| 09 | Moonshine-streaming-tiny | — | Jan 2026 | pwc-dump | 847.20 |
| 10 | Moonshine-tiny | — | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 753.06 |
| 11 | Wav2vec2-base-960h | — | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 686.00 |
| 12 | Data2vec-audio-base-960h | — | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 648.14 |
| 13 | Wav2vec2-conformer-rope-large-960h-ft | — | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 607.87 |
| 14 | Moonshine-streaming-small | — | Jan 2026 | pwc-dump | 566.33 |
| 15 | Moonshine-base | — | Oct 2024 | Moonshine: Speech Recognition for Live Transcription and… · code | 565.97 |
| 16 | Cohere Transcribe (Mar 2026) (Open) | Cohere | Mar 2026 | pwc-dump | 524.88 |
| 17 | Wav2vec2-conformer-rel-pos-large-960h-ft | — | Oct 2020 | fairseq S2T: Fast Speech-to-Text Modeling with fairseq · code | 522.46 |
| 18 | wav2vec 2.0 Large (960h) (Open) | Meta AI | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 516.58 |
| 19 | Wav2vec2-large-960h-lv60-self | — | Jun 2020 | wav2vec 2.0: A Framework for Self-Supervised Learning of… · code | 509.32 |
| 20 | Wav2vec2-large-robust-ft-libri-960h | — | Apr 2021 | Robust wav2vec 2.0: Analyzing Domain Shift in Self-Super… · code | 503.81 |
| 21 | Owsm_ctc_v3.1_1B | — | Jan 2024 | OWSM v3.1: Better and Faster Open Whisper-Style Speech M… · code | 502.02 |
| 22 | Hubert-large-ls960-ft | — | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 495.86 |
| 23 | Data2vec-audio-large-960h | — | Feb 2022 | data2vec: A General Framework for Self-supervised Learni… · code | 470.15 |
| 24 | Asr-wav2vec2-librispeech | — | Jun 2021 | SpeechBrain: A General-Purpose Speech Toolkit · code | 451.18 |
| 25 | Moonshine Streaming Medium (Open) | Useful Sensors | Jan 2026 | pwc-dump | 448.15 |
| 26 | Hubert-xlarge-ls960-ft | — | Jun 2021 | HuBERT: Self-Supervised Speech Representation Learning b… · code | 361.32 |
| 27 | Whisper-tiny.en | — | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 348.12 |
| 28 | Distil-small.en | — | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 331.89 |
| 29 | Whisper-base.en | — | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 320.67 |
| 30 | Distil-medium.en | — | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 279.73 |
| 31 | Granite Speech 3.3 2B (Open) | IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 270.57 |
| 32 | Whisper-small.en | — | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 268.91 |
| 33 | Mms-1b-fl102 | — | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 234.42 |
| 34 | Granite Speech 4.1 2B (Open) | IBM | May 2025 | Granite-speech: open-source speech-aware LLMs with stron… | 231.29 |
| 35 | Mms-1b-all | — | May 2023 | Scaling Speech Technology to 1,000+ Languages · code | 230.79 |
| 36 | Distil-large-v3 | — | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 214.42 |
| 37 | Distil-large-v2 | — | Nov 2023 | Distil-Whisper: Robust Knowledge Distillation via Large-… · code | 202.95 |
| 38 | Whisper Large v3 Turbo (Open) | OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 200.19 |
| 39 | Lite-whisper-large-v3-turbo-acc | — | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 191.71 |
| 40 | Whisper-medium.en | — | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 182.13 |
| 41 | Phi-4 Multimodal Instruct (Open) | Microsoft | Mar 2025 | Phi-4-Mini Technical Report: Compact yet Powerful Multim… | 151.10 |
| 42 | Qwen3-ASR-1.7B (Open) | Alibaba | Jan 2026 | Qwen3-ASR Technical Report · code | 147.93 |
| 43 | Whisper Large v3 (Open) | OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 145.51 |
| 44 | Whisper Large v2 (Open) | OpenAI | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 144.45 |
| 45 | Whisper Large | — | Dec 2022 | Robust Speech Recognition via Large-Scale Weak Supervisi… · code | 143.76 |
| 46 | Lite-whisper-large-v3-fast | — | Feb 2025 | LiteASR: Efficient Automatic Speech Recognition with Low… · code | 120.76 |
| 47 | Voxtral-Mini-4B-Realtime-2602 (Open) | Mistral AI | Feb 2026 | Voxtral Realtime | 93.32 |
| 48 | SYMPHONY-ASR | — | Jan 2026 | pwc-dump | 77.56 |
| 49 | VibeVoice-ASR-HF | — | Jan 2026 | VIBEVOICE-ASR Technical Report | 51.80 |
| 50 | Asr-conformer-loquacious | — | Feb 2025 | pwc-dump | 42.16 |