Speech Recognition · 2019 · multilingual

Mozilla Common Voice

A massive multilingual dataset of transcribed speech covering diverse demographics and accents. It spans over 100 languages and is updated continuously by the Mozilla Foundation.

Current State of the Art

Whisper Large V3 (OpenAI): 8.4 wer

Common Voice — wer

3 results · 2 SOTA advances · lower is better
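The metric here is word error rate, reported as a percentage where lower is better. As a minimal sketch of how such a score is commonly computed, the function below takes the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and a hypothesis and divides it by the number of reference words. The function name `word_error_rate` is hypothetical, and the sketch assumes plain whitespace tokenization with no text normalization; the page does not say which normalization the listed results used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Because insertions count as errors, a score computed this way can exceed 100 when the hypothesis is much longer than the reference.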

[Chart: wer (y-axis, 8 to 12) by year (x-axis, 2020 to 2024); labelled points: wav2vec 2.0 Large (960h), Whisper Large V3]

wer Progress Over Time

Showing 2 breakthroughs from Jun 2020 to Feb 2025
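One plausible reading of a "breakthrough" is a result that lowers the best score seen so far when results are ordered by date. The sketch below shows that running-minimum filter. The function name `sota_frontier` is hypothetical, and the date assignment is an assumption: the two frontier entries use the milestone dates (Jun 2020, Feb 2025), while Whisper Large-v2 uses the Dec 2022 date from the results table.

```python
from typing import List, Tuple

# (year, month), model name, wer in percent
Result = Tuple[Tuple[int, int], str, float]


def sota_frontier(results: List[Result]) -> List[Result]:
    """Keep only results that strictly improve on the best earlier score."""
    frontier: List[Result] = []
    best = float("inf")
    for entry in sorted(results, key=lambda r: r[0]):  # oldest first
        if entry[2] < best:  # lower wer is better
            best = entry[2]
            frontier.append(entry)
    return frontier


# Dates below are an assumption (see the note above the code).
results: List[Result] = [
    ((2020, 6), "wav2vec 2.0 Large (960h)", 10.5),
    ((2022, 12), "Whisper Large-v2", 11.2),
    ((2025, 2), "Whisper Large V3", 8.4),
]

frontier = sota_frontier(results)

if __name__ == "__main__":
    for (year, month), model, wer in frontier:
        print(f"{year}-{month:02d}  {model}  {wer}")
```

Under these assumptions the filter keeps two of the three results, which matches the "3 results · 2 SOTA advances" count on this page.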

[Chart: wer (y-axis, 8.2 to 10.7) against date (x-axis, Jun 2020 to Feb 2025)]

Key Milestones

Jun 2020: wav2vec 2.0 Large (960h), 10.5
WER (%) on Common Voice 9 English. Source: Papers With Code / wav2vec2 model card

Feb 2025: Whisper Large V3 (Current SOTA), 8.4 (-20.0%)
WER (%) on Common Voice 15 English test set. Source: B-Whisper paper Table 1 baseline, arxiv:2502.11572

Total Improvement: 20.0%
Time Span: 4y 9m
Breakthroughs: 2
Current SOTA: 8.4
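The 20.0% total improvement is consistent with a relative reduction between the two milestone scores, (10.5 - 8.4) / 10.5. The short sketch below reproduces that arithmetic; the helper name `relative_improvement` is hypothetical, and the formula is inferred from the numbers on this page and is not stated by it.

```python
def relative_improvement(baseline: float, current: float) -> float:
    """Percentage reduction of a lower-is-better score relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - current) / baseline


# Milestone scores from this page: first and current SOTA.
improvement = relative_improvement(baseline=10.5, current=8.4)

if __name__ == "__main__":
    print(f"{improvement:.1f}%")
```

The two milestone scores were measured on different releases (Common Voice 9 and Common Voice 15), so this figure compares numbers that were not obtained on identical test sets.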

Top Models Performance Comparison

Top 3 models ranked by wer (lower is better)

Rank  Model                      wer    % of best
1     Whisper Large V3           8.4    100.0%
2     wav2vec 2.0 Large (960h)   10.5   80.0%
3     Whisper Large-v2           11.2   75.0%

Best Score: 8.4
Top Model: Whisper Large V3
Models Compared: 3
Score Range: 2.8
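The "% of best" figures and the score range can be reproduced from the three listed scores. For a lower-is-better metric, the percentages match best score divided by model score (8.4 / 10.5 = 80.0%), and the range matches worst minus best. This is a sketch of that arithmetic inferred from the numbers shown; the page does not state its formula, and the variable names are illustrative.

```python
# Scores copied from the comparison on this page (wer in percent).
scores = {
    "Whisper Large V3": 8.4,
    "wav2vec 2.0 Large (960h)": 10.5,
    "Whisper Large-v2": 11.2,
}

best = min(scores.values())   # lower is better
worst = max(scores.values())

# Rank models from best to worst and compute each one's share of the best score.
rows = []
ranked = sorted(scores.items(), key=lambda item: item[1])
for rank, (model, wer) in enumerate(ranked, start=1):
    pct_of_best = 100.0 * best / wer
    rows.append((rank, model, wer, pct_of_best))

score_range = worst - best

if __name__ == "__main__":
    for rank, model, wer, pct in rows:
        print(f"{rank}  {model:<26} {wer:>5.1f}  {pct:>6.1f}%")
    print(f"Score range: {score_range:.1f}")
```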

wer (Primary)

#  Model                                   Organization  Score  Date
1  Whisper Large V3 (Open Source)          OpenAI        8.4    Dec 2022
2  wav2vec 2.0 Large (960h) (Open Source)  Meta AI       10.5   Jun 2020
3  Whisper Large-v2 (Open Source)          OpenAI        11.2   Dec 2022
