VoxPopuli is a large-scale multilingual speech corpus derived from European Parliament event recordings, providing labelled ASR data for 18 European languages plus large quantities of unlabelled audio for self-supervised pre-training.
Word error rate (WER) is the reported evaluation metric for VoxPopuli. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Lower is better: a WER of 0 means the transcript matches the reference exactly.
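For readers unfamiliar with the metric, the sketch below shows one common way WER is computed: the word-level edit distance (substitutions, deletions, and insertions) between a hypothesis and a reference transcript, divided by the number of reference words. The function name `word_error_rate` is hypothetical. Splitting on whitespace is a simplifying assumption: published scores usually apply text normalisation (casing, punctuation) first, and that step varies between papers, so this is an illustration and not the scoring procedure behind any listed result.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumption: words are whitespace-separated and already normalised.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, so it is an error rate and not a bounded accuracy score.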