Audio

Research on processing, understanding, and generating audio signals, including speech recognition, music generation, sound classification, and audio synthesis.

5 tasks, 52 datasets, 19 results

Tasks & Benchmarks

Text-to-speech

LJ Speech (2017): 4.61 MOS, VALL-E 2
VCTK (2019): 4.36 MOS, NaturalSpeech 3

Audio Classification

AudioSet (2017): 0.48 mAP, AST (Ensemble-M)
ESC-50 (2015): 98.1 accuracy, BEATs (iter3+)

Voice cloning

LibriTTS test-clean (Zero-Shot TTS) (2019): 5.9 WER, VALL-E

Automatic Speech Recognition

CoVoST 2 (en→zh)
CosyVoice3 Cross-Lingual Test Set zh to en
MiniMax Multilingual Test Set - Chinese
Open ASR Leaderboard
SEED Seed-TTS test-zh
VoiceBench Overall


Audio Benchmarks - CodeSOTA