Audio · audio-classification

Audio Classification.

Classifying audio signals into predefined categories, such as music genres, environmental sound events, or speaker identities.

Datasets: 6
Results: 5
Canonical metric: mAP (mean average precision)
§ 02 · Canonical benchmark

The reference dataset.

AudioSet

2M+ human-labeled 10-second YouTube video clips covering 632 audio event classes.

Primary metric: mAP (mean average precision)
§ 03 · Top 10

Leading models.

Leading models on AudioSet.

#   Model              mAP     Year   Source
1   AST (Ensemble-M)   0.485   2021   paper
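
For readers unfamiliar with the metric, the sketch below shows one common way to compute mean average precision for multi-label audio tagging: average precision is computed per class from ranked scores, then averaged over classes. The function names and the toy data are illustrative assumptions, and this is not necessarily the exact evaluation protocol behind the score listed above.

```python
from typing import List, Optional, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """AP for one class: mean of precision@k taken at the rank of each positive clip.

    Returns None when the class has no positive clips (AP is undefined there).
    Ties in score keep their input order because Python's sort is stable.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else None


def mean_average_precision(y_true: List[Sequence[int]],
                           y_score: List[Sequence[float]]) -> float:
    """Unweighted mean of per-class AP over classes with at least one positive.

    y_true[i][c] is 1 if clip i carries class c, else 0.
    y_score[i][c] is the model's score for class c on clip i.
    """
    num_classes = len(y_true[0])
    per_class = []
    for c in range(num_classes):
        ap = average_precision([row[c] for row in y_true],
                               [row[c] for row in y_score])
        if ap is not None:
            per_class.append(ap)
    if not per_class:
        raise ValueError("no class has a positive example")
    return sum(per_class) / len(per_class)


# Toy data (hypothetical): 4 clips, 2 classes.
toy_true = [[1, 0], [0, 1], [1, 0], [0, 0]]
toy_score = [[0.9, 0.2], [0.8, 0.9], [0.7, 0.1], [0.1, 0.3]]
toy_map = mean_average_precision(toy_true, toy_score)
```

Skipping classes with no positives is a design choice made here for the sketch; other evaluation scripts may handle such classes differently, which can shift the reported number slightly.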

§ 04 · All datasets

Tracked datasets.

6 datasets tracked for this task.

AudioSet (canonical): 1 result · mAP · Top: AST (Ensemble-M) 0.485
ESC-50: 4 results · accuracy · Top: BEATs (iter3+) 98.1%
ESC-50: 0 results
GTZAN Genre: 0 results
Speech Commands V2: 0 results
VocalSound: 0 results
§ 05 · Related tasks

Other tasks in Audio.

Audio-Language Models · Automatic Speech Recognition · Text-to-speech · Voice cloning