Codesota · Benchmark · MusicCaps

MusicCaps.

Music generation evaluated on 5.5K expert-annotated music clips

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.


FAD

FAD (Fréchet Audio Distance) is the reported evaluation metric for MusicCaps. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
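As a rough illustration of what a Fréchet-distance score measures, the sketch below fits a Gaussian to each of two sets of audio embeddings and computes the Fréchet distance between them. It is a minimal sketch, not the evaluation code behind any score on this page: the function name `frechet_distance` and the random stand-in arrays are hypothetical, and published FAD results use embeddings from a pretrained audio model (the FAD_vgg entries refer to VGGish), a step that is not shown here.

```python
import numpy as np


def frechet_distance(emb_ref: np.ndarray, emb_gen: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two embedding sets.

    Each input has shape (num_clips, embedding_dim) with embedding_dim >= 2.
    """
    mu_r, mu_g = emb_ref.mean(axis=0), emb_gen.mean(axis=0)
    cov_r = np.cov(emb_ref, rowvar=False)
    cov_g = np.cov(emb_gen, rowvar=False)

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g. They are real and non-negative in exact
    # arithmetic, so small numerical noise is clipped away.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


# Stand-in data (assumption): random vectors instead of real audio embeddings.
rng = np.random.default_rng(0)
ref = rng.normal(size=(500, 4))
gen = ref + 3.0  # same spread, mean shifted by 3 in each of 4 dimensions

print(frechet_distance(ref, ref))  # approximately 0: identical distributions
print(frechet_distance(ref, gen))  # approximately 36 = 4 * 3**2, the squared mean shift
```

The distance is zero when the two embedding distributions coincide and grows as their means or covariances diverge.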

Lower is better

Trust tiers for FAD: verified, paper, vendor, community, unverified

| Rank | Model | Trust | FAD | Year | Notes |
|---|---|---|---|---|---|
| 01 | Noise2Music | paper | 2.13 | 2026 | Noise2Music waveform model, MusicCaps FAD_vgg. Table 3. |
| 02 | AudioLDM 2-Full | verified | 3.13 | 2024 | AudioLDM 2-Full (Liu et al., IEEE/ACM TASLP 2024). Best FAD among the models in the paper's Table III on the MusicCaps evaluation set. |
| 03 | AudioLDM-M | verified | 3.20 | 2023 | AudioLDM medium (Liu et al., ICML 2023). FAD on MusicCaps. Reproduced in AudioLDM 2 Table III. |
| 04 | MusicGen Large | paper | 3.80 | 2026 | MusicGen 3.3B, MusicCaps. Table 3, FAD_vgg. |
| 05 | MusicLM | verified | 4.00 | 2023 | MusicLM (Agostinelli et al., Google, 2023). FAD on MusicCaps. Reported in AudioLDM 2 Table III (not reproduced). |
| 06 | AudioLDM 2-MSD | verified | 4.47 | 2024 | AudioLDM 2-MSD (MagnaTagATune/Million Song Dataset variant). FAD on MusicCaps. Table III in paper. |
| 07 | MusicGen-Medium | verified | 4.89 | 2023 | MusicGen-Medium (Copet et al., Meta AI, NeurIPS 2023). FAD on MusicCaps. Reproduced result in AudioLDM 2 Table III. |
