Model card
Voicebox.
Meta AI · proprietary · 330M params · Flow matching (non-autoregressive)
Le et al., arXiv:2306.15687.
§ 02 · Benchmarks
Every benchmark Voicebox has a recorded score for.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | LibriTTS test-clean (Zero-Shot TTS) | Audio · Voice cloning | wer | 1.9% | #2 | — | source ↗ |
| 02 | LJ Speech | Audio · Text-to-speech | mos | 4.3 | #4 | 2023-06-27 | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric; #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
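The ranking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the site's actual implementation: the `Score` record, the `rank_of` and `sort_rows` helpers, the `higher_is_better` flag, and the placeholder models and values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Score:
    """Hypothetical record for one leaderboard entry."""
    model: str
    benchmark: str
    metric: str
    value: float
    higher_is_better: bool  # e.g. False for WER, True for MOS
    scored_on: Optional[date] = None


def rank_of(model: str, benchmark: str, metric: str, scores: List[Score]) -> int:
    """1-based position of `model` among models scored on the same benchmark + metric."""
    peers = [s for s in scores if s.benchmark == benchmark and s.metric == metric]
    target = next(s for s in peers if s.model == model)
    if target.higher_is_better:
        better = sum(1 for s in peers if s.value > target.value)
    else:
        better = sum(1 for s in peers if s.value < target.value)
    return better + 1  # tied values share the same rank


def sort_rows(rows: List[Tuple[int, Optional[date]]]) -> List[Tuple[int, Optional[date]]]:
    """Sort (rank, date) rows by rank, then newest date first; undated rows come last within a rank."""
    return sorted(rows, key=lambda r: (r[0], -(r[1].toordinal() if r[1] else 0)))


# Placeholder data, assumed for illustration only (not real results).
scores = [
    Score("model-a", "demo-bench", "wer", 1.5, higher_is_better=False),
    Score("model-b", "demo-bench", "wer", 1.9, higher_is_better=False),
    Score("model-c", "demo-bench", "wer", 2.4, higher_is_better=False),
]
print(rank_of("model-b", "demo-bench", "wer", scores))
```

The direction flag matters because the two metrics in the table point in opposite directions: lower is better for word error rate, higher is better for mean opinion score.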
§ 04 · Papers
1 paper with results for Voicebox.
- 2023-06-27 · Speech · 1 result
  Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
§ 05 · Related models
Other Meta AI models scored on Codesota.
GENRE
1 result · 1 SOTA
SeamlessM4T v2 Large
2.3B params · 1 result · 1 SOTA
wav2vec 2.0 Large (960h)
317M params · 3 results
HuBERT Large (LS-960)
317M params · 2 results
DINOv2 (ViT-g) + Linear
Unknown params · 1 result
Fairseq S2T (MuST-C)
~150M params · 1 result
Mask2Former (Swin-L)
Unknown params · 1 result
MusicGen Large
3.3B params · 1 result
§ 06 · Sources & freshness
Where these numbers come from.
- editorial · 1 result
- arxiv · 1 result
1 of 2 rows marked verified.