Voicebox.

Meta AI · proprietary · 330M params · Flow matching (non-autoregressive)

Le et al., arXiv:2306.15687.

§ 02 · Benchmarks

Every benchmark Voicebox has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	LibriTTS test-clean (Zero-Shot TTS)	Audio · Voice cloning	wer	1.9%	#2/3	—	source ↗
02	LJ Speech	Audio · Text-to-speech	mos	4.3	#4/5	2023-06-27	source ↗

Rank column shows this model’s position vs all other models scored on the same benchmark + metric (competitors after the slash). #1 in red means current SOTA. Sorted by rank, then newest result.

§ 03 · Strengths by area

Where Voicebox actually performs.

Audio · 2 benchmarks · avg rank #3.0
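
The average rank above is consistent with a plain arithmetic mean of the two rank positions in the benchmark table (#2 and #4). A minimal sketch of that calculation, assuming the site averages positions this way (the page does not state its formula, and `parse_rank` is a hypothetical helper introduced only for illustration):

```python
# Rank strings copied from the Rank column of the benchmark table.
ranks = ["#2/3", "#4/5"]


def parse_rank(rank: str) -> int:
    """Return the position part of a rank string such as '#2/3'."""
    position, _, _ = rank.lstrip("#").partition("/")
    return int(position)


positions = [parse_rank(r) for r in ranks]

# Assumption: the area average is the unweighted mean of the positions.
avg_rank = sum(positions) / len(positions)
print(f"avg rank #{avg_rank:.1f}")  # prints "avg rank #3.0"
```

The number after the slash is ignored here, so a #2 on a three-model benchmark counts the same as a #2 on a larger one.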

§ 04 · Papers

1 paper with results for Voicebox.

2023-06-27 · Speech · 1 result
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

§ 05 · Related models

Other Meta AI models scored on Codesota.

GENRE

1 result · 1 SOTA

SeamlessM4T v2 Large

2.3B params · 1 result · 1 SOTA

wav2vec 2.0 Large (960h)

317M params · 3 results

HuBERT Large (LS-960)

317M params · 2 results

DINOv2 (ViT-g) + Linear

Unknown params · 1 result

Fairseq S2T (MuST-C)

~150M params · 1 result

Mask2Former (Swin-L)

Unknown params · 1 result

MusicGen Large

3.3B params · 1 result

§ 06 · Sources & freshness

Where these numbers come from.

editorial

result

arxiv