VCTK

Speech data from 110 English speakers with various accents. Used for multi-speaker TTS.

Benchmark Stats

Models: 9
Papers: 14
Metrics: 2

SOTA History

mos

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | NaturalSpeech 3 | Community | 4.36 | 2026 | MOS (1–5). Zero-shot VCTK evaluation. Source: Table 3, arXiv:2403.03100 (2024) |
| 2 | Ground Truth (VCTK) | Community | 4.26 | 2022 | Human recordings from VCTK test set. Reported in YourTTS (Casanova et al., ICML 2022), Table 1. |
| 3 | VITS | Community | 4.21 | 2026 | MOS (1–5). VITS multispeaker on VCTK. Source: Table 2, arXiv:2106.06103 (ICML 2021) |
| 4 | StyleTTS2 | Community | 4.19 | 2023 | MOS (1–5). StyleTTS 2 multispeaker on VCTK. Source: Table 3, arXiv:2306.07279 (NeurIPS 2023) |
| 5 | VALL-E 2 | Community | 4.18 | 2026 | MOS (1–5). Zero-shot multi-speaker on VCTK. Source: Table 1, arXiv:2406.05370 (Jun 2024) |
| 6 | XTTS v2 | Community | 4.14 | 2026 | MOS (1–5). XTTS v2 zero-shot on VCTK speakers. Source: arXiv:2304.01196 |
| 7 | YourTTS | Community | 4.07 | 2022 | MOS (1–5). YourTTS zero-shot on VCTK. Source: Table 2, arXiv:2202.04053 (ICML 2022) |
| 8 | SC-GlowTTS | Community | 3.78 | 2022 | Multi-speaker GlowTTS baseline. Reported in YourTTS (Casanova et al., ICML 2022), Table 1. |

sim-score

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | Ground Truth (VCTK) | Community | 4.19 | 2022 | Sim-MOS for human recordings, VCTK test set. Reported in YourTTS (Casanova et al., ICML 2022), Table 1. |
| 2 | YourTTS | Community | 4.16 | 2022 | Sim-MOS on VCTK test set (Exp 1 monolingual) ±0.05. Casanova et al., ICML 2022. |
| 3 | VITS2 | Community | 3.99 | 2023 | Speaker similarity MOS on VCTK multi-speaker test set ±0.08. Kong et al., Interspeech 2023. |
| 4 | SC-GlowTTS | Community | 3.99 | 2022 | Sim-MOS on VCTK test set. Reported in YourTTS (Casanova et al., ICML 2022), Table 1. |
| 5 | VITS | Community | 3.79 | 2023 | Speaker similarity MOS on VCTK multi-speaker test set ±0.09. Kong et al., Interspeech 2023 (VITS2 paper, Table 2b). |
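
Several sim-score entries carry a ± value (for example ±0.05). The listing does not say how those intervals were computed; a common convention in MOS reporting is a 95% confidence interval on the mean rating. The sketch below illustrates that convention only, using a normal approximation. The function name `mos_with_ci` and the ratings are hypothetical and are not taken from any of the cited papers.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean, half_width) for a list of 1-5 listener ratings.

    Assumption: the interval is a normal-approximation 95% confidence
    interval (z = 1.96). Individual papers may use a different method.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in the range 1..5")
    mean = statistics.fmean(ratings)
    # Sample standard deviation (n - 1 in the denominator).
    stdev = statistics.stdev(ratings)
    half_width = z * stdev / math.sqrt(len(ratings))
    return mean, half_width


# Hypothetical ratings for one system, for illustration only.
ratings = [4, 5, 4, 3, 5, 4, 4, 5]
mean, half_width = mos_with_ci(ratings)
print(f"MOS = {mean:.2f} ± {half_width:.2f}")  # MOS = 4.25 ± 0.49
```

With intervals of this size, scores that differ by a few hundredths (such as the two 3.99 entries, or 4.19 versus 4.16) may not be distinguishable, so ranks between close entries are best read with caution.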