Codesota · Computer Vision · Image Classification · VTAB (19 tasks)
Image Classification · benchmark dataset · EN

Visual Task Adaptation Benchmark (VTAB).

VTAB (Visual Task Adaptation Benchmark) is a benchmark suite of 19 image classification tasks designed to evaluate how well general visual representations adapt to diverse, unseen tasks with limited labeled data. All tasks are framed as classification problems to provide a consistent API, and they are drawn from multiple domains (commonly described as the Natural, Specialized, and Structured groups) to exercise different aspects of a representation.

The benchmark emphasizes low-data transfer: the commonly used VTAB-1k protocol trains on 1,000 labeled examples per task and reports mean top-1 accuracy averaged across the 19 tasks; a full-dataset evaluation scenario is also supported. VTAB places one key constraint on pre-training: the evaluation datasets must not be used during pre-training.

Public resources:
  • Project site and leaderboard: https://google-research.github.io/task_adaptation/
  • Code and data splits on GitHub: https://github.com/google-research/task_adaptation
  • OpenReview (ICLR 2020) page for “The Visual Task Adaptation Benchmark”: https://openreview.net/forum?id=BJena3VtwS
  • Related arXiv paper, “A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark” (arXiv:1910.04867): https://arxiv.org/abs/1910.04867
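The headline VTAB-1k metric described above is simply the unweighted mean of per-task top-1 accuracies. A minimal sketch, assuming per-task scores are already available (the task names and accuracy values below are illustrative placeholders, not real results):

```python
def vtab_score(per_task_accuracy: dict) -> float:
    """Unweighted mean top-1 accuracy across tasks (the VTAB headline metric)."""
    return sum(per_task_accuracy.values()) / len(per_task_accuracy)

# Illustrative subset of the 19 tasks; real runs would include all of them.
scores = {"cifar100": 0.72, "dtd": 0.68, "clevr_count": 0.55}
print(round(vtab_score(scores), 4))
```

Because every task contributes equally to the mean, a representation cannot climb the leaderboard by excelling on one group (e.g. Natural) while ignoring the others.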

§ 01 · Leaderboard

Best published scores.

No results indexed yet — be the first to submit a score.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

Submit a result · Read submission guide
What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with frozen commit + seed
  • 03 · Declared evaluation environment (Python, deps)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
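The pinned-seed and declared-environment items above can be sketched as a small script header. This is a hypothetical skeleton, not this site's required format: the commit hash is a placeholder, and `declared_environment()` is an illustrative helper, not a real API.

```python
import json
import platform
import random

SEED = 1234          # frozen seed, committed alongside the script
COMMIT = "deadbeef"  # placeholder: pin the exact commit of your repo here

def declared_environment() -> dict:
    """Record the Python version, seed, and commit so runs are reproducible."""
    return {"python": platform.python_version(), "seed": SEED, "commit": COMMIT}

random.seed(SEED)  # seed every RNG your evaluation actually uses
print(json.dumps(declared_environment()))
```

Emitting this metadata at the top of the run log makes it easy to diff two evaluations when a score discrepancy needs to be traced.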