Codesota · Computer Vision · Image Classification · VTAB (19 tasks)
Image Classification · benchmark dataset · EN

Visual Task Adaptation Benchmark (VTAB).

VTAB (Visual Task Adaptation Benchmark) is a benchmark suite of 19 image classification tasks designed to evaluate how well general visual representations adapt to diverse, unseen tasks with limited labeled data. All tasks are framed as classification problems to provide a consistent API, and they are drawn from multiple domains (commonly described as the Natural, Specialized, and Structured groups) to exercise different aspects of a representation.

The benchmark emphasizes low-data transfer: the commonly used VTAB-1k protocol trains on 1,000 labeled examples per task and reports mean top-1 accuracy averaged across the 19 tasks; a full-dataset evaluation scenario is also supported. VTAB places one key constraint on pre-training: the evaluation datasets must not be used during pre-training.

Public resources:
  • Project site and leaderboard: https://google-research.github.io/task_adaptation/
  • Code and data splits on GitHub: https://github.com/google-research/task_adaptation
  • OpenReview (ICLR 2020) page for “The Visual Task Adaptation Benchmark”: https://openreview.net/forum?id=BJena3VtwS
  • Related arXiv paper, “A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark” (arXiv:1910.04867): https://arxiv.org/abs/1910.04867
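The headline VTAB-1k metric described above is simply the unweighted mean of per-task top-1 accuracies. A minimal sketch, assuming per-task scores are already available (the task names and accuracy values below are illustrative placeholders, not real results):

```python
def vtab_score(per_task_accuracy: dict) -> float:
    """Unweighted mean top-1 accuracy across tasks (the VTAB headline metric)."""
    return sum(per_task_accuracy.values()) / len(per_task_accuracy)

# Illustrative subset of the 19 tasks; real runs would include all of them.
scores = {"cifar100": 0.72, "dtd": 0.68, "clevr_count": 0.55}
print(round(vtab_score(scores), 4))
```

Because every task contributes equally to the mean, a representation cannot climb the leaderboard by excelling on one group (e.g. Natural) while ignoring the others.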

§ 01 · Leaderboard

Best published scores.

No results indexed yet — be the first to submit a score.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

Submit a result · Read submission guide
What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with frozen commit + seed
  • 03 · Declared evaluation environment (Python, deps)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
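The pinned-seed and declared-environment items above can be sketched as a small script header. This is a hypothetical skeleton, not this site's required format: the commit hash is a placeholder, and `declared_environment()` is an illustrative helper, not a real API.

```python
import json
import platform
import random

SEED = 1234          # frozen seed, committed alongside the script
COMMIT = "deadbeef"  # placeholder: pin the exact commit of your repo here

def declared_environment() -> dict:
    """Record the Python version, seed, and commit so runs are reproducible."""
    return {"python": platform.python_version(), "seed": SEED, "commit": COMMIT}

random.seed(SEED)  # seed every RNG your evaluation actually uses
print(json.dumps(declared_environment()))
```

Emitting this metadata at the top of the run log makes it easy to diff two evaluations when a score discrepancy needs to be traced.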