ImageNet-1k, still the deciding test.
ImageNet-1k remains the reference for large-scale image classification. It ranks vision backbones on Top-1 accuracy across 1,000 categories — the same score every new architecture has to beat before it can earn its place on a datasheet.
Top-1 accuracy, ranked.
The share of ImageNet-1k validation images whose top prediction matches the gold label; higher is better.
All rows were fetched from the CodeSOTA API on 2026-04-20.

| # | Model | Top-1 accuracy | Verified | Source |
|---|---|---|---|---|
| 01 | coca-finetuned | 91.00 | — | codesota-api |
| 02 | vit-g-14 | 90.45 | — | codesota-api |
| 03 | EVA-02-L | 90.06 | yes | codesota-api |
| 04 | EVA-Giant | 89.79 | yes | codesota-api |
| 05 | InternImage-H | 89.60 | yes | codesota-api |
| 06 | SigLIP-SO400M | 89.41 | yes | codesota-api |
| 07 | convnext-v2-huge | 88.90 | — | codesota-api |
| 08 | ViT-H/14 CLIP (LAION-2B) | 88.63 | yes | codesota-api |
| 09 | ConvNeXt-XXLarge (CLIP LAION) | 88.62 | yes | codesota-api |
| 10 | vit-h-14 | 88.55 | — | codesota-api |
| 11 | swin-large | 87.30 | — | codesota-api |
| 12 | efficientnet-v2-l | 85.70 | — | codesota-api |
| 13 | deit-b-distilled | 85.20 | — | codesota-api |
| 14 | efficientnet-b7 | 84.40 | — | codesota-api |
| 15 | deit-b | 83.10 | — | codesota-api |
| 16 | convnext-v2-tiny | 83.00 | — | codesota-api |
| 17 | vit-l-16 | 82.70 | — | codesota-api |
| 18 | vit-b-16 | 81.20 | — | codesota-api |
| 19 | resnet-50-a3 | 80.40 | — | codesota-api |
| 20 | resnet-152 | 78.60 | — | codesota-api |
| 21 | efficientnet-b0 | 77.10 | — | codesota-api |
| 22 | resnet-50 | 76.15 | — | codesota-api |
Top-1 accuracy, 1,000-way.
Top-1 accuracy is the share of test images whose argmax prediction matches the single gold label, out of 1,000 possible classes. It is a strict metric — Top-5 (any of the top five predictions being correct) has been essentially solved since 2018; Top-1 is where the last few points of progress still live.
A meaningful share of the ImageNet error rate at this point comes from label noise — images where the “correct” label is genuinely ambiguous. That is why the gap between 88% and 91% is a harder climb than it looks.
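As a concrete illustration (not tied to any particular framework or leaderboard entry), Top-1 and Top-k accuracy can be computed from a matrix of per-class scores like this:

```python
import numpy as np

def topk_accuracy(scores, labels, k=1):
    """Fraction of rows whose gold label is among the k highest scores."""
    topk = np.argsort(scores, axis=1)[:, -k:]      # indices of the k largest scores per row
    hits = (topk == labels[:, None]).any(axis=1)   # is the gold label among them?
    return hits.mean()

# Toy example: 3 images, 4 classes.
scores = np.array([[0.1, 0.9, 0.0,  0.0],
                   [0.8, 0.1, 0.05, 0.05],
                   [0.2, 0.3, 0.1,  0.4]])
labels = np.array([1, 0, 1])
print(topk_accuracy(scores, labels, k=1))  # 2/3: the third image's argmax is class 3
print(topk_accuracy(scores, labels, k=2))  # 1.0: class 1 is the third image's runner-up
```

The same function with `k=5` and a 1,000-column score matrix gives the Top-5 number that has been near-saturated since 2018.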
1.28M training images, 50K validation.
ImageNet-1k is the ILSVRC 2012 classification subset: 1.28 million training images, 50,000 validation images and 100,000 test images spread across 1,000 categories. Since the test labels are held private by the organisers, the community has standardised on validation Top-1 as the public leaderboard number.
Most frontier models score on validation, not test. When an entry reports a number computed against an external dataset instead, that is flagged in the table.
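The validation protocol above amounts to a single labelled pass over the 50,000 images. A minimal sketch of that loop, with the model and dataloader as placeholders (in practice you would pair a pretrained backbone with the ImageNet validation split, which must be obtained separately):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

@torch.no_grad()
def validate_top1(model, loader):
    """One pass over a labelled loader; returns Top-1 accuracy as a float."""
    model.eval()
    correct, total = 0, 0
    for inputs, labels in loader:
        preds = model(inputs).argmax(dim=1)        # class with the highest logit
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

# Toy check with Identity as the "model": the inputs are already logits.
logits = torch.tensor([[2.0, 1.0], [0.0, 3.0], [5.0, 4.0]])
labels = torch.tensor([0, 1, 1])                   # the last label is wrong on purpose
loader = DataLoader(TensorDataset(logits, labels), batch_size=2)
print(validate_top1(torch.nn.Identity(), loader))  # 2 of 3 correct
```

The headline numbers in the table come from each paper's own evaluation pipeline, not from this loop; it is shown only to make the metric's mechanics concrete.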
Reported, then reproduced.
Each row above is imported verbatim from benchmarks.json. Where the reporting paper gives multiple settings (different crop sizes, different pretraining regimens), the row reflects the single headline number the authors highlight in their abstract or table.
Rows marked yes in the Verified column have been matched against an independent reproduction. See the Codesota methodology for the policy.
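The import step can be sketched as follows. The field names (`model`, `top1`, `verified`) are assumed for illustration and are not taken from the real benchmarks.json schema:

```python
import json

# Hypothetical schema: field names are assumptions, not the real benchmarks.json format.
SAMPLE = """
[
  {"model": "resnet-50", "top1": 76.15, "verified": false},
  {"model": "EVA-02-L",  "top1": 90.06, "verified": true}
]
"""

def to_rows(raw):
    """Sort entries by Top-1 descending and render markdown table rows."""
    entries = sorted(json.loads(raw), key=lambda e: e["top1"], reverse=True)
    return [
        f"| {i:02d} | {e['model']} | {e['top1']:.2f} | {'yes' if e['verified'] else '—'} |"
        for i, e in enumerate(entries, start=1)
    ]

for row in to_rows(SAMPLE):
    print(row)
```

Sorting happens at render time, so a corrected score in the source file reorders the table automatically on the next build.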