Codesota · OCR · Benchmark · CIFAR-100 · 15 scored runs · 15 distinct models · Updated 2026-05-21

§ 00 · Opening

100 classes, 32 pixels, forever contested.

CIFAR-100 is the long-standing small-image classification benchmark: 60,000 thumbnails across 100 fine-grained categories. It is the test that every new vision backbone has to clear before anyone takes it seriously.

§ 01 · Leaderboard · Top-1 accuracy

Top-1 accuracy, ranked.

Share of CIFAR-100 test images whose top prediction matches the gold label. (higher is better)

#	Model	Top-1 accuracy (%)	Verified	Source
01	EVA-02-L	97.15	—	codesota-api
02	CoAtNet-7	96.38	—	codesota-api
03	ConvNeXt V2-H	96.17	—	codesota-api
04	MAE ViT-H/14	96.08	—	codesota-api
05	SwinV2-G	96.01	—	codesota-api
06	DeiT III-H/14	95.94	—	codesota-api
07	InternImage-XL	95.77	—	codesota-api
08	FasterViT-6	95.72	—	codesota-api
09	vit-h-14	94.55	—	codesota-api
10	AIMv2-3B	94.50	yes	codesota-api
11	AIMv2-1B	94.10	yes	codesota-api
12	ViT-L/16 (IN-21K)	93.25	yes	codesota-api
13	efficientnet-b7	91.70	—	codesota-api
14	vit-b-16	91.48	—	codesota-api
15	resnet-50	78.04	—	codesota-api

All rows fetched from CodeSOTA API on 2026-04-20.

Fig · 15 results on Top-1 accuracy. Rows sourced from benchmarks.json; row 01 (EVA-02-L) is the current SOTA.

§ What it measures

Top-1 accuracy, single crop.

The CIFAR-100 headline metric is Top-1 accuracy: the share of test images whose highest-probability prediction matches the single gold label. Scores are single-crop, with no test-time augmentation unless the authors explicitly state otherwise.
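As a minimal sketch of the metric described above, the function below computes Top-1 accuracy from per-class scores in plain Python. The function name `top1_accuracy` and the toy data are illustrative assumptions, not part of the Codesota harness.

```python
def top1_accuracy(scores, labels):
    """Return the percentage of examples whose highest-scoring class
    index equals the gold label.

    scores: list of per-example score lists (one score per class)
    labels: list of gold class indices, same length as scores
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    correct = 0
    for row, gold in zip(scores, labels):
        # Index of the highest score is the model's top prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == gold:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data (hypothetical): 4 images, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],  # top prediction: class 1
    [0.8, 0.1, 0.1],  # top prediction: class 0
    [0.3, 0.3, 0.4],  # top prediction: class 2
    [0.2, 0.5, 0.3],  # top prediction: class 1
]
toy_labels = [1, 0, 1, 1]

print(top1_accuracy(toy_scores, toy_labels))  # 75.0
```

Three of the four toy predictions match their labels, so the result is 75.0. On the real benchmark the same calculation would run over the 10,000 test images with 100 scores per image.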

Because the 100 classes are deliberately fine-grained (for example, the "small mammals" superclass is split into hamster, mouse, rabbit, shrew, and squirrel), Top-1 is a harsher metric here than on CIFAR-10. Models above 95% are considered strong.

§ Dataset details

60,000 thumbnails, 100 hand-picked classes.

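CIFAR-100, named for the Canadian Institute for Advanced Research and collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton, contains 60,000 colour images of 32×32 pixels across 100 classes grouped into 20 superclasses of five classes each. Each class has 500 training images and 100 test images, giving 50,000 training and 10,000 test images in total.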
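The figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative, and the chance-level and score-resolution values are derived on the assumption that the test split is balanced, as the per-class counts imply.

```python
NUM_CLASSES = 100
NUM_SUPERCLASSES = 20
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 100

train_images = NUM_CLASSES * TRAIN_PER_CLASS   # 50,000
test_images = NUM_CLASSES * TEST_PER_CLASS     # 10,000
total_images = train_images + test_images      # 60,000
classes_per_superclass = NUM_CLASSES // NUM_SUPERCLASSES  # 5

# With balanced classes, uniform random guessing gets 1 in 100 right.
chance_top1_percent = 100.0 / NUM_CLASSES      # 1.0

# One test image moves Top-1 by this many percentage points.
accuracy_step_percent = 100.0 / test_images    # 0.01

print(total_images, classes_per_superclass)
print(chance_top1_percent, accuracy_step_percent)
```

The 0.01-point step is why leaderboard scores are naturally reported to two decimal places: each test image is worth exactly one hundredth of a percentage point.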

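The dataset has been in continuous use since 2009. That means two things: results are comparable across more than a decade of papers, and saturation is a real concern. In the leaderboard above, the top eight entries sit within 1.5 percentage points of each other (97.15 to 95.72) and the top ten within three (97.15 to 94.50). The full first-to-fifteenth gap is 19.11 points, most of it from the last-place resnet-50 at 78.04.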
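The spread between leaderboard positions can be checked directly from the scores in § 01. The sketch below copies those 15 values and computes the gap between any two ranks; the helper name `gap` is an illustrative assumption.

```python
# Top-1 accuracy (%) by rank, copied from the § 01 leaderboard.
leaderboard = [
    ("EVA-02-L", 97.15),
    ("CoAtNet-7", 96.38),
    ("ConvNeXt V2-H", 96.17),
    ("MAE ViT-H/14", 96.08),
    ("SwinV2-G", 96.01),
    ("DeiT III-H/14", 95.94),
    ("InternImage-XL", 95.77),
    ("FasterViT-6", 95.72),
    ("vit-h-14", 94.55),
    ("AIMv2-3B", 94.50),
    ("AIMv2-1B", 94.10),
    ("ViT-L/16 (IN-21K)", 93.25),
    ("efficientnet-b7", 91.70),
    ("vit-b-16", 91.48),
    ("resnet-50", 78.04),
]


def gap(rank_a, rank_b):
    """Percentage-point gap between two 1-indexed ranks, rounded to 2 dp."""
    score_a = leaderboard[rank_a - 1][1]
    score_b = leaderboard[rank_b - 1][1]
    return round(score_a - score_b, 2)


print(gap(1, 8))   # 1.43
print(gap(1, 10))  # 2.65
print(gap(1, 14))  # 5.67
print(gap(1, 15))  # 19.11
```

The top of the table is tightly packed, while the last step from rank 14 to rank 15 alone accounts for more than 13 points.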

§ How scores are verified

Reported, then reproduced.

Every row is imported from benchmarks.json. The reported number is the authors’ single-best Top-1 on the canonical test split. Rows marked verified have been re-run by the Codesota harness or matched against an independent reproduction.

Full policy: /methodology.

§ Final · Related OCR benchmarks