Image Classification
Categorizing images into predefined classes (ImageNet, CIFAR).
Image classification is a core computer vision task: a model assigns each image one label from a fixed set of classes. The standard benchmarks used to evaluate models are listed below, each with its reported state-of-the-art result.
Benchmarks & SOTA
ImageNet-1K
ImageNet Large Scale Visual Recognition Challenge 2012
1.28M training images, 50K validation images across 1,000 object classes. The standard benchmark for image classification since 2012.
State of the Art
CoCa (finetuned)
91% top-1 accuracy
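As a reminder of what the top-1 accuracy metric measures, here is a minimal sketch in plain Python. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not part of any benchmark toolkit; real evaluations run over the full 50K validation images.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the k best.
        top_k = sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy data (made up for illustration): 4 samples, 3 classes.
scores = [
    [0.10, 0.70, 0.20],  # best class: 1
    [0.50, 0.30, 0.20],  # best class: 0
    [0.20, 0.30, 0.50],  # best class: 2, runner-up: 1
    [0.40, 0.35, 0.25],  # best class: 0, runner-up: 1
]
labels = [1, 0, 1, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.1f}%  top-2: {top2:.1f}%")
```

With k=1 only the single highest-scoring class counts as the prediction, which is why top-1 is the strictest of the top-k metrics.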
CIFAR-100
Canadian Institute for Advanced Research 100
60K 32x32 color images in 100 fine-grained classes (600 images per class) grouped into 20 superclasses. More challenging than CIFAR-10, which has the same number of images spread over only 10 classes.
State of the Art
ViT-H/14
94.55% accuracy
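Because the 100 fine classes are grouped into 20 superclasses, a prediction can be wrong at the fine level yet still land in the correct superclass. The sketch below illustrates that distinction. The helper names and the 6-class, 3-superclass mapping are hypothetical toy values, not the real CIFAR-100 mapping, which ships with the dataset's metadata.

```python
from typing import Mapping, Sequence


def fine_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Percentage of predictions that match the fine-grained label exactly."""
    return 100.0 * sum(p == y for p, y in zip(preds, labels)) / len(labels)


def coarse_accuracy(
    preds: Sequence[int], labels: Sequence[int], fine_to_coarse: Mapping[int, int]
) -> float:
    """Percentage of predictions that fall in the same superclass as the true label."""
    hits = sum(fine_to_coarse[p] == fine_to_coarse[y] for p, y in zip(preds, labels))
    return 100.0 * hits / len(labels)


# Hypothetical toy mapping: 6 fine classes, 2 per superclass.
# (CIFAR-100 itself has 100 fine classes, 5 per superclass.)
toy_fine_to_coarse = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

fine_preds = [0, 1, 3, 4, 5, 2]
fine_labels = [1, 1, 2, 4, 0, 2]

fine_acc = fine_accuracy(fine_preds, fine_labels)
coarse_acc = coarse_accuracy(fine_preds, fine_labels, toy_fine_to_coarse)
print(f"fine: {fine_acc:.1f}%  coarse: {coarse_acc:.1f}%")
```

Leaderboard accuracy for CIFAR-100 normally refers to the fine-grained 100-class labels, so the coarse figure is only a diagnostic.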
CIFAR-10
Canadian Institute for Advanced Research 10
60K 32x32 color images in 10 classes. Classic small-scale image classification benchmark with 50K training and 10K test images.
State of the Art
DeiT-B Distilled
Meta
99.1% accuracy
ImageNet-V2
ImageNet-V2 Matched Frequency
10K new test images collected by following the original ImageNet collection process. Tests how well models generalize beyond the original ImageNet validation set.
State of the Art
Swin Transformer V2 Giant (SwinV2-G)
Microsoft
84% top-1 accuracy
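A common way to use a re-collected test set like this one is to score the same model on both the original and the new images and report the drop. The following is a minimal sketch of that comparison. The predictions and labels are made-up placeholders, and `accuracy` and `generalization_gap` are illustrative names rather than functions from any evaluation library.

```python
from typing import Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Top-1 accuracy in percent."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equally long")
    return 100.0 * sum(p == y for p, y in zip(preds, labels)) / len(labels)


def generalization_gap(original_acc: float, new_acc: float) -> float:
    """Accuracy drop, in percentage points, from the original set to the new set."""
    return original_acc - new_acc


# Placeholder data: one model's predictions on two 10-image sets.
orig_preds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
orig_labels = [0, 1, 2, 3, 4, 5, 6, 7, 0, 0]  # 8 of 10 correct
new_preds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
new_labels = [0, 1, 2, 3, 4, 5, 1, 1, 1, 1]  # 6 of 10 correct

orig_acc = accuracy(orig_preds, orig_labels)
new_acc = accuracy(new_preds, new_labels)
gap = generalization_gap(orig_acc, new_acc)
print(f"original: {orig_acc:.1f}%  new: {new_acc:.1f}%  gap: {gap:.1f} points")
```

A smaller gap suggests the model's accuracy is less tied to the particular images in the original evaluation set.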
Related Tasks
General OCR Capabilities
Comprehensive benchmarks covering multiple aspects of OCR performance.
Polish OCR
OCR for Polish language including historical documents, gothic fonts, and diacritic recognition.
Object Detection
Locating and classifying objects in images (COCO, Pascal VOC).
Semantic Segmentation
Pixel-level classification of images (Cityscapes, ADE20K).