Computer Vision
Research focused on enabling computers to interpret and understand visual information from images and videos, including tasks such as image classification, object detection, segmentation, and visual recognition.
15 tasks, 202 datasets, 94 results
Tasks & Benchmarks
Object Detection
11 benchmarks, 46 results
Image Classification
28 benchmarks, 44 results
Image segmentation
9 benchmarks, 3 results
OCR
5 benchmarks, 1 result
Image editing
5 benchmarks, 0 results
Image generation
11 benchmarks, 0 results
Object counting
1 benchmark, 0 results
Open-Vocabulary Object Detection
2 benchmarks, 0 results
Video classification
9 benchmarks, 0 results
Video generation
0 benchmarks, 0 results
3D generation
0 benchmarks, 0 results
Video segmentation
3 benchmarks, 0 results
3D Understanding
4 benchmarks, 0 results
Depth estimation
17 benchmarks, 0 results
Few-Shot Image Classification
97 benchmarks, 0 results
Object Detection
COCO (2014)
66.12 (box-map) ScyllaNet
COCO 2014 val (2014)
COCO test-dev (2014)
COCO val2017 (2014)
LVIS v1.0 (2019)
71.4 (box-ap) DINO-X
Pascal VOC 2012 (2012)
80 (mAP-coco-pretrain) SSD512 (VGG-16)
Image Classification
CIFAR-10 (2009)
99.1 (accuracy) DeiT-B Distilled
CIFAR-100 (2009)
94.55 (accuracy) ViT-H/14
ImageNet (2009)
97.75 (top-5-accuracy) SENet
ImageNet-1K (2012)
91 (top-1-accuracy) CoCa (finetuned)
ImageNet-V2 (2019)
84 (top-1-accuracy) Swin Transformer V2 Large
Image segmentation
0.77 (ODS) Segment Anything Model (SAM)
46.5 (mAP) Segment Anything Model (SAM)
44.7 (mAP) Segment Anything Model (SAM)
OCR
860 (Score) HunyuanOCR (1B)
olmOCR-Bench (2025)
Image generation
Object counting
Open-Vocabulary Object Detection
Video classification
Video generation
No datasets indexed yet.
3D generation
No datasets indexed yet.
Video segmentation
3D Understanding
Depth estimation
Few-Shot Image Classification