
Object Detection

Detecting and localizing objects in images with bounding boxes and class labels.

Datasets: 3
Results: 28
Canonical metric: box-map
Canonical Benchmark

COCO

Microsoft COCO is the gold standard for large-scale object detection, segmentation, and captioning, with 330k+ images, 1.5M+ object instances, and 80 categories. The primary metric is box mAP, averaged over 10 IoU thresholds (0.50 to 0.95 in steps of 0.05).
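To make the thresholded metric concrete, here is a minimal Python sketch of box IoU and the ten IoU thresholds. It is an illustration only, not the official COCO evaluation code: the function names (`box_iou`, `matched_thresholds`) and the corner-style box format (x_min, y_min, x_max, y_max) are assumptions chosen for this example, and the full metric additionally builds per-class precision-recall curves before averaging.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max).

    The corner format is an assumption for this sketch.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Return the thresholds at which `pred` would count as a match for `gt`."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]
```

A prediction that overlaps its ground-truth box with IoU 0.72 counts as a match at the five thresholds from 0.50 to 0.70 and as a miss at the stricter five, which is why averaging over all ten rewards tight localization.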

Primary metric: box-map

Top 10

Leading models on COCO.

Rank  Model                      box-map  Year  Source
1     ScyllaNet                  66.1     2026  paper
2     co-detr-swin-l             66.0     2025  paper
3     CW_Detection               66.0     2026  paper
4     Thinker                    66.0     2026  paper
5     SenseTime Basemodel        66.0     2026  paper
6     InternImage-H (OneFormer)  65.5     2026  paper
7     internimage-h              65.4     2025  paper
8     Focal-Stable-DINO          64.6     2023  paper
9     dino-swin-l                63.3     2025  paper
10    DINO-ViT-L                 63.3     2026  paper


All datasets

3 datasets tracked for this task.

