Object Detection

Detecting and localizing objects in images with bounding boxes and class labels.

Canonical metric: box-map

Canonical Benchmark

COCO

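Microsoft COCO is a large-scale dataset for object detection, segmentation, and captioning, and the de facto standard benchmark for detection, with 330k+ images, 1.5M+ object instances, and 80 object categories. The primary metric is box mAP: average precision averaged over 10 IoU thresholds from 0.50 to 0.95 in steps of 0.05 (written 0.5:0.95).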
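To make the IoU threshold range concrete, here is a minimal Python sketch. It is not the official COCO evaluation code. It assumes boxes in (x1, y1, x2, y2) corner format, and the names box_iou, IOU_THRESHOLDS, thresholds_passed, and mean_over_thresholds are hypothetical helpers introduced only for illustration. A full evaluation also ranks detections by confidence and builds per-category precision-recall curves, which this sketch leaves out.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The 10 IoU thresholds 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Thresholds at which the prediction overlaps the ground truth enough to match."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if iou >= t]


def mean_over_thresholds(ap_per_threshold):
    """Average one AP value per threshold into a single score (inputs are assumed)."""
    if len(ap_per_threshold) != len(IOU_THRESHOLDS):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(ap_per_threshold)


if __name__ == "__main__":
    pred = (10, 10, 110, 110)
    gt = (20, 20, 120, 120)
    print(round(box_iou(pred, gt), 4))   # IoU of the two example boxes
    print(thresholds_passed(pred, gt))   # thresholds this prediction satisfies
```

In this example the prediction overlaps the ground truth with an IoU of about 0.68, so it counts as a match at thresholds 0.50 through 0.65 and as a miss from 0.70 upward. Averaging over the full threshold range therefore rewards tighter localization, not just rough overlap.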

Primary metric: box-map


Top 10

Leading models on COCO.

Rank	Model	box-map	Year	Source
1	ScyllaNet	66.1	2026	paper
2	co-detr-swin-l	66.0	2025	paper
3	CW_Detection	66.0	2026	paper
4	Thinker	66.0	2026	paper
5	SenseTime Basemodel	66.0	2026	paper
6	InternImage-H (OneFormer)	65.5	2026	paper
7	internimage-h	65.4	2025	paper
8	Focal-Stable-DINO	64.6	2023	paper
9	dino-swin-l	63.3	2025	paper
10	DINO-ViT-L	63.3	2026	paper


All datasets

3 datasets tracked for this task.

Related tasks

Other tasks in Computer Vision.

Depth Estimation, Document Image Classification, Document Layout Analysis, Document Parsing, Document Understanding, General OCR Capabilities, Handwriting Recognition, Image Classification



All datasets

COCO

LVIS v1.0

Pascal VOC 2012
