
Object Detection.

Object detection is a computer vision task that involves identifying and localizing objects within an image. The goal is to detect instances of objects of a given class (such as humans, buildings, or cars) in digital images and videos. Object detection models typically output a set of bounding boxes, each with a predicted class label.
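As a rough illustration of that output format, the sketch below models one prediction as a box plus a label. The `Detection` class, its field names, the corner-coordinate box convention, and the `score` field are assumptions made for this example; real models and datasets use a variety of layouts (COCO annotations, for instance, store boxes as x, y, width, height).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One predicted object (hypothetical layout, for illustration only)."""
    x_min: float  # left edge of the box, in pixels
    y_min: float  # top edge of the box, in pixels
    x_max: float  # right edge of the box, in pixels
    y_max: float  # bottom edge of the box, in pixels
    label: str    # predicted class name, e.g. "car"
    score: float  # confidence in [0, 1]; assumed here, not stated in the text


def keep_confident(detections, min_score=0.5):
    """Return only the detections whose score is at least min_score."""
    return [d for d in detections if d.score >= min_score]


# A detector's output for one image is then simply a list of detections.
detections = [
    Detection(10, 20, 110, 220, "person", 0.92),
    Detection(150, 40, 300, 180, "car", 0.35),
]
confident = keep_confident(detections)
```

Filtering by a score threshold is a common post-processing step, but the 0.5 default here is arbitrary and would normally be tuned per application.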

Datasets

104 results

Canonical metric: box-map

§ 02 · Canonical benchmark

The reference dataset.

COCO

Microsoft COCO is the gold standard for large-scale object detection, segmentation, and captioning, with 330k+ images, 1.5M+ object instances, and 80 categories. The primary metric is box mAP, averaged over 10 IoU thresholds (0.50 to 0.95 in steps of 0.05).

Primary metric: box-map
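To make the IoU thresholds concrete, here is a minimal sketch of intersection over union (IoU) for two boxes, together with the ten thresholds the metric averages over. The function names and the corner-coordinate box format are assumptions for this example. This is not the full COCO evaluation, which also ranks predictions by confidence and computes precision-recall curves per category; the official `pycocotools` package implements that procedure.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap region, clamped to zero if disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(pred_box, gt_box):
    """Thresholds at which pred_box would count as a match for gt_box."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# A prediction shifted 10 px to the right of a 100x100 ground-truth box.
gt_box = (0, 0, 100, 100)
pred_box = (10, 0, 110, 100)
overlap = iou(pred_box, gt_box)                 # 9000 / 11000, about 0.818
matched = thresholds_matched(pred_box, gt_box)  # 0.50 through 0.80
```

Averaging over the thresholds rewards tight localization: the shifted box above counts as correct at the seven loosest thresholds but fails at 0.85 and higher.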


§ 03 · Top 10

Leading models.

Leading models on COCO.

#	Model	box-map	Year	Source
★	ScyllaNet	66.1	2026	paper ↗
2	DINOv3 + Plain-DETR + TTA	66.1	2025	paper ↗
3	Co-DETR (Swin-L)	66.0	2022	paper ↗
4	Co-DETR (Swin-L)	66.0	2026	paper ↗
5	SenseTime Basemodel	66.0	2026	paper ↗
6	CW_Detection	66.0	2026	paper ↗
7	Co-DETR (Swin-L)	66.0	2025	paper ↗
8	Thinker	66.0	2026	paper ↗
9	DINOv3 + Plain-DETR	65.6	2025	paper ↗
10	InternImage-H (OneFormer)	65.5	2026	paper ↗