Codesota · Computer Vision · Object Detection · COCO
Object Detection · benchmark dataset · 2014 · EN

Microsoft Common Objects in Context.

Microsoft COCO is the gold-standard benchmark for large-scale object detection, segmentation, and captioning, with 330k+ images, 1.5M+ object instances, and 80 object categories. The primary metric is box mAP averaged over 10 IoU thresholds (0.50 to 0.95 in steps of 0.05).
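Averaging over 0.50:0.95 means each detection is scored against ten IoU cutoffs rather than a single one. A minimal sketch in plain Python (not the pycocotools implementation; the box format and helper names here are our own) of how one predicted box fares across the sweep:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

# The ten COCO IoU thresholds: 0.50, 0.55, ..., 0.95
THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]

pred, gt = (0.0, 0.0, 10.0, 10.0), (2.0, 0.0, 12.0, 10.0)
overlap = iou(pred, gt)  # intersection 80, union 120 -> IoU = 2/3
hits = [overlap >= t for t in THRESHOLDS]
# True positive at cutoffs 0.50 through 0.65, a miss at 0.70 and above;
# the reported box mAP averages AP computed at each of the ten cutoffs.
```

In practice this sweep is done by `pycocotools`' `COCOeval` over the full prediction set; the sketch only shows why a looser box can score at IoU 0.50 yet contribute nothing at 0.95.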

Paper · Download dataset · Submit a result
§ 01 · Leaderboard

Best published scores.

24 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.


Primary metric: box-map (higher is better). All metrics indexed: box-map, mAP.

box-map (primary) · 11 rows

| # | Model | Org | Submitted | Paper / code | box-map |
|---|-------|-----|-----------|--------------|---------|
| 01 | ScyllaNet (API) | Scylla Technologies | Sep 2025 | editorial | 66.12 |
| 02 | Thinker | UBTECH | Aug 2024 | editorial | 66 |
| 03 | CW_Detection | Independent | Jan 2025 | editorial | 66 |
| 04 | SenseTime Basemodel (API) | SenseTime | Nov 2024 | editorial | 66 |
| 05 | InternImage-H (OneFormer) (OSS) | PJLab & Tsinghua | Mar 2024 | InternImage: Exploring Large-Scale Vision Foundation Mod… | 65.50 |
| 06 | DINO-ViT-L (OSS) | IDEA-Research | Mar 2023 | DINO: DETR with Improved DeNoising Anchor Boxes for End-… | 63.30 |
| 07 | ViT-Adapter-L (OSS) | Nanjing University | Nov 2022 | Vision Transformer Adapter for Dense Predictions | 60.50 |
| 08 | Swin-L (Cascade R-CNN) (OSS) | Microsoft Research | Jul 2021 | Swin Transformer: Hierarchical Vision Transformer using … | 58.90 |
| 09 | DETR (OSS) | Meta AI / FAIR | May 2020 | editorial | 43.30 |
| 10 | Mask R-CNN (OSS) | Meta AI / FAIR | Mar 2017 | Mask R-CNN | 39.80 |
| 11 | Faster R-CNN (OSS) | Microsoft Research | Jun 2015 | editorial | 37.40 |
mAP · 13 rows

| # | Model | Org | Submitted | Paper / code | mAP |
|---|-------|-----|-----------|--------------|-----|
| 01 | Co-DETR (Swin-L) (OSS) | Research | Mar 2026 | arxiv | 66 |
| 02 | Co-DETR (Swin-L) (OSS) | Research | Dec 2025 | arxiv-paper | 66 |
| 03 | InternImage-H (OSS) | Shanghai AI Lab | Dec 2025 | arxiv-paper | 65.40 |
| 04 | InternImage-H (OSS) | Shanghai AI Lab | Mar 2026 | arxiv | 65.40 |
| 05 | DINO (Swin-L) (OSS) | Research | Dec 2025 | arxiv-paper | 63.30 |
| 06 | DINO (Swin-L) (OSS) | IDEA Research | Mar 2026 | arxiv | 63.30 |
| 07 | Grounding DINO (OSS) | IDEA Research | Mar 2026 | arxiv | 63 |
| 08 | EVA-02-L (OSS) | BAAI | Mar 2026 | arxiv | 62.30 |
| 09 | YOLOv10-X (OSS) | Tsinghua University | Dec 2025 | github-readme | 57.40 |
| 10 | EfficientDet-D7x (OSS) | Google | Dec 2025 | google-research | 55.10 |
| 11 | YOLO11x (OSS) | Ultralytics | Mar 2026 | official | 54.70 |
| 12 | YOLOv10-X (OSS) | Tsinghua University | Mar 2026 | arxiv | 54.40 |
| 13 | RT-DETRv2-X (OSS) | Baidu | Mar 2026 | arxiv | 54.30 |
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
§ 03 · Progress

9 steps of state of the art.

Each row below marks a model that broke the previous record on box-map. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores win. Each subsequent entry improved upon the previous best.

SOTA line · box-map
  1. Jun 4, 2015 · Faster R-CNN · Microsoft Research · 37.40
  2. Mar 20, 2017 · Mask R-CNN · Meta AI / FAIR · 39.80
  3. May 26, 2020 · DETR · Meta AI / FAIR · 43.30
  4. Jul 1, 2021 · Swin-L (Cascade R-CNN) · Microsoft Research · 58.90
  5. Nov 1, 2022 · ViT-Adapter-L · Nanjing University · 60.50
  6. Mar 1, 2023 · DINO-ViT-L · IDEA-Research · 63.30
  7. Mar 1, 2024 · InternImage-H (OneFormer) · PJLab & Tsinghua · 65.50
  8. Aug 1, 2024 · Thinker · UBTECH · 66
  9. Sep 1, 2025 · ScyllaNet · Scylla Technologies · 66.12
Fig 3 · SOTA-setting models only. 9 entries span Jun 2015 to Sep 2025.
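Reducing the full leaderboard to the SOTA line is a running-max pass over date-ordered entries: keep a row only if it strictly beats every score before it, so ties (like the Jan 2025 and Nov 2024 scores of 66) never register as new steps. A sketch, with an illustrative subset of the entries from the tables above:

```python
def sota_steps(entries):
    """Keep only entries that strictly beat the best score seen so far.

    `entries` must be (date, model, score) tuples sorted by date ascending.
    """
    best = float("-inf")
    steps = []
    for date, model, score in entries:
        if score > best:
            best = score
            steps.append((date, model, score))
    return steps

history = [
    ("2015-06", "Faster R-CNN", 37.40),
    ("2017-03", "Mask R-CNN", 39.80),
    ("2020-05", "DETR", 43.30),
    ("2021-07", "Swin-L (Cascade R-CNN)", 58.90),
    ("2022-11", "ViT-Adapter-L", 60.50),
    ("2023-03", "DINO-ViT-L", 63.30),
    ("2024-03", "InternImage-H (OneFormer)", 65.50),
    ("2024-08", "Thinker", 66.00),
    ("2025-01", "CW_Detection", 66.00),  # ties the record -> not a step
    ("2025-09", "ScyllaNet", 66.12),
]
# sota_steps(history) keeps 9 record-setting entries; the tie is dropped.
```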
§ 04 · Literature

5 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and — if it takes the top — annotate the step on the progress chart with your name.

Submit a result · Read submission guide
What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with a frozen commit and seed
  • 03 · A declared evaluation environment (Python version, dependencies)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
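In code form the checklist amounts to a handful of required fields. A hypothetical manifest and validator (the field names and example values are our own illustration, not a Codesota schema):

```python
REQUIRED_FIELDS = {
    "checkpoint",   # public checkpoint URL or API endpoint
    "script",       # reproduction script
    "commit",       # frozen commit hash
    "seed",         # fixed random seed
    "environment",  # Python version + dependency pins
    "results",      # one entry per metric declared by the dataset
    "contact",      # address for follow-up on discrepancies
}

def missing_fields(manifest):
    """Return the required fields a submission manifest leaves out."""
    return sorted(REQUIRED_FIELDS - manifest.keys())

# Illustrative submission; all names and values are hypothetical.
submission = {
    "checkpoint": "https://example.org/model.pt",
    "script": "eval_coco.py",
    "commit": "abc1234",
    "seed": 42,
    "environment": {"python": "3.11", "deps": "requirements.txt"},
    "results": {"box-map": 66.12},
    "contact": "author@example.org",
}
# missing_fields(submission) -> [] when every required field is present
```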