Model card

DINO-X.

IDEA Research · open-source · Unknown params · Unified vision model with DINO-based detection head + large language model

Unified vision model for open-world object detection. Achieves 67.0 mask AP on LVIS v1.0 minival — SOTA at time of release (Nov 2024). Supports open-vocabulary and grounded detection. arXiv 2411.14347.

§ 01 · Benchmarks

Every benchmark DINO-X has a recorded score for.

#  | Benchmark | Area · Task                        | Metric  | Value | Rank | Date       | Source
01 | LVIS v1.0 | Computer Vision · Object Detection | box-ap  | 71.4% | #1/4 | 2024-11-21 | source
02 | LVIS v1.0 | Computer Vision · Object Detection | mask-ap | 67.0% | #1/9 | 2024-11-21 | source

The Rank column shows this model's position among all models scored on the same benchmark and metric; the number after the slash counts the competitors. A #1 rank marks the current SOTA. Rows are sorted by rank, then by newest result.
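
The rank notation and sort order described above can be sketched in a few lines of Python. This is only an illustration: the row dictionaries and the helper names (`parse_rank`, `sort_rows`, `is_sota`) are assumptions made for the example, not Codesota's actual schema or code. The values are copied from the table.

```python
from datetime import date

# Rows copied from the table above. The field names are illustrative
# assumptions, not Codesota's real data model.
rows = [
    {"benchmark": "LVIS v1.0", "metric": "box-ap", "value": 71.4,
     "rank": "#1/4", "date": date(2024, 11, 21)},
    {"benchmark": "LVIS v1.0", "metric": "mask-ap", "value": 67.0,
     "rank": "#1/9", "date": date(2024, 11, 21)},
]


def parse_rank(rank: str) -> tuple[int, int]:
    """Split a string like '#1/9' into (position, competitors)."""
    position, competitors = rank.lstrip("#").split("/")
    return int(position), int(competitors)


def sort_rows(rows):
    """Sort by rank position (ascending), then by date (newest first)."""
    return sorted(
        rows,
        key=lambda r: (parse_rank(r["rank"])[0], -r["date"].toordinal()),
    )


def is_sota(row) -> bool:
    """A row counts as current SOTA when its rank position is 1."""
    return parse_rank(row["rank"])[0] == 1


for row in sort_rows(rows):
    label = "SOTA" if is_sota(row) else ""
    print(row["benchmark"], row["metric"], row["rank"], label)
```

Because Python's `sorted` is stable, rows that tie on both rank and date keep their original relative order.
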
§ 02 · Strengths by area

Where DINO-X actually performs.

Computer Vision · 1 benchmark · avg rank #1.0

§ 03 · Papers

1 paper with results for DINO-X.

  1. 2024-11-21 · Computer Vision · 2 results

    DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding

§ 04 · Related models

Other IDEA Research models scored on Codesota.

DINO (Swin-L)
Unknown params · 1 result
Grounding DINO
Unknown params · 1 result
§ 05 · Sources & freshness

Where these numbers come from.

arXiv · 2 results
2 of 2 rows marked verified.