Semantic Segmentation · benchmark dataset · 2016 · EN

ADE20K Scene Parsing Benchmark.

20K training and 2K validation images annotated with 150 object categories. A complex scene parsing benchmark.

§ 01 · Leaderboard

Best published scores.

21 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.
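Both metrics indexed here are mean intersection-over-union. As a minimal sketch of how that quantity is commonly computed over flattened label maps: per-class IoU is intersection divided by union, averaged over the classes that occur. The function name `mean_iou`, the `ignore_index` value of 255, and the choice to skip classes absent from both prediction and label are assumptions for illustration, not Codesota's evaluation code. Published evaluations typically accumulate counts over the whole validation set rather than averaging per image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean IoU over classes that appear in the prediction or the label.

    Assumes every predicted id lies in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must cover the same pixels")
    intersection = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from every count
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[t] += 1
    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes present: IoU is 1/2 for class 0 and 2/3 for class 1.
print(round(100 * mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=3), 2))  # 58.33
```

The result is scaled by 100 in the usage line because the scores in the tables appear to be on a 0 to 100 scale; for this benchmark `num_classes` would be 150.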


Primary metric: mIoU · higher is better
All metrics: mIoU, miou

mIoU · primary · 6 rows
# | Model | Org | Submitted | Paper / code | mIoU
01 | InternImage-H · Open | Shanghai AI Lab | Dec 2025 | arxiv-paper | 62.90
02 | BEiT-3 (ViT-L) · Open | Microsoft | Mar 2026 | arxiv | 62.80
03 | DINOv2 (ViT-g) + Linear · Open | Meta AI | Mar 2026 | arxiv | 62
04 | Mask2Former (Swin-L) · Open | Meta AI | Mar 2026 | arxiv | 57.30
05 | Mask2Former (Swin-L) · Open | Meta AI / UIUC | Dec 2025 | arxiv-paper | 57.30
06 | Swin-L + UperNet · Open | Microsoft | Mar 2026 | arxiv | 53.50

miou · 15 rows
# | Model | Org | Submitted | Paper / code | miou
01 | DINOv3 + Mask2Former (simple) | | Aug 2025 | DINOv3 · code | 62.60
02 | EoMT (ViT-L) | | Mar 2025 | Your ViT is Secretly an Image Segmentation Model · code | 58.40
03 | BEiT-L+ | | Jun 2021 | BEiT: BERT Pre-Training of Image Transformers · code | 57.90
04 | OneFormer (Swin-L) | | Nov 2022 | OneFormer: One Transformer to Rule Universal Image Segme… · code | 57
05 | Mask2Former + Swin-L-FaPN | | Dec 2021 | Masked-attention Mask Transformer for Universal Image Se… · code | 56.40
06 | Mask2Former (Swin-L) · Open | Meta AI / UIUC | Dec 2021 | Masked-attention Mask Transformer for Universal Image Se… · code | 56.10
07 | DINOv3 + linear probe | | Aug 2025 | DINOv3 · code | 55.90
08 | ConvNeXt (XL) | | Jan 2022 | A ConvNet for the 2020s · code | 54
09 | MAE (ViT-H, 448) | | Nov 2021 | Masked Autoencoders Are Scalable Vision Learners · code | 53.60
10 | DINOv2 (ViT-g/14) | | Apr 2023 | DINOv2: Learning Robust Visual Features without Supervis… · code | 53
11 | Mask2Former + Swin-T | | Dec 2021 | Masked-attention Mask Transformer for Universal Image Se… · code | 47.70
12 | Mask2Former + ResNet-50 | | Dec 2021 | Masked-attention Mask Transformer for Universal Image Se… · code | 47.20
13 | MaskFormer (Swin-T) | | Jul 2021 | Per-Pixel Classification is Not All You Need for Semanti… · code | 46.70
14 | SigLIP 2 (g/16) | | Feb 2025 | SigLIP 2: Multilingual Vision-Language Encoders with Imp… · code | 45.40
15 | SegFormer (MiT-B0) | | May 2021 | SegFormer: Simple and Efficient Design for Semantic Segm… · code | 37.40
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
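The ordering rule (score descending, ties broken by submission date) can be sketched as a single sort with a compound key. The page does not say which direction the tie-break runs, so this sketch assumes the earlier submission ranks higher and exposes a flag for the opposite. `Entry`, `rank`, and `earliest_first` are hypothetical names, and the day-of-month values are placeholders because the tables give only month and year.

```python
from datetime import date
from typing import List, NamedTuple


class Entry(NamedTuple):
    model: str
    submitted: date
    score: float


def rank(entries: List[Entry], earliest_first: bool = True) -> List[Entry]:
    """Sort by score (high to low); break ties on submission date."""
    def key(e: Entry):
        day = e.submitted.toordinal()
        # Negating the score gives descending order under an ascending sort.
        return (-e.score, day if earliest_first else -day)
    return sorted(entries, key=key)


# The two tied 57.30 rows from the mIoU table, plus the top row.
rows = [
    Entry("Mask2Former (Swin-L), Meta AI", date(2026, 3, 1), 57.30),
    Entry("Mask2Former (Swin-L), Meta AI / UIUC", date(2025, 12, 1), 57.30),
    Entry("InternImage-H", date(2025, 12, 1), 62.90),
]
for position, entry in enumerate(rank(rows), start=1):
    print(f"{position:02d} {entry.model} {entry.score:.2f}")
```

With `earliest_first=False` the two tied rows come out in the order they are listed in the mIoU table, where the Mar 2026 row precedes the Dec 2025 row.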
§ 03 · Progress

1 step
of state of the art.

Each row below marks a model that broke the previous record on mIoU. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores win. Each subsequent entry improved upon the previous best.

SOTA line · mIoU
  1. Dec 18, 2025 · InternImage-H · Shanghai AI Lab · 62.90
Fig 3 · SOTA-setting models only. 1 entry spans Dec 2025 to Dec 2025.
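The SOTA line described in this section amounts to a running maximum over submissions in date order: an entry is kept only if its score strictly exceeds every earlier one. A minimal sketch follows, assuming entries are `(date, model, score)` tuples; `sota_steps` is a hypothetical name, and the Mar 2026 day-of-month values are placeholders since the leaderboard lists only the month.

```python
from datetime import date
from typing import Iterable, List, Tuple

Row = Tuple[date, str, float]


def sota_steps(entries: Iterable[Row]) -> List[Row]:
    """Return the entries that set a new best score, in date order."""
    steps: List[Row] = []
    best = float("-inf")
    for submitted, model, score in sorted(entries, key=lambda e: e[0]):
        if score > best:  # a tie does not count as breaking the record
            steps.append((submitted, model, score))
            best = score
    return steps


# Top three rows of the mIoU table.
rows = [
    (date(2026, 3, 1), "BEiT-3 (ViT-L)", 62.80),
    (date(2025, 12, 18), "InternImage-H", 62.90),
    (date(2026, 3, 1), "DINOv2 (ViT-g) + Linear", 62.0),
]
for submitted, model, score in sota_steps(rows):
    print(submitted.isoformat(), model, score)
```

On these three rows only InternImage-H is returned, which agrees with the single step listed above: the later submissions score lower and so do not extend the line.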
§ 04 · Literature

11 papers
tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

§ 06 · Contribute

Have a score that beats
this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with frozen commit + seed
  • 03 Declared evaluation environment (Python, deps)
  • 04 One row per metric declared by this dataset
  • 05 A contact so we can follow up on discrepancies
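The five requirements above could be checked before submitting with a small self-check like the one below. This is only a sketch: the `Submission` fields and the `missing_items` helper are hypothetical and do not reflect an actual Codesota submission format, and the URL, commit, and contact values are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Submission:
    checkpoint: str = ""                       # 01: public URL or API endpoint
    commit: str = ""                           # 02: frozen commit of the repro script
    seed: Optional[int] = None                 # 02: random seed used for the run
    python_version: str = ""                   # 03: interpreter version
    dependencies: Optional[List[str]] = None   # 03: pinned packages
    scores: Optional[Dict[str, float]] = None  # 04: one score per metric
    contact: str = ""                          # 05: who to ask about discrepancies


def missing_items(sub: Submission, declared_metrics: List[str]) -> List[str]:
    """List which of the five requirements the submission does not yet meet."""
    missing = []
    if not sub.checkpoint:
        missing.append("01 public checkpoint or API endpoint")
    if not sub.commit or sub.seed is None:
        missing.append("02 reproduction script with frozen commit + seed")
    if not sub.python_version or sub.dependencies is None:
        missing.append("03 declared evaluation environment")
    scores = sub.scores or {}
    if any(metric not in scores for metric in declared_metrics):
        missing.append("04 one row per declared metric")
    if not sub.contact:
        missing.append("05 contact")
    return missing


draft = Submission(
    checkpoint="https://example.com/model.ckpt",  # placeholder URL
    commit="0123abc",                             # placeholder commit id
    seed=0,
    python_version="3.11",
    dependencies=[],
    scores={"mIoU": 0.0},                         # placeholder score
    contact="author@example.com",                 # placeholder address
)
print(missing_items(draft, declared_metrics=["mIoU"]))
```

An empty list means all five items are present; the check only confirms that each field is filled in, not that the checkpoint or script actually reproduces the score.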