Mask2Former (Swin-L)

Meta AI · open-source · Unknown params · Masked-attention Mask Transformer + Swin-L

Unified architecture for panoptic, instance, and semantic segmentation. 57.3 mIoU on ADE20K. CVPR 2022.

§ 02 · Benchmarks

Every benchmark Mask2Former (Swin-L) has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	ADE20K	Computer Vision · Semantic Segmentation	mIoU	57.3%	#4/6	—	source ↗

The Rank column shows this model’s position against all other models scored on the same benchmark and metric (the number of competitors follows the slash). #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
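The rank and sort rules described above can be sketched as follows. This is a minimal illustration, not the site's actual implementation. The function names, the tie handling, and the choice to count the model itself in the number after the slash are all assumptions. The competitor entries are placeholders with made-up scores; only the 57.3 value comes from the table.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple


def rank_of(scores: Dict[str, float], model: str) -> Tuple[int, int]:
    """Return (position, field size) for `model`, assuming higher is better."""
    target = scores[model]
    # Position is 1 plus the number of strictly better scores,
    # so tied models share a position (an assumption, not a documented rule).
    position = 1 + sum(1 for s in scores.values() if s > target)
    return position, len(scores)


def format_rank(position: int, total: int) -> str:
    """Render a rank in the '#position/total' style used by the Rank column."""
    return f"#{position}/{total}"


Row = Tuple[str, int, Optional[date]]  # (benchmark, position, result date)


def sort_rows(rows: List[Row]) -> List[Row]:
    """Sort by rank ascending, then newest result first; undated rows go last."""
    def key(row: Row) -> Tuple[int, int]:
        _, position, when = row
        # Negated ordinal puts newer dates first; 0 sorts after any real date.
        return position, -when.toordinal() if when else 0
    return sorted(rows, key=key)


# Placeholder field: only the Mask2Former score is taken from the table.
example_scores = {
    "Mask2Former (Swin-L)": 57.3,
    "placeholder-a": 60.0,  # hypothetical value
    "placeholder-b": 50.0,  # hypothetical value
}
print(format_rank(*rank_of(example_scores, "Mask2Former (Swin-L)")))
```

With a single undated row, as in the table above, the sort has nothing to reorder. The date tiebreak only matters once a model has several results at the same rank.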

§ 03 · Strengths by area

Where Mask2Former (Swin-L) actually performs.

Computer Vision

1 benchmark · avg rank #4.0

§ 05 · Related models

Other Meta AI models scored on Codesota.

GENRE

1 result · 1 SOTA

SeamlessM4T v2 Large

2.3B params · 1 result · 1 SOTA

wav2vec 2.0 Large (960h)

317M params · 3 results

HuBERT Large (LS-960)

317M params · 2 results

DINOv2 (ViT-g) + Linear

Unknown params · 1 result

Fairseq S2T (MuST-C)

~150M params · 1 result

MusicGen Large

3.3B params · 1 result

Voicebox

330M params · 1 result

§ 06 · Sources & freshness

Where these numbers come from.

arxiv · 1 result

0 of 1 rows marked verified.