ViTDet-H.

Meta AI · open-source · Unknown params · Plain ViT-Huge + Cascade Mask R-CNN

Plain, non-hierarchical ViT backbone for object detection. 53.4 box AP on LVIS v1.0. ECCV 2022.

§ 02 · Benchmarks

Every benchmark ViTDet-H has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	LVIS v1.0	Computer Vision · Object Detection	mask-ap	53.4%	#7/9	—	source ↗

The Rank column shows this model’s position among all models scored on the same benchmark and metric (number of competitors after the slash). #1 in red means current SOTA. Rows are sorted by rank, then by newest result.
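As a rough illustration of the ranking rule described above, here is a minimal Python sketch. It is not Codesota's actual implementation: the `Result` record, the `rank_of` and `sort_rows` helpers, and the peer scores are all hypothetical. It also assumes that higher scores are better, that the number after the slash counts every model scored on the same benchmark and metric (including this one), and that rows without a date sort after dated rows of equal rank.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Result:
    """Hypothetical record for one leaderboard row."""
    model: str
    benchmark: str
    metric: str
    value: float
    recorded: Optional[date] = None  # None stands in for a missing date ("—")


def rank_of(target: Result, results: List[Result]) -> Tuple[int, int]:
    """Return (rank, total) among results sharing target's benchmark and metric.

    Assumption: higher value is better, and total includes the target itself.
    """
    peers = [
        r for r in results
        if r.benchmark == target.benchmark and r.metric == target.metric
    ]
    better = sum(1 for r in peers if r.value > target.value)
    return better + 1, len(peers)


def sort_rows(rows: List[Tuple[int, Optional[date], str]]) -> List[Tuple[int, Optional[date], str]]:
    """Sort (rank, recorded, label) rows by rank, then newest result first.

    Assumption: rows with no date come last among rows of equal rank.
    """
    return sorted(
        rows,
        key=lambda row: (row[0], -row[1].toordinal() if row[1] else float("inf")),
    )


# Hypothetical peer scores; only the 53.4 value comes from the table above.
scores = [60.0, 59.0, 58.0, 57.0, 56.0, 55.0, 53.4, 50.0, 45.0]
results = [
    Result(f"model-{i}", "LVIS v1.0", "mask-ap", v) for i, v in enumerate(scores)
]
# A row on a different metric must not count as a competitor.
results.append(Result("model-x", "LVIS v1.0", "other-metric", 99.0))

target = results[6]
print(rank_of(target, results))  # (7, 9)
```

With these made-up peers, six models score higher than 53.4, so the sketch reproduces a "#7/9" style rank; the row on a different metric is ignored.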

§ 03 · Strengths by area

Where ViTDet-H actually performs.

Computer Vision

1 benchmark · avg rank #7.0

§ 05 · Related models

Other Meta AI models scored on Codesota.

GENRE

1 result · 1 SOTA

SeamlessM4T v2 Large

2.3B params · 1 result · 1 SOTA

wav2vec 2.0 Large (960h)

317M params · 3 results

HuBERT Large (LS-960)

317M params · 2 results

DINOv2 (ViT-g) + Linear

Unknown params · 1 result

Fairseq S2T (MuST-C)

~150M params · 1 result

Mask2Former (Swin-L)

Unknown params · 1 result

MusicGen Large

3.3B params · 1 result

§ 06 · Sources & freshness

Where these numbers come from.

arxiv

1 result

0 of 1 rows marked verified.