Codesota · Benchmark · publaynet-val

publaynet-val.

publaynet-val is a document layout analysis benchmark (the PubLayNet validation set) indexed on Codesota. This page tracks published model results, top scores per metric, and the SOTA timeline for publaynet-val.

§ 01 · Leaderboard

Results by metric.

Table

Table is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Table: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	Hybrid DLA (Shehzadi et al.)	Hybrid Approach for DLA. Table AP 98.6% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888.	unverified	0.99	2024	Source

Figure

Figure is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Figure: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	Hybrid DLA (Shehzadi et al.)	Hybrid Approach for DLA. Figure AP 98.5% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888.	unverified	0.98	2024	Source

Table

Table is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Table: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	DETR	From paper: Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	verified	0.98	2023	Paper
02	VGT	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.98	2023	Paper, Code
03	LayoutLMv3-B	From paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	verified	0.98	2022	Paper, Code
04	CDeC-Net	From paper: CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images	verified	0.98	2020	Paper, Code
05	DiT-L	From paper: DiT: Self-supervised Pre-training for Document Image Transformer	verified	0.98	2022	Paper, Code
06	DoPTA	From paper: DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	verified	0.98	2024	Paper
07	ResNeXt-101-32×8d	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.98	2023	Paper, Code
08	TRDLU	From paper: Transformer-based Approach for Document Understanding	verified	0.98	2022	Paper
09	VSR	From paper: VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations	verified	0.97	2021	Paper, Code
10	BEiT-B	From paper: BEiT: BERT Pre-Training of Image Transformers	verified	0.97	2021	Paper, Code
11	UDoc	From paper: Unified Pretraining Framework for Document Understanding	verified	0.97	2022	Paper
12	DeiT-B	From paper: Training data-efficient image transformers & distillation through attention	verified	0.97	2020	Paper, Code
13	Mask R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.96	2019	Paper, Code
14	Faster R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.95	2019	Paper, Code
15	GLAM	From paper: A Graphical Approach to Document Layout Analysis	verified	0.87	2023	Paper, Code

Text

Text is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Text: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	Hybrid DLA (Shehzadi et al.)	Hybrid Approach for DLA. Text AP 98.0% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888.	unverified	0.98	2024	Source

Figure

Figure is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Figure: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	DETR	From paper: Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	verified	0.97	2023	Paper
02	DiT-L	From paper: DiT: Self-supervised Pre-training for Document Image Transformer	verified	0.97	2022	Paper, Code
03	VGT	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.97	2023	Paper, Code
04	DoPTA	From paper: DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	verified	0.97	2024	Paper
05	LayoutLMv3-B	From paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	verified	0.97	2022	Paper, Code
06	ResNeXt-101-32×8d	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.97	2023	Paper, Code
07	TRDLU	From paper: Transformer-based Approach for Document Understanding	verified	0.97	2022	Paper
08	VSR	From paper: VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations	verified	0.96	2021	Paper, Code
09	UDoc	From paper: Unified Pretraining Framework for Document Understanding	verified	0.96	2022	Paper
10	DeiT-B	From paper: Training data-efficient image transformers & distillation through attention	verified	0.96	2020	Paper, Code
11	BEiT-B	From paper: BEiT: BERT Pre-Training of Image Transformers	verified	0.96	2021	Paper, Code
12	Mask R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.95	2019	Paper, Code
13	Faster R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.94	2019	Paper, Code
14	GLAM	From paper: A Graphical Approach to Document Layout Analysis	verified	0.21	2023	Paper, Code

List

List is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for List: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	TRDLU	From paper: Transformer-based Approach for Document Understanding	verified	0.97	2022	Paper
02	VGT	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.97	2023	Paper, Code
03	DETR	From paper: Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	verified	0.96	2023	Paper
04	DiT-L	From paper: DiT: Self-supervised Pre-training for Document Image Transformer	verified	0.96	2022	Paper, Code
05	DoPTA	From paper: DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	verified	0.96	2024	Paper
06	LayoutLMv3-B	From paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	verified	0.95	2022	Paper, Code
07	VSR	From paper: VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations	verified	0.95	2021	Paper, Code
08	ResNeXt-101-32×8d	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.94	2023	Paper, Code
09	UDoc	From paper: Unified Pretraining Framework for Document Understanding	verified	0.94	2022	Paper
10	BEiT-B	From paper: BEiT: BERT Pre-Training of Image Transformers	verified	0.92	2021	Paper, Code
11	DeiT-B	From paper: Training data-efficient image transformers & distillation through attention	verified	0.92	2020	Paper, Code
12	Mask R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.89	2019	Paper, Code
13	Faster R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.88	2019	Paper, Code
14	GLAM	From paper: A Graphical Approach to Document Layout Analysis	verified	0.86	2023	Paper, Code

Overall

Overall is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Overall: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	Hybrid DLA (Shehzadi et al.)	Hybrid Approach for DLA. Overall mAP 97.3% on PubLayNet-val. Best result as of ICDAR 2024. Query encoding + hybrid matching. arXiv 2404.17888.	unverified	0.97	2024	Source
02	RoDLA	RoDLA. 96.0 overall mAP on clean PubLayNet-val. DINO + InternImage backbone. Robustness benchmark paper. CVPR 2024. arXiv 2403.14442.	unverified	0.96	2024	Source
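
The Hybrid DLA rows on this page quote five per-category AP values (Text 98.0, Title 94.2, List 97.3, Table 98.6, Figure 98.5) alongside an overall mAP of 97.3%. The page does not say how Overall is derived, so the sketch below assumes it is the unweighted mean of the five category APs. Under that assumption the quoted numbers are consistent with each other:

```python
# Per-category AP values (percent) as quoted in the Hybrid DLA rows on this page.
per_class_ap = {
    "Text": 98.0,
    "Title": 94.2,
    "List": 97.3,
    "Table": 98.6,
    "Figure": 98.5,
}

# Assumption: Overall mAP is the unweighted mean over the five categories.
overall_map = sum(per_class_ap.values()) / len(per_class_ap)

print(f"{overall_map:.2f}")  # 97.32, which rounds to the quoted 97.3
```

The same check may not hold for the other rows, because the Score column shows values rounded to two decimals.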

List

List is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for List: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	Hybrid DLA (Shehzadi et al.)	Hybrid Approach for DLA. List AP 97.3% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888.	unverified	0.97	2024	Source

Text

Text is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Text: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	VSR	From paper: VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations	verified	0.97	2021	Paper, Code
02	TRDLU	From paper: Transformer-based Approach for Document Understanding	verified	0.96	2022	Paper
03	VGT	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.95	2023	Paper, Code
04	DETR	From paper: Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	verified	0.95	2023	Paper
05	LayoutLMv3-B	From paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	verified	0.94	2022	Paper, Code
06	DoPTA	From paper: DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	verified	0.94	2024	Paper
07	DiT-L	From paper: DiT: Self-supervised Pre-training for Document Image Transformer	verified	0.94	2022	Paper, Code
08	UDoc	From paper: Unified Pretraining Framework for Document Understanding	verified	0.94	2022	Paper
09	BEiT-B	From paper: BEiT: BERT Pre-Training of Image Transformers	verified	0.93	2021	Paper, Code
10	DeiT-B	From paper: Training data-efficient image transformers & distillation through attention	verified	0.93	2020	Paper, Code
11	ResNeXt-101-32×8d	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.93	2023	Paper, Code
12	Mask R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.92	2019	Paper, Code
13	Faster R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.91	2019	Paper, Code
14	GLAM	From paper: A Graphical Approach to Document Layout Analysis	verified	0.88	2023	Paper, Code

Overall

Overall is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Overall: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	VGT	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.96	2023	Paper, Code
02	TRDLU	From paper: Transformer-based Approach for Document Understanding	verified	0.96	2022	Paper
03	VSR	From paper: VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations	verified	0.96	2021	Paper, Code
04	DETR	From paper: Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	verified	0.96	2023	Paper
05	LayoutLMv3-B	From paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	verified	0.95	2022	Paper, Code
06	DiT-L	From paper: DiT: Self-supervised Pre-training for Document Image Transformer	verified	0.95	2022	Paper, Code
07	DoPTA	From paper: DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	verified	0.95	2024	Paper
08	UDoc	From paper: Unified Pretraining Framework for Document Understanding	verified	0.94	2022	Paper
09	ResNeXt-101-32×8d	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.94	2023	Paper, Code
10	DeiT-B	From paper: Training data-efficient image transformers & distillation through attention	verified	0.93	2020	Paper, Code
11	BEiT-B	From paper: BEiT: BERT Pre-Training of Image Transformers	verified	0.93	2021	Paper, Code
12	Mask R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.91	2019	Paper, Code
13	Faster R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.90	2019	Paper, Code
14	GLAM	From paper: A Graphical Approach to Document Layout Analysis	verified	0.72	2023	Paper, Code

Title

Title is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Title: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	Hybrid DLA (Shehzadi et al.)	Hybrid Approach for DLA. Title AP 94.2% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888.	unverified	0.94	2024	Source

Title

Title is one of the evaluation metrics reported for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Title: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published: an earlier or same-year result already scored better.

Rank	Model	Notes	Trust	Score	Year	Links
01	VGT	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.94	2023	Paper, Code
02	VSR	From paper: VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations	verified	0.93	2021	Paper, Code
03	TRDLU	From paper: Transformer-based Approach for Document Understanding	verified	0.92	2022	Paper
04	DETR	From paper: Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	verified	0.92	2023	Paper
05	LayoutLMv3-B	From paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	verified	0.91	2022	Paper, Code
06	DoPTA	From paper: DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	verified	0.90	2024	Paper
07	DiT-L	From paper: DiT: Self-supervised Pre-training for Document Image Transformer	verified	0.89	2022	Paper, Code
08	UDoc	From paper: Unified Pretraining Framework for Document Understanding	verified	0.89	2022	Paper
09	DeiT-B	From paper: Training data-efficient image transformers & distillation through attention	verified	0.87	2020	Paper, Code
10	BEiT-B	From paper: BEiT: BERT Pre-Training of Image Transformers	verified	0.87	2021	Paper, Code
11	ResNeXt-101-32×8d	From paper: Vision Grid Transformer for Document Layout Analysis	verified	0.86	2023	Paper, Code
12	Mask R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.84	2019	Paper, Code
13	Faster R-CNN	From paper: PubLayNet: largest dataset ever for document layout analysis	verified	0.83	2019	Paper, Code
14	GLAM	From paper: A Graphical Approach to Document Layout Analysis	verified	0.80	2023	Paper, Code
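
The muting rule stated above this table (a row is muted when an earlier or same-year result already scored better) can be sketched in a few lines. This is one reading of the rule, not Codesota's actual implementation. `Row` and `is_muted` are hypothetical names, and the scores are the rounded values from a subset of the Title rows, so ties at two decimals may hide differences in the unrounded numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    score: float  # rounded score as displayed in the table
    year: int


def is_muted(row: Row, rows: list[Row]) -> bool:
    """Assumed rule: muted if any row from the same or an earlier year scores strictly higher."""
    return any(
        other.year <= row.year and other.score > row.score
        for other in rows
        if other is not row
    )


# A subset of the Title rows listed above.
rows = [
    Row("VGT", 0.94, 2023),
    Row("VSR", 0.93, 2021),
    Row("TRDLU", 0.92, 2022),
    Row("DeiT-B", 0.87, 2020),
    Row("BEiT-B", 0.87, 2021),
    Row("Mask R-CNN", 0.84, 2019),
    Row("Faster R-CNN", 0.83, 2019),
]

muted = [r.model for r in rows if is_muted(r, rows)]
print(muted)  # ['TRDLU', 'BEiT-B', 'Faster R-CNN']
```

Under this reading, TRDLU (2022) is muted because VSR scored higher in 2021, while DeiT-B (2020) is not, since nothing published by 2020 in this subset beat it.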
