Document Layout Analysis | 2020 | en

publaynet-val

Dataset from Papers With Code

Metrics: accuracy, cer, wer, f1

figure

#	Model	Score	Paper / Code	Date
1	DETR	0.975	Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 2023
2	DiT-L	0.972	DiT: Self-supervised Pre-training for Document Image Transformer Code	Mar 2022
3	VGT	0.971	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
4	DoPTA	0.970	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Dec 2024
5	LayoutLMv3-B	0.970	LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Code	Apr 2022
6	ResNext-101-32×8d	0.968	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
7	TRDLU	0.966	Transformer-based Approach for Document Understanding	Oct 2022
8	VSR	0.964	VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Code	May 2021
9	UDoc	0.964	Unified Pretraining Framework for Document Understanding	Apr 2022
10	DeiT-B	0.957	Training data-efficient image transformers & distillation through attention Code	Dec 2020
11	BEiT-B	0.957	BEiT: BERT Pre-Training of Image Transformers Code	Jun 2021
12	Mask RCNN	0.949	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
13	Faster_RCNN	0.937	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
14	GLAM	0.206	A Graphical Approach to Document Layout Analysis Code	Aug 2023

list

#	Model	Score	Paper / Code	Date
1	TRDLU	0.975	Transformer-based Approach for Document Understanding	Oct 2022
2	VGT	0.968	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
3	DETR	0.964	Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 2023
4	DiT-L	0.960	DiT: Self-supervised Pre-training for Document Image Transformer Code	Mar 2022
5	DoPTA	0.957	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Dec 2024
6	LayoutLMv3-B	0.955	LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Code	Apr 2022
7	VSR	0.947	VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Code	May 2021
8	ResNext-101-32×8d	0.940	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
9	UDoc	0.937	Unified Pretraining Framework for Document Understanding	Apr 2022
10	BEiT-B	0.924	BEiT: BERT Pre-Training of Image Transformers Code	Jun 2021
11	DeiT-B	0.921	Training data-efficient image transformers & distillation through attention Code	Dec 2020
12	Mask RCNN	0.886	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
13	Faster_RCNN	0.883	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
14	GLAM	0.862	A Graphical Approach to Document Layout Analysis Code	Aug 2023

overall

#	Model	Score	Paper / Code	Date
1	VGT	0.962	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
2	TRDLU	0.959	Transformer-based Approach for Document Understanding	Oct 2022
3	VSR	0.957	VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Code	May 2021
4	DETR	0.957	Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 2023
5	LayoutLMv3-B	0.951	LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Code	Apr 2022
6	DiT-L	0.949	DiT: Self-supervised Pre-training for Document Image Transformer Code	Mar 2022
7	DoPTA	0.949	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Dec 2024
8	UDoc	0.939	Unified Pretraining Framework for Document Understanding	Apr 2022
9	ResNext-101-32×8d	0.935	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
10	DeiT-B	0.932	Training data-efficient image transformers & distillation through attention Code	Dec 2020
11	BEiT-B	0.931	BEiT: BERT Pre-Training of Image Transformers Code	Jun 2021
12	Mask RCNN	0.910	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
13	Faster_RCNN	0.902	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
14	GLAM	0.722	A Graphical Approach to Document Layout Analysis Code	Aug 2023

table

#	Model	Score	Paper / Code	Date
1	DETR	0.981	Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 2023
2	VGT	0.981	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
3	LayoutLMv3-B	0.979	LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Code	Apr 2022
4	CDeC-Net	0.978	CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images Code	Aug 2020
5	DiT-L	0.978	DiT: Self-supervised Pre-training for Document Image Transformer Code	Mar 2022
6	DoPTA	0.977	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Dec 2024
7	ResNext-101-32×8d	0.976	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
8	TRDLU	0.976	Transformer-based Approach for Document Understanding	Oct 2022
9	VSR	0.974	VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Code	May 2021
10	UDoc	0.973	Unified Pretraining Framework for Document Understanding	Apr 2022
11	BEiT-B	0.973	BEiT: BERT Pre-Training of Image Transformers Code	Jun 2021
12	DeiT-B	0.972	Training data-efficient image transformers & distillation through attention Code	Dec 2020
13	Mask RCNN	0.960	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
14	Faster_RCNN	0.954	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
15	GLAM	0.868	A Graphical Approach to Document Layout Analysis Code	Aug 2023

text

#	Model	Score	Paper / Code	Date
1	VSR	0.967	VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Code	May 2021
2	TRDLU	0.958	Transformer-based Approach for Document Understanding	Oct 2022
3	VGT	0.950	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
4	DETR	0.947	Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 2023
5	LayoutLMv3-B	0.945	LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Code	Apr 2022
6	DoPTA	0.944	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Dec 2024
7	DiT-L	0.944	DiT: Self-supervised Pre-training for Document Image Transformer Code	Mar 2022
8	UDoc	0.939	Unified Pretraining Framework for Document Understanding	Apr 2022
9	BEiT-B	0.934	BEiT: BERT Pre-Training of Image Transformers Code	Jun 2021
10	DeiT-B	0.934	Training data-efficient image transformers & distillation through attention Code	Dec 2020
11	ResNext-101-32×8d	0.930	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
12	Mask RCNN	0.916	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
13	Faster_RCNN	0.910	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
14	GLAM	0.878	A Graphical Approach to Document Layout Analysis Code	Aug 2023

title

#	Model	Score	Paper / Code	Date
1	VGT	0.939	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
2	VSR	0.931	VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Code	May 2021
3	TRDLU	0.921	Transformer-based Approach for Document Understanding	Oct 2022
4	DETR	0.918	Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 2023
5	LayoutLMv3-B	0.906	LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Code	Apr 2022
6	DoPTA	0.895	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Dec 2024
7	DiT-L	0.893	DiT: Self-supervised Pre-training for Document Image Transformer Code	Mar 2022
8	UDoc	0.885	Unified Pretraining Framework for Document Understanding	Apr 2022
9	DeiT-B	0.874	Training data-efficient image transformers & distillation through attention Code	Dec 2020
10	BEiT-B	0.866	BEiT: BERT Pre-Training of Image Transformers Code	Jun 2021
11	ResNext-101-32×8d	0.862	Vision Grid Transformer for Document Layout Analysis Code	Aug 2023
12	Mask RCNN	0.840	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
13	Faster_RCNN	0.826	PubLayNet: largest dataset ever for document layout analysis Code	Aug 2019
14	GLAM	0.800	A Graphical Approach to Document Layout Analysis Code	Aug 2023
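
Note on the overall column: the page does not say how the overall score is computed. The listed values look consistent with an unweighted mean of the five per-category scores (figure, list, table, text, title), within rounding. The sketch below checks that assumption for three models using numbers copied from the tables above; the function name `macro_average` is illustrative and not part of the benchmark.

```python
# Per-category scores copied from the tables above,
# in the order: figure, list, table, text, title.
per_category = {
    "VGT":   [0.971, 0.968, 0.981, 0.950, 0.939],
    "TRDLU": [0.966, 0.975, 0.976, 0.958, 0.921],
    "DETR":  [0.975, 0.964, 0.981, 0.947, 0.918],
}

# Scores as listed in the "overall" table.
listed_overall = {"VGT": 0.962, "TRDLU": 0.959, "DETR": 0.957}


def macro_average(scores):
    """Unweighted mean of the per-category scores (an assumed formula)."""
    return sum(scores) / len(scores)


for model, scores in per_category.items():
    mean = macro_average(scores)
    # Compare the recomputed mean with the value shown on the leaderboard.
    print(f"{model}: mean={mean:.4f} listed={listed_overall[model]:.3f}")
```

Because the per-category values are already rounded to three decimals, a recomputed mean can differ from the listed overall score by about 0.001 for some models.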

Related Papers (12)

DoPTA: Improving Document Layout Analysis using Patch-Text Alignment

Dec 2024 | Models: DoPTA

Vision Grid Transformer for Document Layout Analysis

Aug 2023 | Models: VGT, ResNext-101-32×8d

A Graphical Approach to Document Layout Analysis

Aug 2023 | Models: GLAM

Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images

Jun 2023 | Models: DETR

Unified Pretraining Framework for Document Understanding

Apr 2022 | Models: UDoc

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

Apr 2022 | Models: LayoutLMv3-B

DiT: Self-supervised Pre-training for Document Image Transformer

Mar 2022 | Models: DiT-L

BEiT: BERT Pre-Training of Image Transformers

Jun 2021 | Models: BEiT-B

VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

May 2021 | Models: VSR

Training data-efficient image transformers & distillation through attention

Dec 2020 | Models: DeiT-B

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Aug 2020 | Models: CDeC-Net

PubLayNet: largest dataset ever for document layout analysis

Aug 2019 | Models: Mask RCNN, Faster_RCNN

Other Document Layout Analysis Datasets

d4la, u-diads-bib, document-layout-recognition-challenge-mini-dev, document-layout-recognition-challenge-test

publaynet-val Benchmark - Document Layout Analysis | CodeSOTA