publaynet-val.

publaynet-val is the PubLayNet validation set, a document layout analysis benchmark indexed on Codesota. This page tracks published model results, top scores per metric, and the SOTA timeline for publaynet-val.

§ 01 · Leaderboard

Results by metric.

Table

Table is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Table: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | Hybrid DLA (Shehzadi et al.) | Hybrid Approach for DLA. Table AP 98.6% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888. | unverified | 0.99 | 2024 | Source |
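
The entry above quotes Table AP as a percentage (98.6%), while its Score column shows 0.99. A minimal sketch of that relationship, assuming the displayed score is the percentage divided by 100 and formatted to two decimals. The function name `percent_to_score` is hypothetical and is not part of any Codesota API.

```python
def percent_to_score(ap_percent: float) -> str:
    """Format an AP percentage (e.g. 98.6) as a two-decimal fraction string.

    Assumption: the Score column is the percentage divided by 100,
    shown with two decimal places.
    """
    if not 0.0 <= ap_percent <= 100.0:
        raise ValueError(f"AP percentage out of range: {ap_percent}")
    return f"{ap_percent / 100:.2f}"


# Table AP 98.6% from the Hybrid DLA entry
print(percent_to_score(98.6))  # 0.99
```

Because the score is shown to two decimals, entries that differ only in the third decimal appear tied in the Score column.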

Figure

Figure is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Figure: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | Hybrid DLA (Shehzadi et al.) | Hybrid Approach for DLA. Figure AP 98.5% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888. | unverified | 0.98 | 2024 | Source |

Table

Table is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Table: verified, paper, vendor, community, unverified

| Rank | Model | Source paper | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | DETR | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | verified | 0.98 | 2023 | Paper |
| 02 | VGT | Vision Grid Transformer for Document Layout Analysis | verified | 0.98 | 2023 | Paper, Code |
| 03 | LayoutLMv3-B | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | verified | 0.98 | 2022 | Paper, Code |
| 04 | CDeC-Net | CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images | verified | 0.98 | 2020 | Paper, Code |
| 05 | DiT-L | DiT: Self-supervised Pre-training for Document Image Transformer | verified | 0.98 | 2022 | Paper, Code |
| 06 | DoPTA | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | verified | 0.98 | 2024 | Paper |
| 07 | ResNext-101-32×8d | Vision Grid Transformer for Document Layout Analysis | verified | 0.98 | 2023 | Paper, Code |
| 08 | TRDLU | Transformer-based Approach for Document Understanding | verified | 0.98 | 2022 | Paper |
| 09 | VSR | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | verified | 0.97 | 2021 | Paper, Code |
| 10 | BEiT-B | BEiT: BERT Pre-Training of Image Transformers | verified | 0.97 | 2021 | Paper, Code |
| 11 | UDoc | Unified Pretraining Framework for Document Understanding | verified | 0.97 | 2022 | Paper |
| 12 | DeiT-B | Training data-efficient image transformers & distillation through attention | verified | 0.97 | 2020 | Paper, Code |
| 13 | Mask R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.96 | 2019 | Paper, Code |
| 14 | Faster R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.95 | 2019 | Paper, Code |
| 15 | GLAM | A Graphical Approach to Document Layout Analysis | verified | 0.87 | 2023 | Paper, Code |

Text

Text is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Text: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | Hybrid DLA (Shehzadi et al.) | Hybrid Approach for DLA. Text AP 98.0% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888. | unverified | 0.98 | 2024 | Source |

Figure

Figure is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Figure: verified, paper, vendor, community, unverified

| Rank | Model | Source paper | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | DETR | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | verified | 0.97 | 2023 | Paper |
| 02 | DiT-L | DiT: Self-supervised Pre-training for Document Image Transformer | verified | 0.97 | 2022 | Paper, Code |
| 03 | VGT | Vision Grid Transformer for Document Layout Analysis | verified | 0.97 | 2023 | Paper, Code |
| 04 | DoPTA | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | verified | 0.97 | 2024 | Paper |
| 05 | LayoutLMv3-B | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | verified | 0.97 | 2022 | Paper, Code |
| 06 | ResNext-101-32×8d | Vision Grid Transformer for Document Layout Analysis | verified | 0.97 | 2023 | Paper, Code |
| 07 | TRDLU | Transformer-based Approach for Document Understanding | verified | 0.97 | 2022 | Paper |
| 08 | VSR | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | verified | 0.96 | 2021 | Paper, Code |
| 09 | UDoc | Unified Pretraining Framework for Document Understanding | verified | 0.96 | 2022 | Paper |
| 10 | DeiT-B | Training data-efficient image transformers & distillation through attention | verified | 0.96 | 2020 | Paper, Code |
| 11 | BEiT-B | BEiT: BERT Pre-Training of Image Transformers | verified | 0.96 | 2021 | Paper, Code |
| 12 | Mask R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.95 | 2019 | Paper, Code |
| 13 | Faster R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.94 | 2019 | Paper, Code |
| 14 | GLAM | A Graphical Approach to Document Layout Analysis | verified | 0.21 | 2023 | Paper, Code |

List

List is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for List: verified, paper, vendor, community, unverified

| Rank | Model | Source paper | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | TRDLU | Transformer-based Approach for Document Understanding | verified | 0.97 | 2022 | Paper |
| 02 | VGT | Vision Grid Transformer for Document Layout Analysis | verified | 0.97 | 2023 | Paper, Code |
| 03 | DETR | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | verified | 0.96 | 2023 | Paper |
| 04 | DiT-L | DiT: Self-supervised Pre-training for Document Image Transformer | verified | 0.96 | 2022 | Paper, Code |
| 05 | DoPTA | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | verified | 0.96 | 2024 | Paper |
| 06 | LayoutLMv3-B | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | verified | 0.95 | 2022 | Paper, Code |
| 07 | VSR | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | verified | 0.95 | 2021 | Paper, Code |
| 08 | ResNext-101-32×8d | Vision Grid Transformer for Document Layout Analysis | verified | 0.94 | 2023 | Paper, Code |
| 09 | UDoc | Unified Pretraining Framework for Document Understanding | verified | 0.94 | 2022 | Paper |
| 10 | BEiT-B | BEiT: BERT Pre-Training of Image Transformers | verified | 0.92 | 2021 | Paper, Code |
| 11 | DeiT-B | Training data-efficient image transformers & distillation through attention | verified | 0.92 | 2020 | Paper, Code |
| 12 | Mask R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.89 | 2019 | Paper, Code |
| 13 | Faster R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.88 | 2019 | Paper, Code |
| 14 | GLAM | A Graphical Approach to Document Layout Analysis | verified | 0.86 | 2023 | Paper, Code |

Overall

Overall is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Overall: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | Hybrid DLA (Shehzadi et al.) | Hybrid Approach for DLA. Overall mAP 97.3% on PubLayNet-val. Best result as of ICDAR 2024. Query encoding + hybrid matching. arXiv 2404.17888. | unverified | 0.97 | 2024 | Source |
| 02 | RoDLA | RoDLA. 96.0 overall mAP on clean PubLayNet-val. DINO + InternImage backbone. Robustness benchmark paper. CVPR 2024. arXiv 2403.14442. | unverified | 0.96 | 2024 | Source |

List

List is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for List: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | Hybrid DLA (Shehzadi et al.) | Hybrid Approach for DLA. List AP 97.3% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888. | unverified | 0.97 | 2024 | Source |

Text

Text is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Text: verified, paper, vendor, community, unverified

| Rank | Model | Source paper | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | VSR | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | verified | 0.97 | 2021 | Paper, Code |
| 02 | TRDLU | Transformer-based Approach for Document Understanding | verified | 0.96 | 2022 | Paper |
| 03 | VGT | Vision Grid Transformer for Document Layout Analysis | verified | 0.95 | 2023 | Paper, Code |
| 04 | DETR | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | verified | 0.95 | 2023 | Paper |
| 05 | LayoutLMv3-B | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | verified | 0.94 | 2022 | Paper, Code |
| 06 | DoPTA | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | verified | 0.94 | 2024 | Paper |
| 07 | DiT-L | DiT: Self-supervised Pre-training for Document Image Transformer | verified | 0.94 | 2022 | Paper, Code |
| 08 | UDoc | Unified Pretraining Framework for Document Understanding | verified | 0.94 | 2022 | Paper |
| 09 | BEiT-B | BEiT: BERT Pre-Training of Image Transformers | verified | 0.93 | 2021 | Paper, Code |
| 10 | DeiT-B | Training data-efficient image transformers & distillation through attention | verified | 0.93 | 2020 | Paper, Code |
| 11 | ResNext-101-32×8d | Vision Grid Transformer for Document Layout Analysis | verified | 0.93 | 2023 | Paper, Code |
| 12 | Mask R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.92 | 2019 | Paper, Code |
| 13 | Faster R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.91 | 2019 | Paper, Code |
| 14 | GLAM | A Graphical Approach to Document Layout Analysis | verified | 0.88 | 2023 | Paper, Code |

Overall

Overall is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Overall: verified, paper, vendor, community, unverified

| Rank | Model | Source paper | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | VGT | Vision Grid Transformer for Document Layout Analysis | verified | 0.96 | 2023 | Paper, Code |
| 02 | TRDLU | Transformer-based Approach for Document Understanding | verified | 0.96 | 2022 | Paper |
| 03 | VSR | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | verified | 0.96 | 2021 | Paper, Code |
| 04 | DETR | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | verified | 0.96 | 2023 | Paper |
| 05 | LayoutLMv3-B | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | verified | 0.95 | 2022 | Paper, Code |
| 06 | DiT-L | DiT: Self-supervised Pre-training for Document Image Transformer | verified | 0.95 | 2022 | Paper, Code |
| 07 | DoPTA | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | verified | 0.95 | 2024 | Paper |
| 08 | UDoc | Unified Pretraining Framework for Document Understanding | verified | 0.94 | 2022 | Paper |
| 09 | ResNext-101-32×8d | Vision Grid Transformer for Document Layout Analysis | verified | 0.94 | 2023 | Paper, Code |
| 10 | DeiT-B | Training data-efficient image transformers & distillation through attention | verified | 0.93 | 2020 | Paper, Code |
| 11 | BEiT-B | BEiT: BERT Pre-Training of Image Transformers | verified | 0.93 | 2021 | Paper, Code |
| 12 | Mask R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.91 | 2019 | Paper, Code |
| 13 | Faster R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.90 | 2019 | Paper, Code |
| 14 | GLAM | A Graphical Approach to Document Layout Analysis | verified | 0.72 | 2023 | Paper, Code |

Title

Title is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Title: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | Hybrid DLA (Shehzadi et al.) | Hybrid Approach for DLA. Title AP 94.2% on PubLayNet-val. ICDAR 2024. arXiv 2404.17888. | unverified | 0.94 | 2024 | Source |

Title

Title is the reported evaluation metric for publaynet-val. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Title: verified, paper, vendor, community, unverified

| Rank | Model | Source paper | Trust | Score | Year | Links |
|---|---|---|---|---|---|---|
| 01 | VGT | Vision Grid Transformer for Document Layout Analysis | verified | 0.94 | 2023 | Paper, Code |
| 02 | VSR | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | verified | 0.93 | 2021 | Paper, Code |
| 03 | TRDLU | Transformer-based Approach for Document Understanding | verified | 0.92 | 2022 | Paper |
| 04 | DETR | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | verified | 0.92 | 2023 | Paper |
| 05 | LayoutLMv3-B | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | verified | 0.91 | 2022 | Paper, Code |
| 06 | DoPTA | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | verified | 0.90 | 2024 | Paper |
| 07 | DiT-L | DiT: Self-supervised Pre-training for Document Image Transformer | verified | 0.89 | 2022 | Paper, Code |
| 08 | UDoc | Unified Pretraining Framework for Document Understanding | verified | 0.89 | 2022 | Paper |
| 09 | DeiT-B | Training data-efficient image transformers & distillation through attention | verified | 0.87 | 2020 | Paper, Code |
| 10 | BEiT-B | BEiT: BERT Pre-Training of Image Transformers | verified | 0.87 | 2021 | Paper, Code |
| 11 | ResNext-101-32×8d | Vision Grid Transformer for Document Layout Analysis | verified | 0.86 | 2023 | Paper, Code |
| 12 | Mask R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.84 | 2019 | Paper, Code |
| 13 | Faster R-CNN | PubLayNet: largest dataset ever for document layout analysis | verified | 0.83 | 2019 | Paper, Code |
| 14 | GLAM | A Graphical Approach to Document Layout Analysis | verified | 0.80 | 2023 | Paper, Code |
