Document Image Classification

rvl-cdip

Dataset from Papers With Code

Metrics: accuracy, cer, wer, f1
Current State of the Art

EAML

Unknown

97.7

accuracy

accuracy Progress Over Time

Showing 5 breakthroughs from Apr 2017 to May 2023

[Chart: accuracy (y-axis, 90.3 to 98.4) by date (x-axis, Apr 2017 to May 2023)]

Key Milestones

| Date | Model | From paper | accuracy | Change vs. previous |
|---|---|---|---|---|
| Apr 2017 | Transfer Learning from AlexNet, VGG-16, GoogLeNet and ResNet50 | Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification | 91.0 | |
| Jan 2018 | Transfer Learning from VGG16 trained on Imagenet | Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks | 92.2 | +1.4% |
| Dec 2019 | Pre-trained LayoutLM | LayoutLM: Pre-training of Text and Layout for Document Image Understanding | 94.4 | +2.4% |
| Jun 2020 | Cross-Modal | Visual and Textual Deep Feature Fusion for Document Image Classification | 97.0 | +2.8% |
| May 2023 | EAML (current SOTA) | EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification | 97.7 | +0.7% |

Total Improvement: 7.4%
Time Span: 6y 2m
Breakthroughs: 5
Current SOTA: 97.7
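
The page does not state how the percentage changes for the milestones are derived. They appear consistent with the relative gain between consecutive milestone scores, computed from the two-decimal scores in the leaderboard rather than the rounded values shown with each milestone. A minimal sketch under that assumption (the function and variable names are illustrative, not taken from the page):

```python
# Milestone scores, using the two-decimal accuracy values from the leaderboard.
milestones = [
    ("Apr 2017", 90.97),
    ("Jan 2018", 92.21),
    ("Dec 2019", 94.42),
    ("Jun 2020", 97.05),
    ("May 2023", 97.70),
]


def relative_gain(previous: float, current: float) -> float:
    """Relative improvement of `current` over `previous`, in percent."""
    return (current - previous) / previous * 100.0


# Gain of each milestone over the one before it, rounded to one decimal.
step_gains = [
    round(relative_gain(prev, cur), 1)
    for (_, prev), (_, cur) in zip(milestones, milestones[1:])
]

# Gain of the latest milestone over the first one.
total_gain = round(relative_gain(milestones[0][1], milestones[-1][1]), 1)

print(step_gains)   # per-milestone relative gains
print(total_gain)   # overall relative gain
```

Under this assumption the sketch reproduces the listed +1.4%, +2.4%, +2.8%, +0.7% and the 7.4% total. These are relative changes, so the absolute gain in accuracy points (97.70 - 90.97 = 6.73) is smaller than the headline percentage.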

Top Models Performance Comparison

Top 10 models ranked by accuracy

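| Rank | Model | accuracy | % of best |
|---|---|---|---|
| 1 | EAML | 97.7 | 100.0% |
| 2 | Cross-Modal | 97.0 | 99.3% |
| 3 | DocFormerBASE | 96.2 | 98.4% |
| 4 | LayoutLMV3Large | 95.9 | 98.2% |
| 5 | LiLT[EN-R]BASE | 95.7 | 97.9% |
| 6 | LayoutLMv2LARGE | 95.6 | 97.9% |
| 7 | TILT-Large | 95.5 | 97.8% |
| 8 | DocFormer large | 95.5 | 97.7% |
| 9 | LayoutLMv3BASE | 95.4 | 97.7% |
| 10 | Donut | 95.3 | 97.5% |

Best Score: 97.7
Top Model: EAML
Models Compared: 10
Score Range: 2.4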
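
The "% of best" figures and the score range are not defined on the page. They appear to be each model's score divided by the top score, and the gap between the highest and lowest of the ten compared scores. A short sketch under that assumption, using the two-decimal leaderboard scores (the dictionary and variable names are illustrative):

```python
# Top 10 accuracy scores as listed in the leaderboard.
top10 = {
    "EAML": 97.7,
    "Cross-Modal": 97.05,
    "DocFormerBASE": 96.17,
    "LayoutLMV3Large": 95.93,
    "LiLT[EN-R]BASE": 95.68,
    "LayoutLMv2LARGE": 95.64,
    "TILT-Large": 95.52,
    "DocFormer large": 95.5,
    "LayoutLMv3BASE": 95.44,
    "Donut": 95.3,
}

best = max(top10.values())
worst = min(top10.values())

# Each score as a percentage of the best score, rounded to one decimal.
pct_of_best = {name: round(score / best * 100.0, 1) for name, score in top10.items()}

# Spread between the best and the tenth-ranked model, in accuracy points.
score_range = round(best - worst, 1)

for name, pct in pct_of_best.items():
    print(f"{name}: {pct}%")
print("Score range:", score_range)
```

Because every model in the top 10 is within 2.4 points of the best, the "% of best" column compresses the differences into a narrow 97.5% to 100% band; the raw scores are the more informative comparison.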

accuracy (Primary)

| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | EAML | 97.7 | | May 2023 |
| 2 | Cross-Modal | 97.05 | Visual and Textual Deep Feature Fusion for Document Image Classification | Jun 2020 |
| 3 | DocFormerBASE | 96.17 | | Jun 2021 |
| 4 | LayoutLMV3Large | 95.93 | | Apr 2022 |
| 5 | LiLT[EN-R]BASE | 95.68 | | Feb 2022 |
| 6 | LayoutLMv2LARGE | 95.64 | | Dec 2020 |
| 7 | TILT-Large | 95.52 | | Feb 2021 |
| 8 | DocFormer large | 95.5 | | Jun 2021 |
| 9 | LayoutLMv3BASE | 95.44 | | Apr 2022 |
| 10 | Donut | 95.3 | | Nov 2021 |
| 11 | TILT-Base | 95.25 | | Feb 2021 |
| 12 | LayoutLMv2BASE | 95.25 | | Dec 2020 |
| 13 | LayoutXLM | 95.21 | | Apr 2021 |
| 14 | StrucTexTv2 (large) | 94.62 | | Mar 2023 |
| 15 | Pre-trained LayoutLM | 94.42 | | Dec 2019 |
| 16 | DoPTA | 94.12 | | Dec 2024 |
| 17 | DocXClassifier-B | 94 | DocXClassifier: High Performance Explainable Deep Network for Document Image Classification; Code | Mar 2022 |
| 18 | StrucTexTv2 (small) | 93.4 | | Mar 2023 |
| 19 | VLCDoC | 93.19 | | May 2022 |
| 20 | TransferDoc | 93.18 | | Sep 2023 |
| 21 | Multimodal (ResNet50) | 92.7 | | Jan 2023 |
| 22 | DiT-L | 92.69 | | Mar 2022 |
| 23 | Pre-trained EfficientNet | 92.31 | | Jun 2020 |
| 24 | Transfer Learning from VGG16 trained on Imagenet | 92.21 | | Jan 2018 |
| 25 | Multimodal (MobileNetV2) | 92.2 | | Jan 2023 |
| 26 | DiT-B | 92.11 | | Mar 2022 |
| 27 | BEiT-B | 91.09 | | Jun 2021 |
| 28 | Transfer Learning from AlexNet, VGG-16, GoogLeNet and ResNet50 | 90.97 | | Apr 2017 |
| 29 | AlexNet + spatial pyramidal pooling + image resizing | 90.94 | | Aug 2017 |
| 30 | DeiT-B (Open Source; Meta) | 90.32 | | Dec 2020 |
| 31 | Roberta base | 90.06 | | Jul 2019 |

Related Papers (23)

Multimodal Side-Tuning for Document Classification
Jan 2023. Models: Multimodal (ResNet50), Multimodal (MobileNetV2)

DocFormer: End-to-End Transformer for Document Understanding
Jun 2021. Models: DocFormerBASE, DocFormer large

Analysis of Convolutional Neural Networks for Document Image Classification
Aug 2017. Models: AlexNet + spatial pyramidal pooling + image resizing

Other Document Image Classification Datasets