Document Image Classification
Classifying documents by type or category
Document image classification assigns a scanned or photographed document to a type or category. The standard benchmarks used to evaluate models are listed below, each with the best result reported for it.
Benchmarks & SOTA
Datasets are from Papers With Code.

Benchmark                State of the Art                             Score  Metric
rvl-cdip                 EAML                                         97.7   accuracy
tobacco-3482             DocXClassifier-L                             95.57  accuracy
noisy-bangla-numeral     PCGAN-CHAR                                   96.68  accuracy
noisy-bangla-characters  PCGAN-CHAR                                   89.54  accuracy
noisy-mnist              PCGAN-CHAR                                   98.43  accuracy
aip                      ResNet-RS (ResNet-200 + RS training tricks)  83.4   top-1-accuracy-verb
n-mnist                  Pixel-level RC                               97.62  accuracy
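Most of the benchmarks above report accuracy, the share of test documents whose predicted class matches the ground-truth label. A minimal sketch of that calculation in Python follows. The function name `accuracy` and the sample labels are hypothetical and chosen only for illustration, and individual benchmarks may apply their own evaluation protocols (for example, the top-1-accuracy-verb metric listed for aip).

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the percentage of predictions that match the ground-truth labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted class equals the true class.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical labels for illustration only (not taken from any benchmark above).
actual = ["letter", "memo", "email", "form"]
predicted = ["letter", "memo", "form", "form"]
score = accuracy(predicted, actual)  # 3 of 4 correct
```

Scores are expressed here as percentages to match the scale of the listed results; a library routine that returns a fraction between 0 and 1 would need to be multiplied by 100 for comparison.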
Related Tasks
General OCR Capabilities
Comprehensive benchmarks covering multiple aspects of OCR performance.
Polish OCR
OCR for the Polish language, including historical documents, Gothic fonts, and diacritic recognition.
Image Classification
Categorizing images into predefined classes (ImageNet, CIFAR).
Object Detection
Locating and classifying objects in images (COCO, Pascal VOC).