Scene Text Recognition | 2020 | en
SVTP
Dataset from Papers With Code
Metrics: accuracy, CER, WER, F1
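The page lists these metrics without defining them. The sketch below uses the definitions common in text recognition work, which is an assumption rather than something this page states: word accuracy is the percentage of exact string matches, and CER and WER are Levenshtein edit distance divided by reference length, counted over characters and whitespace-separated words respectively. The helper names are hypothetical, and F1 is left out because its definition varies by benchmark.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(preds, targets):
    """Percentage of predictions that match their target exactly."""
    correct = sum(p == t for p, t in zip(preds, targets))
    return 100.0 * correct / len(targets)


def cer(preds, targets):
    """Character error rate: character edits per reference character."""
    errors = sum(edit_distance(p, t) for p, t in zip(preds, targets))
    return errors / sum(len(t) for t in targets)


def wer(preds, targets):
    """Word error rate: word edits per reference word."""
    errors = sum(edit_distance(p.split(), t.split())
                 for p, t in zip(preds, targets))
    return errors / sum(len(t.split()) for t in targets)


if __name__ == "__main__":
    preds = ["stop", "exit"]
    targets = ["stop", "exlt"]
    print(word_accuracy(preds, targets), cer(preds, targets))
```

Published results may also normalize case or strip punctuation before comparing; that step is not shown here.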
Current State of the Art
DTrOCR 105M
Unknown
98.6
accuracy
accuracy Progress Over Time
Showing 5 breakthroughs from Mar 2021 to Aug 2023
Key Milestones
Nov 2021
MATRN
From paper: Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
90.6
+1.2%
Aug 2023
DTrOCR 105M (Current SOTA)
From paper: DTrOCR: Decoder-only Transformer for Optical Character Recognition
98.6
+0.3%
Total Improvement
10.2%
Time Span
2y 5m
Breakthroughs
5
Current SOTA
98.6
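The percentages above appear to be relative gains rather than differences in accuracy points. The sketch below checks that reading. It assumes the baseline is the earliest entry in the window (ABINet-LV at 89.5, Mar 2021) and that each milestone is compared against the best score before it; the page does not state either, and `relative_gain` is a hypothetical helper.

```python
def relative_gain(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Assumed baseline: ABINet-LV, 89.5 (Mar 2021). Current SOTA: DTrOCR 105M, 98.6.
total = relative_gain(89.5, 98.6)        # "Total Improvement"
matrn_step = relative_gain(89.5, 90.6)   # MATRN milestone
dtrocr_step = relative_gain(98.3, 98.6)  # DTrOCR over MGP-STR

# Time span from Mar 2021 to Aug 2023, as whole months.
months = (2023 - 2021) * 12 + (8 - 3)
years, rem_months = divmod(months, 12)

if __name__ == "__main__":
    print(f"total {total:.1f}%, MATRN {matrn_step:.1f}%, DTrOCR {dtrocr_step:.1f}%")
    print(f"time span {years}y {rem_months}m")
```

Under these assumptions the results round to the 10.2%, +1.2%, +0.3%, and 2y 5m shown on the page. In absolute terms the total gain is 9.1 accuracy points.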
Top Models Performance Comparison
Top 10 models ranked by accuracy
Best Score
98.6
Top Model
DTrOCR 105M
Models Compared
10
Score Range
7.0
accuracy (Primary)
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | DTrOCR 105M | 98.6 | | Aug 2023 |
| 2 | MGP-STR | 98.3 | | Sep 2022 |
| 3 | CLIP4STR-L (DataComp-1B) | 98.1 | | May 2023 |
| 4 | CLIP4STR-L | 97.4 | | May 2023 |
| 5 | CLIP4STR-B | 97.2 | | May 2023 |
| 6 | PARSeq | 96.9 | Open source | Jul 2022 |
| 7 | CPPD | 96.7 | | Jul 2023 |
| 8 | CCD-ViT-Base | 96.1 | | Nov 2022 |
| 9 | CCD-ViT-Small | 92.7 | | Nov 2022 |
| 10 | CCD-ViT-Tiny | 91.6 | | Nov 2022 |
| 11 | S-GTR | 90.6 | | Dec 2021 |
| 12 | MATRN | 90.6 | | Nov 2021 |
| 13 | SIGA_T | 90.5 | | Mar 2022 |
| 14 | CDistNet (Ours) | 89.77 | | Nov 2021 |
| 15 | ABINet-LV | 89.5 | Fang et al.; open source | Mar 2021 |
| 16 | DiffusionSTR | 89.2 | | Jun 2023 |
| 17 | DPAN | 89 | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | Aug 2021 |
| 18 | TrOCR-large 558M | 88.1 | | Sep 2021 |
| 19 | TrOCR-base 334M | 86.9 | | Sep 2021 |
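The "5 breakthroughs" count and the "Score Range" of 7.0 can be reproduced from the table under one plausible reading, sketched below. The rule is an assumption, not something the page documents: keep the best score per month, then count the months whose best strictly exceeds every earlier month. The function name is hypothetical; scores and dates are copied from the table.

```python
# (model, accuracy, (year, month)) copied from the table above.
ROWS = [
    ("DTrOCR 105M", 98.6, (2023, 8)),
    ("MGP-STR", 98.3, (2022, 9)),
    ("CLIP4STR-L (DataComp-1B)", 98.1, (2023, 5)),
    ("CLIP4STR-L", 97.4, (2023, 5)),
    ("CLIP4STR-B", 97.2, (2023, 5)),
    ("PARSeq", 96.9, (2022, 7)),
    ("CPPD", 96.7, (2023, 7)),
    ("CCD-ViT-Base", 96.1, (2022, 11)),
    ("CCD-ViT-Small", 92.7, (2022, 11)),
    ("CCD-ViT-Tiny", 91.6, (2022, 11)),
    ("S-GTR", 90.6, (2021, 12)),
    ("MATRN", 90.6, (2021, 11)),
    ("SIGA_T", 90.5, (2022, 3)),
    ("CDistNet (Ours)", 89.77, (2021, 11)),
    ("ABINet-LV", 89.5, (2021, 3)),
    ("DiffusionSTR", 89.2, (2023, 6)),
    ("DPAN", 89.0, (2021, 8)),
    ("TrOCR-large 558M", 88.1, (2021, 9)),
    ("TrOCR-base 334M", 86.9, (2021, 9)),
]


def breakthroughs(rows):
    """Months whose best score strictly beats all earlier months (assumed rule)."""
    best_per_month = {}
    for name, score, when in rows:
        if when not in best_per_month or score > best_per_month[when][1]:
            best_per_month[when] = (name, score)

    milestones, best = [], float("-inf")
    for when in sorted(best_per_month):
        name, score = best_per_month[when]
        if score > best:  # ties with the running best do not count
            milestones.append((when, name, score))
            best = score
    return milestones


# Score range across the ten highest-scoring rows.
top_ten = sorted((score for _, score, _ in ROWS), reverse=True)[:10]
score_range = round(top_ten[0] - top_ten[-1], 1)

if __name__ == "__main__":
    for when, name, score in breakthroughs(ROWS):
        print(when, name, score)
    print("top-10 score range:", score_range)
```

Grouping by month matters here: CDistNet and MATRN are both dated Nov 2021, and counting them separately would give six milestones instead of five.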
Related Papers (13)
DTrOCR: Decoder-only Transformer for Optical Character Recognition
Aug 2023 | Models: DTrOCR 105M
Context Perception Parallel Decoder for Scene Text Recognition
Jul 2023 | Models: CPPD
DiffusionSTR: Diffusion Model for Scene Text Recognition
Jun 2023 | Models: DiffusionSTR
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
May 2023 | Models: CLIP4STR-L (DataComp-1B), CLIP4STR-L, CLIP4STR-B
Self-supervised Character-to-Character Distillation for Text Recognition
Nov 2022 | Models: CCD-ViT-Base, CCD-ViT-Small, CCD-ViT-Tiny
Multi-Granularity Prediction for Scene Text Recognition
Sep 2022 | Models: MGP-STR
Scene Text Recognition with Permuted Autoregressive Sequence Models
Jul 2022 | Models: PARSeq
Self-supervised Implicit Glyph Attention for Text Recognition
Mar 2022 | Models: SIGA_T
CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition
Nov 2021 | Models: CDistNet (Ours)
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Sep 2021 | Models: TrOCR-large 558M, TrOCR-base 334M