Scene Text Recognition · 2020 · en
svtp
Dataset from Papers With Code
Metrics: accuracy, cer, wer, f1
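The page does not define these metrics. The sketch below assumes the conventions commonly used in scene text recognition: accuracy is the percentage of exact word matches, CER is total character edit distance divided by total reference characters, and WER is the same ratio over whitespace-separated tokens. F1 is left out because its definition varies between benchmarks. All function names are illustrative, not taken from any evaluation toolkit.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Percentage of predictions that match the reference exactly."""
    correct = sum(r == h for r, h in zip(refs, hyps))
    return 100.0 * correct / len(refs)


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Character error rate: character edits / reference characters."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)


def wer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Word error rate: token edits / reference tokens."""
    edits = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    return edits / sum(len(r.split()) for r in refs)


# Assumption: toy data for illustration, not drawn from svtp.
refs = ["hotel", "open", "exit"]
hyps = ["hotel", "opem", "exit"]
print(word_accuracy(refs, hyps), cer(refs, hyps), wer(refs, hyps))
```

The functions assume non-empty reference lists. Published results often normalize case and strip punctuation before comparison, which this sketch does not do.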
Current State of the Art
DTrOCR 105M
Unknown
98.6
accuracy
accuracy Progress Over Time
Showing 4 breakthroughs from Aug 2021 to Aug 2023
Key Milestones
Aug 2021
DPAN
From paper: Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition
89.0
Nov 2021
MATRN
From paper: Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
90.6
+1.8%
Sep 2022
MGP-STR
From paper: Multi-Granularity Prediction for Scene Text Recognition
98.3
+8.5%
Aug 2023
DTrOCR 105M (Current SOTA)
From paper: DTrOCR: Decoder-only Transformer for Optical Character Recognition
98.6
+0.3%
Total Improvement
10.8%
Time Span
2y
Breakthroughs
4
Current SOTA
98.6
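The percentages on this page appear to be relative gains over the previous best score, not absolute point differences: 90.6 against 89.0 gives +1.8%, and 98.6 against 89.0 gives 10.8%. The sketch below reproduces that arithmetic under this assumption. The helper name `relative_gain` is illustrative, and the 98.3 used for the final step is the MGP-STR score listed in the comparison table.

```python
def relative_gain(previous: float, current: float) -> float:
    """Relative improvement of `current` over `previous`, in percent."""
    return (current - previous) / previous * 100.0


# Assumption: each milestone is compared against the best score before it.
print(round(relative_gain(89.0, 90.6), 1))  # DPAN -> MATRN
print(round(relative_gain(98.3, 98.6), 1))  # MGP-STR -> DTrOCR 105M
print(round(relative_gain(89.0, 98.6), 1))  # first milestone -> current SOTA
```

In absolute terms the same span is 9.6 accuracy points (98.6 minus 89.0), so the two readings should not be mixed when comparing against other leaderboards.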
Top Models Performance Comparison
Top 10 models ranked by accuracy
Best Score
98.6
Top Model
DTrOCR 105M
Models Compared
10
Score Range
8.0
accuracy (Primary)
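| # | Model | Score | Paper | Date |
|---|---|---|---|---|
| 1 | DTrOCR 105M | 98.6 | DTrOCR: Decoder-only Transformer for Optical Character Recognition | Aug 2023 |
| 2 | MGP-STR | 98.3 | Multi-Granularity Prediction for Scene Text Recognition | Sep 2022 |
| 3 | CLIP4STR-L (DataComp-1B) | 98.1 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | May 2023 |
| 4 | CLIP4STR-L | 97.4 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | May 2023 |
| 5 | CLIP4STR-B* | 97.2 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | May 2023 |
| 6 | CPPD | 96.7 | Context Perception Parallel Decoder for Scene Text Recognition | Jul 2023 |
| 7 | CCD-ViT-Base | 96.1 | Self-supervised Character-to-Character Distillation for Text Recognition | Nov 2022 |
| 8 | CCD-ViT-Small | 92.7 | Self-supervised Character-to-Character Distillation for Text Recognition | Nov 2022 |
| 9 | CCD-ViT-Tiny | 91.6 | Self-supervised Character-to-Character Distillation for Text Recognition | Nov 2022 |
| 10 | S-GTR | 90.6 | | Dec 2021 |
| 11 | MATRN | 90.6 | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | Nov 2021 |
| 12 | SIGA_T | 90.5 | Self-supervised Implicit Glyph Attention for Text Recognition | Mar 2022 |
| 13 | CDistNet (Ours) | 89.77 | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | Nov 2021 |
| 14 | DiffusionSTR | 89.2 | DiffusionSTR: Diffusion Model for Scene Text Recognition | Jun 2023 |
| 15 | DPAN | 89.0 | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | Aug 2021 |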
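The summary figures reported for the comparison (best score 98.6, 10 models compared, score range 8.0) match the ten highest-scoring rows of the table. A minimal sketch of that calculation, assuming the range is the best score minus the tenth-best score; the function name `summarize_top` is illustrative.

```python
# (model, accuracy) pairs copied from the table above.
scores = [
    ("DTrOCR 105M", 98.6), ("MGP-STR", 98.3),
    ("CLIP4STR-L (DataComp-1B)", 98.1), ("CLIP4STR-L", 97.4),
    ("CLIP4STR-B*", 97.2), ("CPPD", 96.7), ("CCD-ViT-Base", 96.1),
    ("CCD-ViT-Small", 92.7), ("CCD-ViT-Tiny", 91.6), ("S-GTR", 90.6),
    ("MATRN", 90.6), ("SIGA_T", 90.5), ("CDistNet (Ours)", 89.77),
    ("DiffusionSTR", 89.2), ("DPAN", 89.0),
]


def summarize_top(entries, k=10):
    """Rank by score (descending) and summarize the top k entries."""
    # sorted() is stable, so tied scores keep their original order.
    ranked = sorted(entries, key=lambda item: item[1], reverse=True)
    top = ranked[:k]
    best_model, best_score = top[0]
    return {
        "top_model": best_model,
        "best_score": best_score,
        "models_compared": len(top),
        # Assumption: range = best score minus the k-th best score.
        "score_range": round(best_score - top[-1][1], 1),
    }


print(summarize_top(scores))
```

S-GTR and MATRN tie at 90.6, so which of the two lands in tenth place depends only on input order and does not change the reported range.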
Related Papers (10)
DTrOCR: Decoder-only Transformer for Optical Character Recognition
Aug 2023 · Models: DTrOCR 105M
Context Perception Parallel Decoder for Scene Text Recognition
Jul 2023 · Models: CPPD
DiffusionSTR: Diffusion Model for Scene Text Recognition
Jun 2023 · Models: DiffusionSTR
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
May 2023 · Models: CLIP4STR-L (DataComp-1B), CLIP4STR-L, CLIP4STR-B*
Self-supervised Character-to-Character Distillation for Text Recognition
Nov 2022 · Models: CCD-ViT-Base, CCD-ViT-Small, CCD-ViT-Tiny
Multi-Granularity Prediction for Scene Text Recognition
Sep 2022 · Models: MGP-STR
Self-supervised Implicit Glyph Attention for Text Recognition
Mar 2022 · Models: SIGA_T
CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition
Nov 2021 · Models: CDistNet (Ours)