Scene Text Recognition
iiit5k
Dataset from Papers With Code
Metrics: accuracy, CER, WER, F1
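The page lists these metrics without defining them. As a minimal sketch, assuming the conventional definitions (word accuracy as the share of exact-match predictions, CER as character-level edit distance divided by reference length), two of them could be computed as follows. The function names and sample strings are illustrative only and do not come from the benchmark.

```python
def edit_distance(a, b) -> int:
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(preds, targets) -> float:
    """Percentage of word images whose prediction matches the label exactly."""
    correct = sum(p == t for p, t in zip(preds, targets))
    return 100.0 * correct / len(targets)


def cer(preds, targets) -> float:
    """Character error rate: total character edits / total reference characters."""
    edits = sum(edit_distance(p, t) for p, t in zip(preds, targets))
    return edits / sum(len(t) for t in targets)


# Hypothetical predictions and labels, for illustration only.
preds = ["hello", "w0rld", "text"]
targets = ["hello", "world", "test"]
print(f"accuracy: {word_accuracy(preds, targets):.2f}")
print(f"CER: {cer(preds, targets):.4f}")
```

WER follows the same pattern with `edit_distance` applied to lists of words instead of characters. For single-word crops it is closely related to word-level accuracy.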
Current State of the Art
CLIP4STR-L (DataComp-1B)
Unknown
99.6
accuracy
accuracy Progress Over Time
Showing 5 breakthroughs from Aug 2021 to May 2023
Key Milestones
Aug 2021
DPAN
From paper: Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition
96.2
Nov 2021
MATRN
From paper: Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
96.6
+0.4%
Dec 2021
S-GTR
From paper: Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition
97.5
+0.9%
Sep 2022
MGP-STR
From paper: Multi-Granularity Prediction for Scene Text Recognition
98.8
+1.3%
May 2023
CLIP4STR-L (DataComp-1B) (Current SOTA)
From paper: CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
99.6
+0.8%
Total Improvement
3.5%
Time Span
1y 9m
Breakthroughs
5
Current SOTA
99.6
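The summary figures above can be reproduced with simple arithmetic on the milestone scores. The sketch below assumes the percentages are relative improvements over the previous score. The page does not state its formula, so this is an inference that happens to match the listed values. The helper name `relative_improvement` is hypothetical.

```python
def relative_improvement(old: float, new: float) -> float:
    """Percentage gain of `new` over `old`, relative to `old` (assumed formula)."""
    return (new - old) / old * 100.0


# Consecutive milestone scores as listed on the page.
steps = [("DPAN", 96.2), ("MATRN", 96.6), ("S-GTR", 97.5)]
for (_, old), (name, new) in zip(steps, steps[1:]):
    print(f"{name}: +{relative_improvement(old, new):.1f}%")

# Total improvement from the first milestone (96.2) to the current SOTA (99.6).
total = relative_improvement(96.2, 99.6)
print(f"Total improvement: {total:.1f}%")

# Time span from Aug 2021 to May 2023, expressed in years and months.
months = (2023 - 2021) * 12 + (5 - 8)
years, rem = divmod(months, 12)
print(f"Time span: {years}y {rem}m")
```

Under this assumption the total is about 3.5% relative, whereas the absolute gain is 3.4 accuracy points. The two readings differ slightly, which is why the formula is worth stating.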
Top Models Performance Comparison
Top 10 models ranked by accuracy
Best Score
99.6
Top Model
CLIP4STR-L (DataC...
Models Compared
10
Score Range
2.1
accuracy (Primary)
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | CLIP4STR-L (DataComp-1B) | 99.6 | | May 2023 |
| 2 | DTrOCR 105M | 99.6 | | Aug 2023 |
| 3 | CLIP4STR-L | 99.5 | | May 2023 |
| 4 | CLIP4STR-B (DataComp-1B) | 99.5 | | May 2023 |
| 5 | CPPD | 99.3 | | Jul 2023 |
| 6 | CLIP4STR-B* | 99.2 | | May 2023 |
| 7 | MGP-STR | 98.8 | | Sep 2022 |
| 8 | CCD-ViT-Small(ARD_2.8M) | 98 | | Nov 2022 |
| 9 | CCD-ViT-Base(ARD_2.8M) | 98 | | Nov 2022 |
| 10 | S-GTR | 97.5 | | Dec 2021 |
| 11 | DiffusionSTR | 97.3 | | Jun 2023 |
| 12 | CCD-ViT-Tiny(ARD_2.8M) | 97.1 | | Nov 2022 |
| 13 | SIGA_S | 96.9 | | Mar 2022 |
| 14 | MATRN | 96.6 | | Nov 2021 |
| 15 | CDistNet (Ours) | 96.57 | | Nov 2021 |
| 16 | DPAN | 96.2 | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | Aug 2021 |
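The reported score range of 2.1 is consistent with the gap between the best score and the tenth-ranked score in the table. A minimal sketch of that check, using the scores as listed and assuming the range means best minus worst among the top 10 (the page does not define it):

```python
# (model, accuracy) pairs copied from the leaderboard table, in listed order.
leaderboard = [
    ("CLIP4STR-L (DataComp-1B)", 99.6),
    ("DTrOCR 105M", 99.6),
    ("CLIP4STR-L", 99.5),
    ("CLIP4STR-B (DataComp-1B)", 99.5),
    ("CPPD", 99.3),
    ("CLIP4STR-B*", 99.2),
    ("MGP-STR", 98.8),
    ("CCD-ViT-Small(ARD_2.8M)", 98.0),
    ("CCD-ViT-Base(ARD_2.8M)", 98.0),
    ("S-GTR", 97.5),
    ("DiffusionSTR", 97.3),
    ("CCD-ViT-Tiny(ARD_2.8M)", 97.1),
    ("SIGA_S", 96.9),
    ("MATRN", 96.6),
    ("CDistNet (Ours)", 96.57),
    ("DPAN", 96.2),
]

# Rank by score, highest first. Python's sort is stable, so tied models
# keep the order in which they appear in the table.
ranked = sorted(leaderboard, key=lambda row: row[1], reverse=True)
top10 = ranked[:10]

best_model, best_score = top10[0]
score_range = best_score - top10[-1][1]

print(f"Top model: {best_model} ({best_score})")
print(f"Score range across top 10: {score_range:.1f}")
```

Because the top two entries are tied at 99.6, which one is shown as the top model depends on the tie-breaking rule. The sketch simply preserves the table order.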
Related Papers (10)
DTrOCR: Decoder-only Transformer for Optical Character Recognition
Aug 2023. Models: DTrOCR 105M
Context Perception Parallel Decoder for Scene Text Recognition
Jul 2023. Models: CPPD
DiffusionSTR: Diffusion Model for Scene Text Recognition
Jun 2023. Models: DiffusionSTR
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
May 2023. Models: CLIP4STR-L (DataComp-1B), CLIP4STR-L, CLIP4STR-B (DataComp-1B) +1 more
Self-supervised Character-to-Character Distillation for Text Recognition
Nov 2022. Models: CCD-ViT-Small(ARD_2.8M), CCD-ViT-Base(ARD_2.8M), CCD-ViT-Tiny(ARD_2.8M)
Multi-Granularity Prediction for Scene Text Recognition
Sep 2022. Models: MGP-STR
Self-supervised Implicit Glyph Attention for Text Recognition
Mar 2022. Models: SIGA_S
CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition
Nov 2021. Models: CDistNet (Ours)