Scene Text Recognition | 2020 | en

SVT (Street View Text)

Dataset from Papers With Code

Metrics: accuracy, CER, WER, F1
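The page lists these metrics without defining them. As a minimal sketch, assuming the usual definitions (accuracy as exact word match, CER and WER as edit distance normalized by reference length), they could be computed as follows. The function names and sample strings are illustrative only, F1 is omitted, and any case or punctuation normalization that a given benchmark protocol applies is left out.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def word_accuracy(preds, targets):
    """Percentage of predictions that match the reference exactly."""
    correct = sum(p == t for p, t in zip(preds, targets))
    return 100.0 * correct / len(targets)


def cer(preds, targets):
    """Character error rate: character edits divided by reference characters."""
    edits = sum(edit_distance(p, t) for p, t in zip(preds, targets))
    return edits / sum(len(t) for t in targets)


def wer(preds, targets):
    """Word error rate: word-level edits divided by reference words."""
    edits = sum(edit_distance(p.split(), t.split()) for p, t in zip(preds, targets))
    return edits / sum(len(t.split()) for t in targets)


# Hypothetical predictions and references, one cropped word image each.
preds = ["hotel", "cale", "bank"]
targets = ["hotel", "cafe", "bank"]

print(f"accuracy = {word_accuracy(preds, targets):.1f}")
print(f"CER = {cer(preds, targets):.3f}")
print(f"WER = {wer(preds, targets):.3f}")
```

For single-word crops such as those in this benchmark, WER reduces to the fraction of words recognized incorrectly, which is why accuracy is the headline number.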

Current State of the Art

CLIP4STR-H (DFN-5B)

Unknown

99.1

accuracy

Accuracy Progress Over Time

Showing 13 breakthroughs from Jun 2014 to May 2023

Key Milestones

Date	Model	Paper	Accuracy	Change
Jun 2014	CHAR	Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition	68.0	-
Jul 2015	CRNN	An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition	80.8	+18.8%
Mar 2016	RARE	Robust Scene Text Recognition with Automatic Rectification	81.9	+1.4%
Sep 2016	STAR-Net	STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition	83.6	+2.1%
Jun 2018	ASTER	ASTER: An Attentional Scene Text Recognizer with Flexible Rectification	89.5	+7.1%
Oct 2019	SATRN	On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention	91.3	+2.0%
Mar 2020	SRN	Towards Accurate Scene Text Recognition with Semantic Reasoning Networks	91.5	+0.2%
Jun 2021	RCEED	Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition	91.8	+0.3%
Jul 2021	Yet Another Text Recognizer	Why You Should Try the Real Data for the Scene Text Recognition	94.7	+3.2%
Nov 2021	MATRN	Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features	95.0	+0.3%
Dec 2021	S-GTR	Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition	95.8	+0.8%
Sep 2022	MGP-STR	Multi-Granularity Prediction for Scene Text Recognition	98.6	+2.9%
May 2023	CLIP4STR-H (DFN-5B) (current SOTA)	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model	99.1	+0.5%

Total Improvement

45.7%

Time Span

8y 11m

Breakthroughs

13

Current SOTA

99.1
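The percentage figures attached to the milestones appear to be relative changes over the previous best score, not differences in accuracy points. A short sketch of that arithmetic, using scores from this page (the helper name `relative_gain` is an assumption for illustration):

```python
def relative_gain(old, new):
    """Relative change from old to new, as a percentage of old."""
    return 100.0 * (new - old) / old


# First three milestones and the current best, taken from the page.
milestones = [("CHAR", 68.0), ("CRNN", 80.8), ("RARE", 81.9)]

# Step-by-step gain of each milestone over its predecessor.
steps = [
    (name, round(relative_gain(prev, cur), 1))
    for (_, prev), (name, cur) in zip(milestones, milestones[1:])
]

# Overall gain from the first milestone (68.0) to the current SOTA (99.1).
total = round(relative_gain(68.0, 99.1), 1)

print(steps)
print(total)
```

Under this reading, the total improvement of 45.7% corresponds to an absolute gain of 31.1 accuracy points (68.0 to 99.1).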

Top Models Performance Comparison

Top 10 models ranked by accuracy

Best Score

99.1

Top Model

CLIP4STR-H (DFN-5B)

Models Compared

10

Score Range

3.1

accuracy (Primary)

#	Model	Score	Paper / Code	Date
1	CLIP4STR-H (DFN-5B)	99.1	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model Code	May 2023
2	DTrOCR 105M	98.9	DTrOCR: Decoder-only Transformer for Optical Character Recognition Code	Aug 2023
3	MGP-STR	98.6	Multi-Granularity Prediction for Scene Text Recognition Code	Sep 2022
4	CLIP4STR-L (DataComp-1B)	98.6	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model Code	May 2023
5	CPPD	98.5	Context Perception Parallel Decoder for Scene Text Recognition Code	Jul 2023
6	CLIP4STR-L	98.5	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model Code	May 2023
7	CLIP4STR-B*	98.3	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model Code	May 2023
8	CCD-ViT-Base(ARD_2.8M)	97.8	Self-supervised Character-to-Character Distillation for Text Recognition Code	Nov 2022
9	CCD-ViT-Small(ARD_2.8M)	96.4	Self-supervised Character-to-Character Distillation for Text Recognition Code	Nov 2022
10	CCD-ViT-Tiny(ARD_2.8M)	96.0	Self-supervised Character-to-Character Distillation for Text Recognition Code	Nov 2022
11	S-GTR	95.8	Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition Code	Dec 2021
12	SIGA_T	95.1	Self-supervised Implicit Glyph Attention for Text Recognition Code	Mar 2022
13	MATRN	95.0	Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features Code	Nov 2021
14	Yet Another Text Recognizer	94.7	Why You Should Try the Real Data for the Scene Text Recognition Code	Jul 2021
15	NRTR+TPS++	94.6	TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition Code	May 2023
16	DPAN	93.9	Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition Code	Aug 2021
17	CDistNet	93.82	CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition Code	Nov 2021
18	DiffusionSTR	93.6	DiffusionSTR: Diffusion Model for Scene Text Recognition	Jun 2023
19	RCEED	91.8	Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition Code	Jun 2021
20	SRN	91.5	Towards Accurate Scene Text Recognition with Semantic Reasoning Networks Code	Mar 2020
21	SATRN	91.3	On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention Code	Oct 2019
22	CSTR	90.6	Revisiting Classification Perspective on Scene Text Recognition Code	Feb 2021
23	TextScanner	90.1	TextScanner: Reading Characters in Order for Robust Scene Text Recognition	Dec 2019
24	SEED	89.6	SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition Code	May 2020
25	ASTER	89.5	ASTER: An Attentional Scene Text Recognizer with Flexible Rectification Code	Jun 2018
26	DAN	89.2	Decoupled Attention Network for Text Recognition Code	Dec 2019
27	SAFL	88.6	SAFL: A Self-Attention Scene Text Recognizer with Focal Loss Code	Jan 2022
28	ViTSTR	87.7	Vision Transformer for Fast and Efficient Scene Text Recognition Code	May 2021
29	Baek et al.	87.5	What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis Code	Apr 2019
30	CA-FCN	86.4	Scene Text Recognition from Two-Dimensional Perspective	Sep 2018
31	SAR	84.5	Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition Code	Nov 2018
32	STAR-Net	83.6	STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition Code	Sep 2016
33	RARE	81.9	Robust Scene Text Recognition with Automatic Rectification Code	Mar 2016
34	CRNN	80.8	An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition Code	Jul 2015
35	CHAR	68.0	Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition Code	Jun 2014
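The summary figures shown above the table (best score, top model, score range) seem to be derived from the ten highest-ranked rows only. A minimal sketch of that derivation, assuming the range is simply the highest minus the lowest of those ten scores; the `summarize` helper and its key names are hypothetical:

```python
# Ten highest-ranked rows of the table as (model, accuracy) pairs.
TOP10 = [
    ("CLIP4STR-H (DFN-5B)", 99.1),
    ("DTrOCR 105M", 98.9),
    ("MGP-STR", 98.6),
    ("CLIP4STR-L (DataComp-1B)", 98.6),
    ("CPPD", 98.5),
    ("CLIP4STR-L", 98.5),
    ("CLIP4STR-B*", 98.3),
    ("CCD-ViT-Base(ARD_2.8M)", 97.8),
    ("CCD-ViT-Small(ARD_2.8M)", 96.4),
    ("CCD-ViT-Tiny(ARD_2.8M)", 96.0),
]


def summarize(rows):
    """Derive the summary statistics from a list of (model, score) rows."""
    scores = [score for _, score in rows]
    top_model, best = max(rows, key=lambda row: row[1])
    return {
        "best_score": best,
        "top_model": top_model,
        "models_compared": len(rows),
        # Rounded to one decimal to avoid floating-point noise.
        "score_range": round(max(scores) - min(scores), 1),
    }


summary = summarize(TOP10)
print(summary)
```

With these rows the range works out to 99.1 minus 96.0, which matches the 3.1 reported on the page.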