
Scene Text Recognition.

Recognizing text in natural scene images

Datasets: 11
Results: 127
Canonical metric: accuracy
§ 02 · Canonical benchmark

The reference dataset.

cute80

Dataset from Papers With Code

Primary metric: accuracy
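
For orientation, here is a minimal sketch of how accuracy is commonly computed for scene text recognition: the percentage of cropped word images whose predicted string exactly matches the ground truth. This page does not state its evaluation protocol, so the normalization (lowercasing, keeping only ASCII letters and digits) and the helper names `normalize` and `word_accuracy` are illustrative assumptions, not the leaderboard's actual scoring code.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only ASCII letters and digits.

    Assumption: a common case-insensitive, alphanumeric-only convention;
    individual papers may evaluate differently.
    """
    return re.sub(r"[^0-9a-z]", "", text.lower())


def word_accuracy(predictions, targets, normalize_text=True):
    """Return the percentage of predictions that exactly match their target."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = 0
    for pred, target in zip(predictions, targets):
        if normalize_text:
            pred, target = normalize(pred), normalize(target)
        # A word counts only if the whole string matches.
        correct += pred == target
    return 100.0 * correct / len(targets)


if __name__ == "__main__":
    preds = ["Hello", "w0rld!", "cat"]
    truth = ["hello", "w0rld", "dog"]
    print(round(word_accuracy(preds, truth), 1))  # 66.7
```

Because a single wrong character fails the whole word, small differences in normalization can shift scores noticeably on a small benchmark, which is worth keeping in mind when comparing entries that sit within a few tenths of a point of each other.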
§ 03 · Top 10

Leading models.

Leading models on cute80.

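| # | Model | Accuracy | Year | Source |
|---|-------|----------|------|--------|
| 1 | CPPD | 99.7 | 2023 | paper |
| 2 | CLIP4STR-L (DataComp-1B) | 99.7 | 2023 | paper |
| 3 | MGP-STR | 99.3 | 2022 | paper |
| 4 | CLIP4STR-B | 99.3 | 2023 | paper |
| 5 | DTrOCR 105M | 99.1 | 2023 | paper |
| 6 | CLIP4STR-L | 99.0 | 2023 | paper |
| 7 | PARSeq | 98.6 | 2026 | paper |
| 8 | CCD-ViT-Small(ARD_2.8M) | 98.3 | 2022 | paper |
| 9 | CCD-ViT-Base(ARD_2.8M) | 98.3 | 2022 | paper |
| 10 | CCD-ViT-Tiny(ARD_2.8M) | 95.8 | 2022 | paper |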

§ 04 · All datasets

Tracked datasets.

11 datasets tracked for this task.

| Dataset | Results | Metric | Top model | Top score |
|---------|---------|--------|-----------|-----------|
| cute80 (canonical) | 20 | accuracy | CPPD | 99.7 |
| svt | 40 | accuracy | CLIP4STR-H (DFN-5B) | 99.1 |
| iiit5k | 21 | accuracy | CLIP4STR-L (DataComp-1B) | 99.6 |
| svtp | 19 | accuracy | DTrOCR 105M | 98.6 |
| icdar-2003 | 12 | accuracy | Yet Another Text Recognizer | 97.1 |
| wost | 5 | accuracy | CLIP4STR-H (DFN-5B) | 90.9 |
| host | 3 | accuracy | CLIP4STR-L | 82.7 |
| uber-text | 3 | accuracy | CLIP4STR-L (DataComp-1B) | 92.2 |
| msda | 2 | accuracy | MetaSelf-Learning | 42.0 |
| ic13 | 1 | accuracy | ABINet-LV+TPS++ | 97.8 |
| svt-p | 1 | accuracy | ABINet-LV+TPS++ | 89.6 |
§ 05 · Related tasks

Other tasks in Computer Vision.

Document Image Classification
Document Layout Analysis
Document Parsing
Document Understanding
General OCR Capabilities
Handwriting Recognition
Image Feature Extraction
Image-to-3D