Scene Text Recognition
Recognizing text in natural scene images
Scene text recognition reads the text content from cropped images of text regions detected in natural scenes. It must handle the diverse fonts, perspective distortion, partial occlusion, and variable illumination that make scene text much harder to read than printed-document OCR. Modern methods (ABINet, PARSeq, MAERec) achieve 97%+ accuracy on standard benchmarks, but irregular real-world text, and multilingual text in particular, remains challenging.
History
2003: ICDAR scene text recognition competitions begin; early methods use HOG + SVM character classification
2015: CRNN (Shi et al.) combines CNN feature extraction with BiLSTM sequence modeling and CTC loss — becomes the standard architecture for years
2016–2018: Attention-based methods (RARE, ASTER) add spatial transformer networks (STN) to rectify distorted text before recognition
2019: MORAN and ESIR improve text rectification, pushing accuracy on curved-text benchmarks significantly
2021: ABINet introduces autonomous, bidirectional, and iterative language modeling into scene text recognition, using linguistic context to correct visual errors
2022: PARSeq (Bautista & Atienza) uses permutation language modeling — reading text in multiple orders during training — achieving 97%+ on standard benchmarks
2023: MAERec applies masked autoencoding to text recognition pretraining, improving performance on irregular and low-quality text
2023: CLIP4STR leverages CLIP's visual-linguistic pretraining for text recognition, bridging scene understanding and reading
2023: Union14M benchmark (Jiang et al.) provides challenging real-world evaluation; end-to-end spotters eliminate the separate detection/recognition split
How Scene Text Recognition Works
Text Rectification (Optional)
A Spatial Transformer Network (STN) or Thin Plate Spline (TPS) transformation warps curved or distorted text to a roughly horizontal, rectangular shape. This preprocessing step significantly helps recognition of curved text.
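Full TPS rectification fits learned control points, but the underlying primitive is the same differentiable grid sample an STN performs: map each output pixel back into the source image and interpolate. A minimal NumPy sketch of that sampling step, shown here with a plain affine warp (the function names are invented for illustration; a real STN learns the transform parameters from the image):

```python
import numpy as np

def bilinear_sample(img, xs, ys):
    """Sample img (H, W) at float coords (xs, ys) with bilinear interpolation."""
    h, w = img.shape
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    dx, dy = xs - x0, ys - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0] +
            dx * (1 - dy) * img[y0, x0 + 1] +
            (1 - dx) * dy * img[y0 + 1, x0] +
            dx * dy * img[y0 + 1, x0 + 1])

def affine_rectify(img, theta, out_h, out_w):
    """STN-style grid sample: warp img with a 2x3 affine matrix theta.
    Grid coordinates are normalized to [-1, 1], as in a spatial transformer."""
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    xn = 2 * xs / (out_w - 1) - 1            # normalized target grid
    yn = 2 * ys / (out_h - 1) - 1
    # map each target pixel back into source-image coordinates
    src_x = theta[0, 0] * xn + theta[0, 1] * yn + theta[0, 2]
    src_y = theta[1, 0] * xn + theta[1, 1] * yn + theta[1, 2]
    h, w = img.shape
    px = (src_x + 1) * (w - 1) / 2
    py = (src_y + 1) * (h - 1) / 2
    return bilinear_sample(img, px, py)

# Identity transform reproduces the input; TPS replaces the affine map
# with a thin-plate-spline warp fitted to detected control points.
img = np.arange(12.0).reshape(3, 4)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
out = affine_rectify(img, identity, 3, 4)
```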
Feature Encoding
A CNN (typically a ResNet variant such as ResNet-45) or a Vision Transformer (ViT) encodes the rectified text image into a sequence of feature vectors — one per vertical slice or patch. The encoder must capture character shapes, stroke patterns, and contextual cues.
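The key structural point is the shape transformation: a fixed-height crop becomes a left-to-right sequence of feature vectors for the decoder. A shape-only toy sketch (the pooling and random projection here are stand-ins; `encode_to_sequence` and its parameters are invented, and a real encoder learns its features):

```python
import numpy as np

def encode_to_sequence(img, channels=512, width_stride=4):
    """Toy stand-in for a CRNN-style encoder: collapse the height axis and
    downsample the width, yielding one feature vector per vertical slice."""
    h, w = img.shape
    t = w // width_stride
    # group the width into t slices of width_stride consecutive pixels
    slices = img[:, : t * width_stride].reshape(h, t, width_stride)
    pooled = slices.mean(axis=(0, 2))            # (T,) crude per-slice statistic
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((1, channels))    # fake 1 -> channels projection
    return pooled[:, None] * proj                # (T, channels)

# A 32x100 grayscale crop becomes 25 time steps of 512-d features.
feats = encode_to_sequence(np.zeros((32, 100)))
print(feats.shape)  # (25, 512)
```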
Sequence Decoding
CTC decoder: predicts a character distribution at each time step independently, then collapses repeated labels and removes blanks.
Attention decoder: generates characters autoregressively, attending to different spatial positions for each output character.
PARSeq: predicts all positions in parallel, made robust by permutation-order training.
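The CTC collapse rule is simple enough to sketch directly. A minimal greedy decoder (the charset and helper name are illustrative; beam search over the full distribution is used when a language model is attached):

```python
import numpy as np

# Hypothetical charset; index 0 plays the CTC blank.
CHARSET = "-abcdefghijklmnopqrstuvwxyz0123456789"

def ctc_greedy_decode(logits):
    """Greedy CTC decoding: argmax per time step, merge adjacent repeats,
    then drop blanks. logits: (T, num_classes) array of scores."""
    best = logits.argmax(axis=1)                   # best class per time step
    collapsed = [k for i, k in enumerate(best)
                 if i == 0 or k != best[i - 1]]    # merge repeated labels
    return "".join(CHARSET[k] for k in collapsed if k != 0)  # remove blanks

# Blanks let CTC emit genuine double letters: "l-l" survives the merge.
steps = [CHARSET.index(c) for c in "hh-ee-ll-l-oo"]
logits = np.eye(len(CHARSET))[steps]               # one-hot "scores"
print(ctc_greedy_decode(logits))  # hello
```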
Language Modeling
ABINet and successors integrate explicit language models that refine character predictions using linguistic context ('teh' → 'the'). This corrects visually ambiguous characters (l vs. I vs. 1) using word-level knowledge.
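ABINet's language model is a learned bidirectional transformer that iteratively refines the visual branch's character probabilities. As a very crude stand-in for the same idea — snapping a noisy visual reading to a linguistically plausible word — a lexicon-nearest-match sketch (the lexicon and `lm_correct` helper are invented for illustration):

```python
from difflib import get_close_matches

# Toy lexicon standing in for a learned language model's word knowledge.
LEXICON = ["the", "shop", "open", "sale", "street", "coffee"]

def lm_correct(word, cutoff=0.6):
    """Snap a visually-decoded word to the closest lexicon entry,
    mimicking (very loosely) how word-level context fixes visual errors."""
    match = get_close_matches(word.lower(), LEXICON, n=1, cutoff=cutoff)
    return match[0] if match else word

print(lm_correct("teh"))     # the
print(lm_correct("c0ffee"))  # coffee  (visually ambiguous 0 vs o)
```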
Evaluation
Word-level accuracy on standard benchmarks: IIIT5K, SVT, IC13, IC15, SVTP, CUTE80. Modern SOTA exceeds 97% on most of these. Union14M provides a harder evaluation with 3.2M real-world samples including curved, occluded, and low-resolution text.
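On these benchmarks, predictions are commonly compared case-insensitively over the 36-character lowercase alphanumeric set. A minimal sketch of that word-accuracy metric (the normalization shown reflects that common convention; exact filtering rules vary by benchmark):

```python
import re

def normalize(text):
    """Common STR evaluation filter: lowercase, keep only [a-z0-9]."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def word_accuracy(preds, gts):
    """Fraction of crops whose full predicted word matches the ground
    truth after case/punctuation normalization."""
    hits = sum(normalize(p) == normalize(g) for p, g in zip(preds, gts))
    return hits / len(gts)

preds = ["Hello", "w0rld", "Cafe!"]
gts   = ["hello", "world", "cafe"]
print(word_accuracy(preds, gts))  # 2/3: "w0rld" != "world"
```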
Current Landscape
Scene text recognition in 2025 has reached high maturity on standard benchmarks — the top methods (PARSeq, ABINet, CLIP4STR) all exceed 97% on the classic evaluation sets. Research focus has shifted to harder scenarios: the Union14M benchmark with 3.2M challenging real-world samples, multi-line recognition, and cross-lingual text. The integration of language models into recognition (ABINet) was a major advance — using linguistic context to correct visually ambiguous characters. The field is converging with general VLMs, which can read scene text as part of broader image understanding.
Key Challenges
Heavily occluded text — when 30-50% of characters are blocked by objects, shadows, or other text, recognition accuracy drops dramatically
Extreme aspect ratios — very long text strings (URLs, addresses) and very short ones (single characters) require different processing strategies
Out-of-vocabulary words — proper nouns, URLs, product codes, and foreign words that don't appear in training data or language models
Multilingual text — recognizing text in non-Latin scripts (Arabic, Thai, Devanagari) requires script-specific models and training data
Low resolution — text from distant signs or surveillance cameras may be <20px height, pushing below the recognition threshold
Quick Recommendations
Best accuracy
PARSeq or CLIP4STR
97%+ on standard benchmarks; PARSeq's permutation training provides robustness across text lengths and styles
Irregular/curved text
ABINet++ or MAERec
Strong on irregular text thanks to iterative correction and masked autoencoding pretraining
Real-time / mobile
CRNN with MobileNet backbone or PP-OCRv4 recognition
Lightweight models that run at 100+ FPS on GPU, suitable for mobile and embedded deployment
Multilingual
PaddleOCR multilingual recognition or Surya
PaddleOCR supports 80+ language recognition models; Surya optimizes for multilingual accuracy
End-to-end (detection + recognition)
DeepSolo or PaddleOCR v4 pipeline
Single model or tightly integrated pipeline that detects and reads text without separate components
What's Next
The field is moving toward: (1) unified detection + recognition in single models (end-to-end text spotting), (2) reading text in context — using surrounding visual information to disambiguate, (3) video text recognition with temporal aggregation for improved accuracy on moving cameras, and (4) zero-shot recognition of new scripts via visual analogy. VLMs will likely subsume scene text recognition for most applications, with dedicated models persisting only for real-time edge deployment.
Benchmarks & SOTA
State of the art per dataset (word accuracy %, from Papers With Code):

svt          CLIP4STR-H (DFN-5B)           99.1
iiit5k       CLIP4STR-L (DataComp-1B)      99.6
cute80       CPPD                          99.7
svtp         DTrOCR 105M                   98.6
icdar-2003   Yet Another Text Recognizer   97.1
wost         CLIP4STR-H (DFN-5B)           90.9
uber-text    CLIP4STR-L (DataComp-1B)      92.2
host         CLIP4STR-L                    82.7
msda         MetaSelf-Learning             42.0
svt-p        ABINet-LV+TPS++               89.6
ic13         ABINet-LV+TPS++               97.8