
svt.

svt (Street View Text) is a scene text recognition benchmark indexed on Codesota. This page tracks published model results, top scores per metric, and the SOTA timeline for svt.

§ 01 · SOTA history

Year over year.
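
A year-over-year SOTA history can be derived from leaderboard rows by keeping only the results that raised the best score seen so far. The sketch below is one minimal way to do that, not Codesota's own method. It uses a handful of rows copied from the Accuracy table on this page and treats the Year column as the ordering key; the `Result` type and `sota_timeline` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Result(NamedTuple):
    model: str
    score: float
    year: int


# A subset of rows copied from the Accuracy leaderboard on this page.
RESULTS = [
    Result("CHAR", 68.0, 2014),
    Result("CRNN", 80.8, 2015),
    Result("STAR-Net", 83.6, 2016),
    Result("ASTER", 89.5, 2018),
    Result("SATRN", 91.3, 2019),
    Result("SRN", 91.5, 2020),
    Result("TrOCR-large 558M", 96.1, 2021),
    Result("MGP-STR", 98.6, 2022),
    Result("CLIP4STR-H (DFN-5B)", 99.1, 2023),
    Result("DiffusionSTR", 93.6, 2023),
]


def sota_timeline(results):
    """Return the results that raised the best-so-far score, in year order."""
    timeline = []
    best = float("-inf")
    # Sort by year, then by descending score so each year's top entry comes first.
    for r in sorted(results, key=lambda r: (r.year, -r.score)):
        if r.score > best:
            best = r.score
            timeline.append(r)
    return timeline


timeline = sota_timeline(RESULTS)
for r in timeline:
    print(f"{r.year}  {r.score:5.1f}  {r.model}")
```

Entries that do not beat the running best, such as DiffusionSTR in this subset, are left out of the timeline even though they appear on the leaderboard.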

§ 02 · Leaderboard

Results by metric.


Accuracy

Accuracy is the reported evaluation metric for svt. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
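
The page does not spell out how Accuracy is computed. In scene text recognition it is commonly reported as word-level accuracy: the percentage of cropped word images whose predicted string exactly matches the label. The sketch below assumes that definition together with a lowercase alphanumeric normalization, which the PARSeq row on this page mentions for its own evaluation; other entries may follow a different protocol. The function names `normalize` and `word_accuracy` are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only a-z and 0-9 (an assumed evaluation protocol)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def word_accuracy(predictions, ground_truths) -> float:
    """Percentage of samples whose normalized prediction equals the label."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        normalize(p) == normalize(g) for p, g in zip(predictions, ground_truths)
    )
    return 100.0 * correct / len(ground_truths)


# Toy data: three of the four predictions match after normalization.
preds = ["Hotel", "cafe!", "st0p", "OPEN"]
labels = ["HOTEL", "cafe", "stop", "open"]
score = word_accuracy(preds, labels)
print(f"{score:.1f}")  # 75.0
```

Because the normalization changes which predictions count as correct, scores reported under different protocols are not strictly comparable.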

Trust tiers for Accuracy: verified, paper, vendor, community, unverified
| Rank | Model | Paper / notes | Trust | Score | Year | Links |
|------|-------|---------------|-------|-------|------|-------|
| 01 | CLIP4STR-H (DFN-5B) | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 99.1 | 2023 | Paper, Code |
| 02 | DTrOCR 105M | DTrOCR: Decoder-only Transformer for Optical Character Recognition | verified | 98.9 | 2023 | Paper, Code |
| 03 | MGP-STR | Multi-Granularity Prediction for Scene Text Recognition | verified | 98.6 | 2022 | Paper, Code |
| 04 | CLIP4STR-L (DataComp-1B) | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 98.6 | 2023 | Paper, Code |
| 05 | CPPD | Context Perception Parallel Decoder for Scene Text Recognition | verified | 98.5 | 2023 | Paper, Code |
| 06 | CLIP4STR-L | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 98.5 | 2023 | Paper, Code |
| 07 | CLIP4STR-B | An Empirical Study of Scaling Law for OCR | verified | 98.3 | 2023 | Paper, Code, Source |
| 08 | PARSeq | Lowercase alphanum eval. ECCV 2022. | verified | 97.84 | 2022 | Paper |
| 09 | CCD-ViT-Base(ARD_2.8M) | Self-supervised Character-to-Character Distillation for Text Recognition | verified | 97.8 | 2022 | Paper, Code |
| 10 | CCD-ViT-Small(ARD_2.8M) | Self-supervised Character-to-Character Distillation for Text Recognition | verified | 96.4 | 2022 | Paper, Code |
| 11 | TrOCR-large 558M | TrOCR-large, Syn+Benchmark training. Table 6. AAAI 2023. | verified | 96.1 | 2021 | Paper |
| 12 | CCD-ViT-Tiny(ARD_2.8M) | Self-supervised Character-to-Character Distillation for Text Recognition | verified | 96 | 2022 | Paper, Code |
| 13 | S-GTR | Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | verified | 95.8 | 2021 | Paper, Code |
| 14 | TrOCR-base 334M | TrOCR-base, Syn+Benchmark training. Table 6. AAAI 2023. | verified | 95.2 | 2021 | Paper |
| 15 | SIGA_T | Self-supervised Implicit Glyph Attention for Text Recognition | verified | 95.1 | 2022 | Paper, Code |
| 16 | MATRN | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | verified | 95 | 2021 | Paper, Code |
| 17 | Yet Another Text Recognizer | Why You Should Try the Real Data for the Scene Text Recognition | verified | 94.7 | 2021 | Paper, Code |
| 18 | NRTR+TPS++ | TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | verified | 94.6 | 2023 | Paper, Code |
| 19 | DPAN | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | verified | 93.9 | 2021 | Paper, Code |
| 20 | CDistNet (Ours) | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | verified | 93.82 | 2021 | Paper, Code |
| 21 | DiffusionSTR | DiffusionSTR: Diffusion Model for Scene Text Recognition | verified | 93.6 | 2023 | Paper |
| 22 | ABINet-LV | ABINet Language-Vision variant. CVPR 2021. | verified | 93.4 | 2021 | Paper |
| 23 | RCEED | Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition | verified | 91.8 | 2021 | Paper, Code |
| 24 | SRN | Towards Accurate Scene Text Recognition with Semantic Reasoning Networks | verified | 91.5 | 2020 | Paper, Code |
| 25 | SATRN | On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention | verified | 91.3 | 2019 | Paper, Code |
| 26 | CSTR | Revisiting Classification Perspective on Scene Text Recognition | verified | 90.6 | 2021 | Paper, Code |
| 27 | TextScanner | TextScanner: Reading Characters in Order for Robust Scene Text Recognition | verified | 90.1 | 2019 | Paper |
| 28 | SEED | SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition | verified | 89.6 | 2020 | Paper, Code |
| 29 | ASTER | ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | verified | 89.5 | 2018 | Paper, Code |
| 30 | DAN | Decoupled Attention Network for Text Recognition | verified | 89.2 | 2019 | Paper, Code |
| 31 | SAFL | SAFL: A Self-Attention Scene Text Recognizer with Focal Loss | verified | 88.6 | 2022 | Paper, Code |
| 32 | ViTSTR | Vision Transformer for Fast and Efficient Scene Text Recognition | verified | 87.7 | 2021 | Paper, Code |
| 33 | Baek et al. | What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis | verified | 87.5 | 2019 | Paper, Code |
| 34 | CA-FCN | Scene Text Recognition from Two-Dimensional Perspective | verified | 86.4 | 2018 | Paper |
| 35 | SAR | Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition | verified | 84.5 | 2018 | Paper, Code |
| 36 | STAR-Net | Star-net: A spatial attention residue network for scene text recognition. | verified | 83.6 | 2016 | Paper, Code |
| 37 | RARE | Robust Scene Text Recognition with Automatic Rectification | verified | 81.9 | 2016 | Paper, Code |
| 38 | CRNN | An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | verified | 80.8 | 2015 | Paper, Code |
| 39 | CHAR | Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition | verified | 68 | 2014 | Paper, Code |
