Who leads the svt benchmark?

CLIP4STR-H (DFN-5B) currently leads svt with a score of 99.1 on Accuracy.

What is the state-of-the-art score on svt?

The state-of-the-art result on svt is 99.1 (Accuracy), achieved by CLIP4STR-H (DFN-5B) as of 2023.

How many models are tracked on svt?

Codesota tracks 39 models on svt.

When was the svt leaderboard last updated?

The svt leaderboard on Codesota includes results through 2023, with the earliest tracked result from 2014.



svt.

Name: svt Benchmark Results
Creator: Unknown
Published: 2014-01-01
License: https://creativecommons.org/licenses/by/4.0/

svt is a scene text recognition benchmark indexed on Codesota. This page tracks published model results, top scores per metric, and the SOTA timeline for svt.


§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.


Accuracy

Accuracy is the reported evaluation metric for svt. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Accuracy: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published — an earlier or same-year result already scored better.

Rank	Model	Source / notes	Trust	Score	Year	Links
01	CLIP4STR-H (DFN-5B)	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model	verified	99.1	2023	Paper, Code
02	DTrOCR 105M	DTrOCR: Decoder-only Transformer for Optical Character Recognition	verified	98.9	2023	Paper, Code
03	MGP-STR	Multi-Granularity Prediction for Scene Text Recognition	verified	98.6	2022	Paper, Code
04	CLIP4STR-L (DataComp-1B)	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model	verified	98.6	2023	Paper, Code
05	CPPD	Context Perception Parallel Decoder for Scene Text Recognition	verified	98.5	2023	Paper, Code
06	CLIP4STR-L	CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model	verified	98.5	2023	Paper, Code
07	CLIP4STR-B	An Empirical Study of Scaling Law for OCR	verified	98.3	2023	Paper, Code, Source
08	PARSeq	Lowercase alphanum eval. ECCV 2022.	verified	97.84	2022	Paper
09	CCD-ViT-Base(ARD_2.8M)	Self-supervised Character-to-Character Distillation for Text Recognition	verified	97.8	2022	Paper, Code
10	CCD-ViT-Small(ARD_2.8M)	Self-supervised Character-to-Character Distillation for Text Recognition	verified	96.4	2022	Paper, Code
11	TrOCR-large 558M	TrOCR-large, Syn+Benchmark training. Table 6. AAAI 2023.	verified	96.1	2021	Paper
12	CCD-ViT-Tiny(ARD_2.8M)	Self-supervised Character-to-Character Distillation for Text Recognition	verified	96	2022	Paper, Code
13	S-GTR	Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition	verified	95.8	2021	Paper, Code
14	TrOCR-base 334M	TrOCR-base, Syn+Benchmark training. Table 6. AAAI 2023.	verified	95.2	2021	Paper
15	SIGA_T	Self-supervised Implicit Glyph Attention for Text Recognition	verified	95.1	2022	Paper, Code
16	MATRN	Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features	verified	95	2021	Paper, Code
17	Yet Another Text Recognizer	Why You Should Try the Real Data for the Scene Text Recognition	verified	94.7	2021	Paper, Code
18	NRTR+TPS++	TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition	verified	94.6	2023	Paper, Code
19	DPAN	Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition	verified	93.9	2021	Paper, Code
20	CDistNet (Ours)	CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition	verified	93.82	2021	Paper, Code
21	DiffusionSTR	DiffusionSTR: Diffusion Model for Scene Text Recognition	verified	93.6	2023	Paper
22	ABINet-LV	ABINet Language-Vision variant. CVPR 2021.	verified	93.4	2021	Paper
23	RCEED	Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition	verified	91.8	2021	Paper, Code
24	SRN	Towards Accurate Scene Text Recognition with Semantic Reasoning Networks	verified	91.5	2020	Paper, Code
25	SATRN	On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention	verified	91.3	2019	Paper, Code
26	CSTR	Revisiting Classification Perspective on Scene Text Recognition	verified	90.6	2021	Paper, Code
27	TextScanner	TextScanner: Reading Characters in Order for Robust Scene Text Recognition	verified	90.1	2019	Paper
28	SEED	SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition	verified	89.6	2020	Paper, Code
29	ASTER	ASTER: An Attentional Scene Text Recognizer with Flexible Rectification	verified	89.5	2018	Paper, Code
30	DAN	Decoupled Attention Network for Text Recognition	verified	89.2	2019	Paper, Code
31	SAFL	SAFL: A Self-Attention Scene Text Recognizer with Focal Loss	verified	88.6	2022	Paper, Code
32	ViTSTR	Vision Transformer for Fast and Efficient Scene Text Recognition	verified	87.7	2021	Paper, Code
33	Baek et al.	What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis	verified	87.5	2019	Paper, Code
34	CA-FCN	Scene Text Recognition from Two-Dimensional Perspective	verified	86.4	2018	Paper
35	SAR	Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition	verified	84.5	2018	Paper, Code
36	STAR-Net	Star-net: A spatial attention residue network for scene text recognition.	verified	83.6	2016	Paper, Code
37	RARE	Robust Scene Text Recognition with Automatic Rectification	verified	81.9	2016	Paper, Code
38	CRNN	An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition	verified	80.8	2015	Paper, Code
39	CHAR	Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition	verified	68	2014	Paper, Code
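
The muted-row rule stated above the table (a row is muted when an earlier or same-year result already scored better) can be sketched in a few lines of Python. This is only an illustration of that sentence, not Codesota's actual implementation: the `Result` type and `is_muted` function are hypothetical names, the comparison uses the publication year alone because that is the only date the table shows, and the sample is a handful of rows copied from the table.

```python
from typing import NamedTuple, Sequence


class Result(NamedTuple):
    """One leaderboard row (hypothetical structure for illustration)."""
    model: str
    score: float  # Accuracy, higher is better
    year: int


def is_muted(row: Result, rows: Sequence[Result]) -> bool:
    """True if some result from the same year or earlier scored strictly higher."""
    return any(o.year <= row.year and o.score > row.score for o in rows)


# A few rows copied from the table above.
SAMPLE = [
    Result("CLIP4STR-H (DFN-5B)", 99.1, 2023),
    Result("MGP-STR", 98.6, 2022),
    Result("CLIP4STR-L (DataComp-1B)", 98.6, 2023),
    Result("PARSeq", 97.84, 2022),
    Result("CRNN", 80.8, 2015),
    Result("CHAR", 68.0, 2014),
]

muted = [r.model for r in SAMPLE if is_muted(r, SAMPLE)]
print(muted)  # ['CLIP4STR-L (DataComp-1B)', 'PARSeq']
```

On this subset, MGP-STR stays unmuted because nothing from 2022 or earlier beats 98.6, while the 2023 CLIP4STR-L (DataComp-1B) entry with the same score is muted by the 99.1 result from the same year. A tie is treated as not muted here, which is an assumption drawn from the wording "scored better".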
