svtp.

svtp is a scene text recognition benchmark indexed on Codesota. This page tracks published model results, the top score per metric, and the state-of-the-art (SOTA) timeline for svtp.

§ 01 · SOTA history

Year over year.
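
The page does not say how Codesota derives its timeline, so the following is only a minimal sketch of one plausible rule: take the best score listed for each year, then keep the years in which that score beats every earlier year. The rows are the (model, score, year) values from the Accuracy leaderboard on this page; the function name `sota_timeline` and the tie-breaking choice (first listed row wins) are assumptions made for illustration.

```python
# (model, accuracy, year) rows copied from the Accuracy leaderboard on this page.
rows = [
    ("DTrOCR 105M", 98.6, 2023),
    ("MGP-STR", 98.3, 2022),
    ("CLIP4STR-L (DataComp-1B)", 98.1, 2023),
    ("CLIP4STR-L", 97.4, 2023),
    ("CLIP4STR-B", 97.2, 2023),
    ("PARSeq", 96.9, 2022),
    ("CPPD", 96.7, 2023),
    ("CCD-ViT-Base", 96.1, 2022),
    ("CCD-ViT-Small", 92.7, 2022),
    ("CCD-ViT-Tiny", 91.6, 2022),
    ("MATRN", 90.6, 2021),
    ("S-GTR", 90.6, 2021),
    ("SIGA_T", 90.5, 2022),
    ("CDistNet", 89.77, 2021),
    ("ABINet-LV", 89.5, 2021),
    ("DiffusionSTR", 89.2, 2023),
    ("DPAN", 89.0, 2021),
    ("TrOCR-large 558M", 88.1, 2021),
    ("TrOCR-base 334M", 86.9, 2021),
]


def sota_timeline(rows):
    """Return (year, model, score) for each year that sets a new best score.

    Assumed rule, not Codesota's documented method: ties within a year keep
    the first row encountered, and a year is listed only if it improves on
    all earlier years.
    """
    best_by_year = {}
    for model, score, year in rows:
        if year not in best_by_year or score > best_by_year[year][1]:
            best_by_year[year] = (model, score)

    timeline = []
    running_best = float("-inf")
    for year in sorted(best_by_year):
        model, score = best_by_year[year]
        if score > running_best:
            running_best = score
            timeline.append((year, model, score))
    return timeline


for year, model, score in sota_timeline(rows):
    print(year, model, score)
```

With the rows above, this yields MATRN (90.6) for 2021, MGP-STR (98.3) for 2022, and DTrOCR 105M (98.6) for 2023. The year used here is the one in the leaderboard's Year column, which may differ from a paper's venue year.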

§ 02 · Leaderboard

Results by metric.

Accuracy

Accuracy is the reported evaluation metric for svtp. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
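
This page does not spell out how Accuracy is computed. In scene text recognition it is commonly word-level accuracy: the percentage of cropped word images whose predicted string exactly matches the label. The PARSeq entry on this page mentions a lowercase alphanumeric evaluation, so the sketch below assumes that normalization. The function names and the sample strings are hypothetical, and individual papers may use a different character set or case handling.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the string and keep only ASCII letters and digits.

    This is an assumed protocol (lowercase alphanumeric evaluation),
    not one confirmed for every row on the leaderboard.
    """
    return re.sub(r"[^a-z0-9]", "", text.lower())


def word_accuracy(predictions, references, normalizer=normalize):
    """Percentage of samples whose normalized prediction equals the normalized label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        normalizer(pred) == normalizer(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Hypothetical predictions and labels, for illustration only.
preds = ["Hotel", "st0p", "cafe!", "open"]
labels = ["HOTEL", "stop", "Cafe", "OPEN"]
print(word_accuracy(preds, labels))  # 3 of 4 match after normalization -> 75.0
```

Because the match is exact at the word level, a single wrong character ("st0p" vs "stop") counts the whole word as an error. Scores produced under different normalization rules are not directly comparable.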

Trust tiers for Accuracy: verified, paper, vendor, community, unverified
| Rank | Model | Source / notes | Trust | Score | Year | Links |
|------|-------|----------------|-------|-------|------|-------|
| 01 | DTrOCR 105M | From paper: DTrOCR: Decoder-only Transformer for Optical Character Recognition | verified | 98.6 | 2023 | Paper, Code |
| 02 | MGP-STR | From paper: Multi-Granularity Prediction for Scene Text Recognition | verified | 98.3 | 2022 | Paper, Code |
| 03 | CLIP4STR-L (DataComp-1B) | From paper: CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 98.1 | 2023 | Paper, Code |
| 04 | CLIP4STR-L | From paper: An Empirical Study of Scaling Law for OCR | verified | 97.4 | 2023 | Paper, Code, Source |
| 05 | CLIP4STR-B | From paper: CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 97.2 | 2023 | Paper, Code |
| 06 | PARSeq | Lowercase alphanumeric evaluation. ECCV 2022. | verified | 96.9 | 2022 | Paper |
| 07 | CPPD | From paper: Context Perception Parallel Decoder for Scene Text Recognition | verified | 96.7 | 2023 | Paper, Code |
| 08 | CCD-ViT-Base | From paper: Self-supervised Character-to-Character Distillation for Text Recognition | verified | 96.1 | 2022 | Paper, Code |
| 09 | CCD-ViT-Small | From paper: Self-supervised Character-to-Character Distillation for Text Recognition | verified | 92.7 | 2022 | Paper, Code |
| 10 | CCD-ViT-Tiny | From paper: Self-supervised Character-to-Character Distillation for Text Recognition | verified | 91.6 | 2022 | Paper, Code |
| 11 | MATRN | From paper: Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | verified | 90.6 | 2021 | Paper, Code |
| 12 | S-GTR | From paper: Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | verified | 90.6 | 2021 | Paper, Code |
| 13 | SIGA_T | From paper: Self-supervised Implicit Glyph Attention for Text Recognition | verified | 90.5 | 2022 | Paper, Code |
| 14 | CDistNet | From paper: CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | verified | 89.77 | 2021 | Paper, Code |
| 15 | ABINet-LV | ABINet Language-Vision variant. CVPR 2021. | verified | 89.5 | 2021 | Paper |
| 16 | DiffusionSTR | From paper: DiffusionSTR: Diffusion Model for Scene Text Recognition | verified | 89.2 | 2023 | Paper |
| 17 | DPAN | From paper: Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | verified | 89 | 2021 | Paper, Code |
| 18 | TrOCR-large 558M | TrOCR-large, Syn+Benchmark training. Table 6. AAAI 2023. | verified | 88.1 | 2021 | Paper |
| 19 | TrOCR-base 334M | TrOCR-base, Syn+Benchmark training. Table 6. AAAI 2023. | verified | 86.9 | 2021 | Paper |