
cute80.

cute80 is a scene text recognition benchmark indexed on Codesota. This page tracks published model results, the top scores per metric, and the state-of-the-art (SOTA) timeline for cute80.

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.


Accuracy

Accuracy is the reported evaluation metric for cute80. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
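Accuracy on scene text recognition benchmarks is usually word-level: a prediction counts only if the whole string matches the label. The sketch below illustrates that calculation under an assumed lowercase alphanumeric comparison, the protocol noted for the PARSeq entry; other entries may have been evaluated differently. The function names `normalize` and `word_accuracy` and the toy data are illustrative and do not come from Codesota or any of the listed papers.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and keep only ASCII letters and digits.

    This is an assumed evaluation protocol, not one confirmed for every entry.
    """
    return re.sub(r"[^0-9a-z]", "", text.lower())


def word_accuracy(predictions, references) -> float:
    """Return the percentage of samples whose normalized strings match exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Toy data (hypothetical, not taken from the benchmark).
refs = ["STARBUCKS", "Coffee", "exit", "24-7"]
preds = ["starbucks", "coffee", "exlt", "247"]

# Three of the four predictions match after normalization.
score = word_accuracy(preds, refs)
print(f"{score:.2f}")
```

Assuming the commonly cited cute80 evaluation set of 288 cropped word images, a single misread word moves the score by roughly 0.35 points, which would be consistent with listed values such as 99.31 (286 of 288 correct) and 98.61 (284 of 288 correct).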

Trust tiers for Accuracy: verified, paper, vendor, community, unverified
| Rank | Model | Trust | Score | Year | Links | Paper / notes |
|------|-------|-------|-------|------|-------|---------------|
| 01 | CPPD | verified | 99.7 | 2023 | Paper, Code | Context Perception Parallel Decoder for Scene Text Recognition |
| 02 | CLIP4STR-L (DataComp-1B) | verified | 99.7 | 2023 | Paper, Code | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model |
| 03 | MGP-STR | verified | 99.31 | 2022 | Paper, Code | Multi-Granularity Prediction for Scene Text Recognition |
| 04 | CLIP4STR-B | verified | 99.3 | 2023 | Paper, Code, Source | An Empirical Study of Scaling Law for OCR |
| 05 | DTrOCR 105M | verified | 99.1 | 2023 | Paper, Code | DTrOCR: Decoder-only Transformer for Optical Character Recognition |
| 06 | CLIP4STR-L | verified | 99 | 2023 | Paper, Code | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model |
| 07 | PARSeq | verified | 98.61 | 2022 | Paper | Lowercase alphanum eval. ECCV 2022. |
| 08 | CCD-ViT-Small(ARD_2.8M) | verified | 98.3 | 2022 | Paper, Code | Self-supervised Character-to-Character Distillation for Text Recognition |
| 09 | CCD-ViT-Base(ARD_2.8M) | verified | 98.3 | 2022 | Paper, Code | Self-supervised Character-to-Character Distillation for Text Recognition |
| 10 | CCD-ViT-Tiny(ARD_2.8M) | verified | 95.8 | 2022 | Paper, Code | Self-supervised Character-to-Character Distillation for Text Recognition |
| 11 | S-GTR | verified | 94.7 | 2021 | Paper, Code | Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition |
| 12 | MATRN | verified | 93.5 | 2021 | Paper, Code | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features |
| 13 | SIGA_T | verified | 93.1 | 2022 | Paper, Code | Self-supervised Implicit Glyph Attention for Text Recognition |
| 14 | DiffusionSTR | verified | 92.5 | 2023 | Paper | DiffusionSTR: Diffusion Model for Scene Text Recognition |
| 15 | NRTR+TPS++ | verified | 92.4 | 2023 | Paper, Code | TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition |
| 16 | DPAN | verified | 91.9 | 2021 | Paper, Code | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition |
| 17 | CDistNet (Ours) | verified | 89.58 | 2021 | Paper, Code | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition |
| 18 | ABINet-LV | verified | 89.2 | 2021 | Paper | ABINet Language-Vision variant. CVPR 2021. |
| 19 | TrOCR-large 558M | verified | 84.1 | 2021 | Paper | TrOCR-large, Syn+Benchmark training. Table 6. AAAI 2023. |
| 20 | TrOCR-base 334M | verified | 81.2 | 2021 | Paper | TrOCR-base, Syn+Benchmark training. Table 6. AAAI 2023. |
