Scene Text Recognition · benchmark dataset · 2020 · EN

svtp.

Dataset from Papers With Code

§ 01 · Leaderboard

Best published scores.

19 results indexed across 1 metric. Shaded row marks current SOTA; ties broken by submission date.


Primary metric: accuracy · higher is better · 19 rows
| # | Model | Org | Submitted | Paper / code | accuracy |
|---|---|---|---|---|---|
| 01 | DTrOCR 105M | | Aug 2023 | DTrOCR: Decoder-only Transformer for Optical Character R… · code | 98.60 |
| 02 | MGP-STR | | Sep 2022 | Multi-Granularity Prediction for Scene Text Recognition · code | 98.30 |
| 03 | CLIP4STR-L (DataComp-1B) | | May 2023 | CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code | 98.10 |
| 04 | CLIP4STR-L | | May 2023 | CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code | 97.40 |
| 05 | CLIP4STR-B | Research | May 2023 | CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code | 97.20 |
| 06 | PARSeq (OSS) | Research | Jul 2022 | Scene Text Recognition with Permuted Autoregressive Sequ… | 96.90 |
| 07 | CPPD | | Jul 2023 | Context Perception Parallel Decoder for Scene Text Recog… · code | 96.70 |
| 08 | CCD-ViT-Base | | Nov 2022 | Self-supervised Character-to-Character Distillation for … · code | 96.10 |
| 09 | CCD-ViT-Small | | Nov 2022 | Self-supervised Character-to-Character Distillation for … · code | 92.70 |
| 10 | CCD-ViT-Tiny | | Nov 2022 | Self-supervised Character-to-Character Distillation for … · code | 91.60 |
| 11 | MATRN | Research | Nov 2021 | Multi-modal Text Recognition Networks: Interactive Enhan… · code | 90.60 |
| 12 | S-GTR | | Dec 2021 | Visual Semantics Allow for Textual Reasoning Better in S… · code | 90.60 |
| 13 | SIGA_T | | Mar 2022 | Self-supervised Implicit Glyph Attention for Text Recogn… · code | 90.50 |
| 14 | CDistNet (Ours) | | Nov 2021 | CDistNet: Perceiving Multi-Domain Character Distance for… · code | 89.77 |
| 15 | ABINet-LV (OSS) | Fang et al. | Mar 2021 | Read Like Humans: Autonomous, Bidirectional and Iterativ… | 89.50 |
| 16 | DiffusionSTR | | Jun 2023 | DiffusionSTR: Diffusion Model for Scene Text Recognition | 89.20 |
| 17 | DPAN | | Aug 2021 | papers-with-code · code | 89 |
| 18 | TrOCR-large 558M | | Sep 2021 | TrOCR: Transformer-based Optical Character Recognition w… | 88.10 |
| 19 | TrOCR-base 334M | | Sep 2021 | TrOCR: Transformer-based Optical Character Recognition w… | 86.90 |
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
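
The ordering rule stated for this table (score first, ties broken by submission date) can be sketched as a two-key sort. This is a minimal illustration, not Codesota's implementation: the `rows` structure and the `rank` helper are hypothetical, month-only dates are pinned to the first of the month, and the tie-break direction (earlier date first) is an assumption inferred from rows 11 and 12.

```python
from datetime import date

# Illustrative subset of the leaderboard rows.
# Month-only dates are pinned to day 1 (an assumption made for sorting).
rows = [
    {"model": "S-GTR", "submitted": date(2021, 12, 1), "accuracy": 90.60},
    {"model": "SIGA_T", "submitted": date(2022, 3, 1), "accuracy": 90.50},
    {"model": "DTrOCR 105M", "submitted": date(2023, 8, 1), "accuracy": 98.60},
    {"model": "MATRN", "submitted": date(2021, 11, 1), "accuracy": 90.60},
]


def rank(rows):
    """Sort by accuracy (descending), then by date (ascending) to break ties."""
    return sorted(rows, key=lambda r: (-r["accuracy"], r["submitted"]))


for position, row in enumerate(rank(rows), start=1):
    print(f"{position:02d} {row['model']} {row['accuracy']:.2f}")
```

With this key, MATRN (Nov 2021) sorts ahead of S-GTR (Dec 2021) even though both score 90.60, which matches the order shown in the table.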
§ 03 · Progress

6 steps of state of the art.

Each row below marks a model that broke the previous record on accuracy. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores win. Each subsequent entry improved upon the previous best.

SOTA line · accuracy
  1. Mar 6, 2021 · ABINet-LV · Fang et al. · 89.50
  2. Nov 22, 2021 · CDistNet (Ours) · 89.77
  3. Nov 30, 2021 · MATRN · Research · 90.60
  4. Jul 14, 2022 · PARSeq · Research · 96.90
  5. Sep 8, 2022 · MGP-STR · 98.30
  6. Aug 30, 2023 · DTrOCR 105M · 98.60
Fig 3 · SOTA-setting models only. 6 entries span Mar 2021 to Aug 2023.
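
The record-setting rule behind this list can be sketched as a running maximum over the rows in date order. This is a hedged illustration under stated assumptions: the `sota_steps` helper is hypothetical, the rows are a subset of the leaderboard, day-level dates are taken from the SOTA list, month-only dates are pinned to the first of the month, and a tie with the current best is assumed not to count as a new record.

```python
from datetime import date

# (model, release date, accuracy): a subset of the leaderboard rows.
# Month-only dates are pinned to day 1 (an assumption).
rows = [
    ("TrOCR-base 334M", date(2021, 9, 1), 86.90),
    ("ABINet-LV", date(2021, 3, 6), 89.50),
    ("DPAN", date(2021, 8, 1), 89.00),
    ("CDistNet (Ours)", date(2021, 11, 22), 89.77),
    ("MATRN", date(2021, 11, 30), 90.60),
    ("S-GTR", date(2021, 12, 1), 90.60),
    ("SIGA_T", date(2022, 3, 1), 90.50),
    ("PARSeq", date(2022, 7, 14), 96.90),
    ("MGP-STR", date(2022, 9, 8), 98.30),
    ("CCD-ViT-Base", date(2022, 11, 1), 96.10),
    ("CLIP4STR-L (DataComp-1B)", date(2023, 5, 1), 98.10),
    ("DTrOCR 105M", date(2023, 8, 30), 98.60),
]


def sota_steps(rows):
    """Return the rows that strictly beat every earlier row, in date order."""
    best = float("-inf")
    steps = []
    for model, released, accuracy in sorted(rows, key=lambda r: r[1]):
        if accuracy > best:  # strict comparison: a tie does not set a record
            steps.append((model, released, accuracy))
            best = accuracy
    return steps


for model, released, accuracy in sota_steps(rows):
    print(f"{released:%b %d, %Y} · {model} · {accuracy:.2f}")
```

Under these assumptions, S-GTR (90.60, Dec 2021) is skipped because it only ties MATRN, and CLIP4STR-L (DataComp-1B) is skipped because MGP-STR had already reached 98.30.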
§ 04 · Literature

13 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with frozen commit + seed
  • 03 Declared evaluation environment (Python, deps)
  • 04 One row per metric declared by this dataset
  • 05 A contact so we can follow up on discrepancies
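
As a rough illustration of how the items above could be bundled, the sketch below builds a small submission manifest. Everything here is an assumption: the field names, the `build_manifest` and `word_accuracy` helpers, and the example URL, commit, and contact are hypothetical and do not reflect Codesota's actual submission format. Accuracy is computed as exact string match between prediction and label, which is also an assumption; many scene text recognition protocols normalize case or strip punctuation before comparing.

```python
import json
import platform


def word_accuracy(predictions, labels):
    """Percentage of predictions that exactly match their label (assumed metric)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("evaluation set is empty")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return 100.0 * correct / len(labels)


def build_manifest(checkpoint, commit, seed, contact, predictions, labels):
    """Collect the declared items into one dictionary (hypothetical field names)."""
    return {
        "checkpoint": checkpoint,  # item 01: public checkpoint or API endpoint
        "commit": commit,          # item 02: frozen commit of the reproduction script
        "seed": seed,              # item 02: seed used for the run
        "environment": {"python": platform.python_version()},  # item 03
        "metrics": {"accuracy": round(word_accuracy(predictions, labels), 2)},  # item 04
        "contact": contact,        # item 05
    }


manifest = build_manifest(
    checkpoint="https://example.com/model.ckpt",  # placeholder URL
    commit="0000000",                             # placeholder commit hash
    seed=42,
    contact="author@example.com",                 # placeholder contact
    predictions=["stop", "exit", "cafe", "open"],
    labels=["stop", "exit", "café", "open"],
)
print(json.dumps(manifest, indent=2, ensure_ascii=False))
```

Recording the interpreter version alongside the commit and seed keeps the three reproducibility items in one place; a fuller script would also list pinned dependency versions.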