Who leads the iiit5k benchmark?

CLIP4STR-L (DataComp-1B) currently leads iiit5k with an accuracy of 99.60. DTrOCR 105M matches that score but ranks second because it was submitted later (Aug 2023 versus May 2023).

What is the state-of-the-art score on iiit5k?

The state-of-the-art result on iiit5k is 99.60 (accuracy), first achieved by CLIP4STR-L (DataComp-1B) in May 2023 and matched by DTrOCR 105M in August 2023.

How many models are tracked on iiit5k?

Codesota tracks 21 models on iiit5k.

When was the iiit5k leaderboard last updated?

The most recent result on the Codesota iiit5k leaderboard dates from August 2023; the earliest tracked result dates from July 2015.

Scene Text Recognition · benchmark dataset · 2020 · EN

iiit5k.

Name: iiit5k Benchmark Results
Creator: Codesota
Published: 2015-01-01
License: https://creativecommons.org/licenses/by/4.0/

Dataset from Papers With Code

§ 01 · Leaderboard

Best published scores.

21 results indexed across 1 metric. The top row is the current SOTA; ties are broken by submission date, with the earlier submission ranked first.

Primary: accuracy · higher is better

#	Model	Org	Submitted	Paper / code	accuracy
01	CLIP4STR-L (DataComp-1B)	—	May 2023	CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code	99.60
02	DTrOCR 105M	—	Aug 2023	DTrOCR: Decoder-only Transformer for Optical Character R… · code	99.60
03	CLIP4STR-B (DataComp-1B)	—	May 2023	CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code	99.50
04	CLIP4STR-L	—	May 2023	CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code	99.50
05	CPPD	—	Jul 2023	Context Perception Parallel Decoder for Scene Text Recog… · code	99.30
06	CLIP4STR-B	Research	May 2023	CLIP4STR: A Simple Baseline for Scene Text Recognition w… · code	99.20
07	PARSeq	Research	Jul 2022	Scene Text Recognition with Permuted Autoregressive Sequ…	99.00
08	MGP-STR	—	Sep 2022	Multi-Granularity Prediction for Scene Text Recognition · code	98.80
09	CCD-ViT-Small(ARD_2.8M)	—	Nov 2022	Self-supervised Character-to-Character Distillation for … · code	98.00
10	CCD-ViT-Base(ARD_2.8M)	—	Nov 2022	Self-supervised Character-to-Character Distillation for … · code	98.00
11	S-GTR	—	Dec 2021	Visual Semantics Allow for Textual Reasoning Better in S… · code	97.50
12	DiffusionSTR	—	Jun 2023	DiffusionSTR: Diffusion Model for Scene Text Recognition	97.30
13	CCD-ViT-Tiny(ARD_2.8M)	—	Nov 2022	Self-supervised Character-to-Character Distillation for … · code	97.10
14	SIGA_S	—	Mar 2022	Self-supervised Implicit Glyph Attention for Text Recogn… · code	96.90
15	MATRN	Research	Nov 2021	Multi-modal Text Recognition Networks: Interactive Enhan… · code	96.60
16	CDistNet (Ours)	—	Nov 2021	CDistNet: Perceiving Multi-Domain Character Distance for… · code	96.57
17	ABINet-LV	Fang et al.	Mar 2021	Read Like Humans: Autonomous, Bidirectional and Iterativ…	96.40
18	DPAN	—	Aug 2021	papers-with-code · code	96.20
19	TrOCR-large 558M	—	Sep 2021	TrOCR: Transformer-based Optical Character Recognition w…	94.10
20	TrOCR-base 334M	—	Sep 2021	TrOCR: Transformer-based Optical Character Recognition w…	93.40
21	CRNN	—	Jul 2015	An End-to-End Trainable Neural Network for Image-based S…	78.20

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
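The ordering rule described above (score first, submission date as tie-breaker) can be sketched in a few lines. This is only an illustration, not Codesota's actual implementation: the tuple layout and the `rank` helper are hypothetical, and the sketch assumes that the earlier submission wins a tie, which is what rows 01 and 02 of the table suggest. The four sample rows are copied from the table, with dates written as `YYYY-MM`.

```python
# Hypothetical row layout: (model, submitted as "YYYY-MM", accuracy).
# Values are copied from the leaderboard table above.
rows = [
    ("DTrOCR 105M", "2023-08", 99.60),
    ("CLIP4STR-B (DataComp-1B)", "2023-05", 99.50),
    ("CLIP4STR-L (DataComp-1B)", "2023-05", 99.60),
    ("CPPD", "2023-07", 99.30),
]


def rank(rows):
    """Sort by accuracy (higher first), then by submission date (earlier first).

    The earlier-first tie-break is an assumption inferred from the table.
    """
    return sorted(rows, key=lambda r: (-r[2], r[1]))


ranked = rank(rows)
sota = ranked[0]  # the top row after sorting

for position, (model, submitted, accuracy) in enumerate(ranked, start=1):
    print(f"{position:02d}\t{model}\t{submitted}\t{accuracy:.2f}")
```

Because `YYYY-MM` strings sort chronologically, no date parsing is needed for the tie-break in this sketch.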

§ 03 · Progress

7 steps of state of the art.

Each row below marks a model that broke the previous record on accuracy. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores win. Each subsequent entry improved upon the previous best.

SOTA line · accuracy

Date	Model	Org	accuracy
Jul 21, 2015	CRNN	—	78.20
Mar 6, 2021	ABINet-LV	Fang et al.	96.40
Nov 22, 2021	CDistNet (Ours)	—	96.57
Nov 30, 2021	MATRN	Research	96.60
Dec 24, 2021	S-GTR	—	97.50
Jul 14, 2022	PARSeq	Research	99.00
May 23, 2023	CLIP4STR-L (DataComp-1B)	—	99.60

Fig 3 · SOTA-setting models only. 7 entries span Jul 2015 → May 2023.
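One way to derive such a progress list from dated results is a running-maximum pass: walk the entries in date order and keep only those that strictly beat the best score seen so far. The sketch below is a hedged illustration rather than the site's code. The `record_setters` helper and the tuple layout are hypothetical; the sample is a subset of the leaderboard, using full dates where this page lists them and month-only strings otherwise. It relies on these particular strings sorting chronologically and would need real date parsing in general.

```python
def record_setters(entries):
    """Return the entries that strictly beat every earlier score, in date order."""
    best = float("-inf")
    steps = []
    for date, model, score in sorted(entries, key=lambda e: e[0]):
        if score > best:  # a tie does not count as a new record
            best = score
            steps.append((date, model, score))
    return steps


# Hypothetical layout: (date string, model, accuracy); values copied from this page.
entries = [
    ("2023-08", "DTrOCR 105M", 99.60),
    ("2015-07-21", "CRNN", 78.20),
    ("2021-03-06", "ABINet-LV", 96.40),
    ("2021-08", "DPAN", 96.20),
    ("2021-09", "TrOCR-large 558M", 94.10),
    ("2021-11-22", "CDistNet (Ours)", 96.57),
    ("2021-11-30", "MATRN", 96.60),
    ("2021-12-24", "S-GTR", 97.50),
    ("2022-03", "SIGA_S", 96.90),
    ("2022-07-14", "PARSeq", 99.00),
    ("2022-09", "MGP-STR", 98.80),
    ("2023-05-23", "CLIP4STR-L (DataComp-1B)", 99.60),
]

steps = record_setters(entries)
for date, model, score in steps:
    print(f"{date}\t{model}\t{score:.2f}")
```

Under the strict comparison, DTrOCR 105M does not appear as a step, because it equals the earlier 99.60 without exceeding it.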

§ 04 · Literature

14 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

DTrOCR: Decoder-only Transformer for Optical Character Recognition
Aug 2023 · DTrOCR 105M
arXiv · Code

Context Perception Parallel Decoder for Scene Text Recognition
Jul 2023 · CPPD
arXiv · Code

DiffusionSTR: Diffusion Model for Scene Text Recognition
Jun 2023 · DiffusionSTR
arXiv

CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
May 2023 · CLIP4STR-L (DataComp-1B), CLIP4STR-B (DataComp-1B), CLIP4STR-L +1
arXiv · Code

Self-supervised Character-to-Character Distillation for Text Recognition
Nov 2022 · CCD-ViT-Small(ARD_2.8M), CCD-ViT-Base(ARD_2.8M), CCD-ViT-Tiny(ARD_2.8M)
arXiv · Code

Multi-Granularity Prediction for Scene Text Recognition
Sep 2022 · MGP-STR
arXiv · Code

Scene Text Recognition with Permuted Autoregressive Sequence Models
Jul 2022 · PARSeq
arXiv

Self-supervised Implicit Glyph Attention for Text Recognition
Mar 2022 · SIGA_S
arXiv · Code

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition
Dec 2021 · S-GTR
arXiv · Code

Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
Nov 2021 · MATRN
arXiv · Code

CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition
Nov 2021 · CDistNet (Ours)
arXiv · Code

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Sep 2021 · TrOCR-large 558M, TrOCR-base 334M
arXiv

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
Mar 2021 · ABINet-LV
arXiv

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
Jul 2015 · CRNN
arXiv

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs

01 A public checkpoint or API endpoint
02 A reproduction script with frozen commit + seed
03 Declared evaluation environment (Python, deps)
04 One row per metric declared by this dataset
05 A contact so we can follow up on discrepancies
