Optical Character Recognition

Extracting text from document images

Datasets: 114
Results: 17
Canonical metric: cer
Canonical Benchmark

KITAB-Bench

8,809 Arabic text samples across 9 domains. Tests Arabic script recognition.

Primary metric: cer
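For readers unfamiliar with the metric, cer (character error rate) is commonly defined as the character-level edit distance between the recognized text and the reference, divided by the reference length. It is an error rate, so lower values are better. The sketch below shows one way to compute it. It is a minimal illustration, not KITAB-Bench's own scoring code: the function names `edit_distance` and `cer` are hypothetical, and it compares raw Unicode code points without any normalization, which a given benchmark may handle differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference side
                curr[j - 1] + 1,     # insert h
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One dropped character out of a four-character reference.
    print(cer("كتاب", "كتب"))
```

Because the denominator is the reference length, a hypothesis with many spurious insertions can score above 1.0 under this definition.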

Top 10

Leading models on KITAB-Bench.

Rank  Model            cer    Year  Source
1     paddleocr        0.790  2025  paper
2     easyocr          0.580  2025  paper
3     tesseract        0.540  2025  paper
4     azure-ocr        0.520  2025  paper
5     gpt-4o-mini      0.430  2025  paper
6     gpt-4o           0.310  2025  paper
7     ain-7b           0.200  2025  paper
8     gemini-20-flash  0.130  2025  paper

All datasets

114 datasets tracked for this task.

KITAB-Bench (canonical) · 8 results · cer · Top: paddleocr 0.790
ThaiOCRBench · 5 results · ted-score · Top: claude-sonnet-4 0.840
CodeSOTA Verification · 1 result · Top: mistral-ocr-2512 1.22
Internal Mistral Benchmark · 1 result · Top: mistral-ocr-3 94.9
OCR CER Benchmark · 1 result · Top: mistral-ocr-3 3.70
OCR WER Benchmark · 1 result · Top: mistral-ocr-3 7.10

CodeSOTA Polish · 0 results · cer
CodeSearchNet · 0 results · accuracy
IMPACT-PSNC · 0 results · cer
PolEval 2021 OCR · 0 results · cer
SROIE · 0 results · f1
aapd · 0 results · accuracy
amazon · 0 results · accuracy
and-dataset · 0 results · accuracy
arxiv-hep-th-citation-graph · 0 results · accuracy
arxiv-summarization-dataset · 0 results · accuracy
australian · 0 results · accuracy
ba · 0 results · accuracy
bbc-xsum · 0 results · accuracy
bbcsport · 0 results · accuracy
bc8 · 0 results · accuracy
belfort · 0 results · accuracy
benchmarking-chinese-text-recognition:-datasets,-b · 0 results · accuracy
bentham · 0 results · accuracy
cedar-signature · 0 results · accuracy
cl-scisumm · 0 results · accuracy
classic · 0 results · accuracy
clueweb09-b · 0 results · accuracy
cnn-/-daily-mail · 0 results · accuracy
codesearchnet---go · 0 results · accuracy
codesearchnet---java · 0 results · accuracy
codesearchnet---javascript · 0 results · accuracy
codesearchnet---php · 0 results · accuracy
codesearchnet---python · 0 results · accuracy
codesearchnet---ruby · 0 results · accuracy
cub-200-2011 · 0 results · accuracy
dareczech · 0 results · accuracy
dart · 0 results · accuracy
digital-peter · 0 results · accuracy
dise-2021-dataset · 0 results · accuracy
docred-ie · 0 results · accuracy
dwie · 0 results · accuracy
e2e · 0 results · accuracy
ephoie · 0 results · accuracy
food-101 · 0 results · accuracy
fsns---test · 0 results · accuracy
hkr · 0 results · accuracy
hoc · 0 results · accuracy
howsumm-method · 0 results · accuracy
howsumm-step · 0 results · accuracy
hyperpartisan-news-detection · 0 results · accuracy
i2l-140k · 0 results · accuracy
iam(line-level) · 0 results · accuracy
iam-b · 0 results · accuracy
iam-d · 0 results · accuracy
icdar-2019 · 0 results · accuracy
icdar2013 · 0 results · accuracy
icdar2015 · 0 results · accuracy
im2latex-100k · 0 results · accuracy
imdb-m · 0 results · accuracy
inverse-text · 0 results · accuracy
iris · 0 results · accuracy
jaffe · 0 results · accuracy
lam(line-level) · 0 results · accuracy
lun · 0 results · accuracy
mldoc-zero-shot-english-to-chinese · 0 results · accuracy
mldoc-zero-shot-english-to-french · 0 results · accuracy
mldoc-zero-shot-english-to-german · 0 results · accuracy
mldoc-zero-shot-english-to-italian · 0 results · accuracy
mldoc-zero-shot-english-to-japanese · 0 results · accuracy
mldoc-zero-shot-english-to-russian · 0 results · accuracy
mldoc-zero-shot-english-to-spanish · 0 results · accuracy
mldoc-zero-shot-german-to-french · 0 results · accuracy
mpqa · 0 results · accuracy
pendigits · 0 results · accuracy
pixraw10p · 0 results · accuracy
re-docred · 0 results · accuracy
read-2016 · 0 results · accuracy
read2016(line-level) · 0 results · accuracy
recipe · 0 results · accuracy
reuters-21578 · 0 results · accuracy
reuters-de-en · 0 results · accuracy
reuters-en-de · 0 results · accuracy
reuters-rcv1/rcv2-english-to-german · 0 results · accuracy
reuters-rcv1/rcv2-german-to-english · 0 results · accuracy
rotowire · 0 results · accuracy
saint-gall · 0 results · accuracy
scene-text-recognition-benchmarks · 0 results · accuracy
scidocs-(mag) · 0 results · accuracy
scidocs-(mesh) · 0 results · accuracy
scut-ctw1500 · 0 results · accuracy
simara · 0 results · accuracy
stdw · 0 results · accuracy
sun-rgb-d · 0 results · accuracy
sut · 0 results · accuracy
tabfact · 0 results · accuracy
textseg · 0 results · accuracy
textzoom · 0 results · accuracy
tobacco-small-3482 · 0 results · accuracy
twitter · 0 results · accuracy
urdudoc · 0 results · accuracy
videodb's-ocr-benchmark-public-collection · 0 results · accuracy
warppie10p · 0 results · accuracy
webnlg-(all) · 0 results · accuracy
webnlg-(seen) · 0 results · accuracy
webnlg-(unseen) · 0 results · accuracy
wikibio · 0 results · accuracy
wikilingua-(tr->en) · 0 results · accuracy
wikipedia-person-and-animal-dataset · 0 results · accuracy
wine · 0 results · accuracy
wos-11967 · 0 results · accuracy
wos-46985 · 0 results · accuracy
wos-5736 · 0 results · accuracy
yelp-14 · 0 results · accuracy

