Codesota · OCR · Benchmarks directory
7 of 25 benchmarks carry live leaderboard data · Updated 2026-04-20
§ 00 · Directory
The benchmarks we trust.
25 OCR benchmarks tracked across document, video, handwriting, scene text and multilingual tasks. The benchmarks a model clears — not the ones it claims to clear — are how we rank it.
Benchmarks with 1008 live data points link through to full leaderboards. Entries still being collected are listed but marked pending.
§ 01 · Document OCR
Document, 8 benchmarks.
| Benchmark | Organisation | Current leader | Results | |
|---|---|---|---|---|
| OmniDocBench | Shanghai AI Laboratory | PaddleOCR-VL | 47 | Leaderboard → |
| olmOCR-Bench | Allen Institute for AI | Chandra OCR 0.1.0 | 55 | Leaderboard → |
| OCRBench v2 | South China University of Technology | Seed1.6-vision | 74 | Leaderboard → |
| CC-OCR | South China University of Technology | Gemini-1.5-Pro | 12 | Leaderboard → |
| OHRBench | Shanghai AI Laboratory | Ground Truth | — | pending |
| LLMDoc | Alibaba Group | mPLUG-DocOwl | — | pending |
| Misraj-DocOCR | Misraj | Baseer | — | pending |
| EPHOIE | South China University of Technology | VIES | — | pending |
§ 02 · Video OCR
Video, 3 benchmarks.
| Benchmark | Organisation | Current leader | Results | |
|---|---|---|---|---|
| VTPBench | South China University of Technology | TextCtrl | — | pending |
| VideoDB OCR | VideoDB | GPT-4o | — | pending |
| MME-VideoOCR | NTU | Gemini-2.5 Pro | 6 | Leaderboard → |
§ 03 · Handwriting OCR
Handwriting, 3 benchmarks.
| Benchmark | Organisation | Current leader | Results | |
|---|---|---|---|---|
| UIT-HWDB | University of Information Technology | TransformerOCR | — | pending |
| Chinese Text Recognition | Fudan University | TransOCR | — | pending |
| Digital Peter | Moscow Institute of Physics | GRCNN | — | pending |
§ 04 · Scene Text OCR
Scene Text, 5 benchmarks.
| Benchmark | Organisation | Current leader | Results | |
|---|---|---|---|---|
| DROBS | NVIDIA | ECLAIR-MIP | — | pending |
| CORU | University of Innsbruck | GPT-4o | — | pending |
| Union14M | South China University of Technology | MAERec-B | — | pending |
| TextZoom | Nanjing University | TSRN | — | pending |
| ICDAR 2015 | Megvii Inc. | Stradvision | — | pending |
§ 05 · Multilingual OCR
Multilingual, 6 benchmarks.
| Benchmark | Organisation | Current leader | Results | |
|---|---|---|---|---|
| KITAB-Bench | MBZUAI | Gemini-2.0-Flash | 8 | Leaderboard → |
| ThaiOCRBench | SCB 10X | Claude Sonnet 4 | 5 | Leaderboard → |
| KRETA | Seoul National University | Gemini-2.0-Flash | — | pending |
| KOCRBench | KL-Net | Gemini 2.5 Flash | — | pending |
| MOTBench | Shanghai Jiao Tong University | Gemini-2.0-Flash | — | pending |
| JaPOC | Fast Accounting Co., Ltd. | T5 (Retrieva) | — | pending |
§ Final · Methodology
Which benchmarks make the cut.
A benchmark enters the directory once it has a public test split, a stable evaluation harness and at least one third-party reproduction. Where a benchmark is known to be saturated or likely contaminated, we flag it on the detail page rather than remove it — the score is still signal, just weaker signal. See the methodology for the admission rules.
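The admission rule above can be sketched as a small check. This is an illustrative sketch only, not the directory's actual tooling: the `Benchmark` record, its field names, and both helper functions are hypothetical, chosen to mirror the three criteria and the flag-rather-than-remove policy described in the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    # Hypothetical record; the fields mirror the criteria named in the text.
    name: str
    has_public_test_split: bool
    has_stable_harness: bool
    third_party_reproductions: int
    saturated: bool = False
    likely_contaminated: bool = False


def is_admitted(b: Benchmark) -> bool:
    """All three criteria must hold: public test split, stable evaluation
    harness, and at least one third-party reproduction."""
    return (
        b.has_public_test_split
        and b.has_stable_harness
        and b.third_party_reproductions >= 1
    )


def detail_page_flags(b: Benchmark) -> list[str]:
    """Saturation and contamination are flagged, never grounds for removal."""
    flags = []
    if b.saturated:
        flags.append("saturated")
    if b.likely_contaminated:
        flags.append("likely contaminated")
    return flags


# Made-up entries for illustration; they do not describe any listed benchmark.
admitted_but_flagged = Benchmark("ExampleBench-A", True, True, 2, saturated=True)
not_yet_reproduced = Benchmark("ExampleBench-B", True, True, 0)
```

Keeping the flags separate from the admission check reflects the stated policy: a saturated or contaminated benchmark stays listed, and the flags only change how its scores should be read.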