Tests 8 core OCR capabilities across 23 tasks, evaluating LMMs on text recognition, text referring, and information extraction.
32 results are indexed across 2 metrics. The shaded row marks the current SOTA; ties are broken by submission date.
| # | Model | Org | Submitted | Paper / code | overall-zh-private |
|---|---|---|---|---|---|
| 01 | Gemini 2.5 Pro (API) | — | Mar 2025 | alphaxiv-leaderboard | 62.20 |
| 02 | minicpm-v-4.5-8b | — | May 2025 | ocrbench-v2-leaderboard | 58.80 |
| 03 | sail-vl2-8b | — | Mar 2025 | ocrbench-v2-leaderboard | 57.60 |
| 04 | claude-3.5-sonnet | — | Jun 2024 | ocrbench-v2-leaderboard | 48.40 |
| 05 | gpt-4o-2024 | — | May 2024 | ocrbench-v2-leaderboard | 45.70 |
Each row below marks a model that broke the previous record on overall-en-private; higher scores win, so each subsequent entry improved on the previous best. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.
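The record-setting rule above amounts to a running-max filter over submissions in date order. The sketch below illustrates it with hypothetical models and scores (not entries from this leaderboard); the function name `sota_progression` is ours, not part of any submission tooling.

```python
# Sketch: derive SOTA-setting entries from a full leaderboard by walking
# submissions in date order and keeping only rows that strictly beat the
# running best score. All entry data here is illustrative.
from datetime import date

entries = [
    ("model-a", date(2024, 5, 1), 45.70),
    ("model-b", date(2024, 6, 1), 48.40),
    ("model-c", date(2024, 12, 1), 47.00),  # weaker than the record; filtered out
    ("model-d", date(2025, 3, 1), 57.60),
]

def sota_progression(entries):
    best = float("-inf")
    record_setters = []
    # Sorting by date implements date tie-breaking: an earlier submission
    # claims the record, and a later one merely tying it does not.
    for model, submitted, score in sorted(entries, key=lambda e: e[1]):
        if score > best:  # strict improvement required to set a record
            best = score
            record_setters.append((model, score))
    return record_setters

print(sota_progression(entries))
# model-c never appears: it was submitted after a higher score was already set.
```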
Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.