General OCR Capabilities · benchmark dataset · 2024 · MULTILINGUAL

OCRBench v2.

Tests 8 core OCR capabilities across 23 tasks. Evaluates large multimodal models (LMMs) on capabilities including text recognition, referring, and extraction.

§ 01 · Leaderboard

Best published scores.

32 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.


Primary metric: overall-en-private · higher is better
All metrics: overall-en-private, overall-zh-private
overall-en-private · primary · 27 rows

| # | Model | Org | Submitted | Paper / code | overall-en-private |
|---|---|---|---|---|---|
| 01 | Seed1.6-vision (API) | ByteDance | Jun 2025 | alphaxiv-leaderboard | 62.20 |
| 02 | Qwen3-Omni-30B (OSS) | Alibaba | Apr 2025 | alphaxiv-leaderboard | 61.30 |
| 03 | Nemotron Nano V2 VL (OSS) | NVIDIA | Mar 2025 | alphaxiv-leaderboard | 61.20 |
| 04 | Gemini 2.5 Pro (API) | Google | Mar 2025 | alphaxiv-leaderboard | 59.30 |
| 05 | llama-3.1-nemotron-nano-vl-8b | | Mar 2025 | ocrbench-v2-leaderboard | 56.40 |
| 06 | GPT-4o (API) | OpenAI | May 2024 | alphaxiv-leaderboard | 55.50 |
| 07 | ovis2.5-8b | | Feb 2025 | ocrbench-v2-leaderboard | 54.10 |
| 08 | gemini-1.5-pro | | May 2024 | ocrbench-v2-leaderboard | 51.60 |
| 09 | sail-vl2-8b | | Mar 2025 | ocrbench-v2-leaderboard | 49.30 |
| 10 | minicpm-v-4.5-8b | | May 2025 | ocrbench-v2-leaderboard | 48.40 |
| 11 | gpt-4o-2024 | | May 2024 | ocrbench-v2-leaderboard | 47.60 |
| 12 | claude-3.5-sonnet | | Jun 2024 | ocrbench-v2-leaderboard | 47.50 |
| 13 | internvl3.5-14b | | Jun 2025 | ocrbench-v2-leaderboard | 47.10 |
| 14 | step-1v | | Dec 2024 | ocrbench-v2-leaderboard | 46.80 |
| 15 | grok4 | | Jul 2025 | ocrbench-v2-leaderboard | 45.00 |
| 16 | GPT-4o mini | OpenAI | Jul 2024 | ocrbench-v2-leaderboard | 44.10 |
| 17 | Claude Sonnet 4 (API) | Anthropic | May 2025 | ocrbench-v2-leaderboard | 42.40 |
| 18 | qwen2.5-vl-7b | | Jan 2025 | ocrbench-v2-leaderboard | 41.80 |
| 19 | deepseek-vl2-small | | Dec 2024 | ocrbench-v2-leaderboard | 41.00 |
| 20 | pixtral-12b | | Sep 2024 | ocrbench-v2-leaderboard | 38.40 |
| 21 | phi-4-multimodal | | Feb 2025 | ocrbench-v2-leaderboard | 38.10 |
| 22 | glm-4v-9b | | Jun 2024 | ocrbench-v2-leaderboard | 37.10 |
| 23 | molmo-7b | | Sep 2024 | ocrbench-v2-leaderboard | 33.90 |
| 24 | llava-ov-7b | | Oct 2024 | ocrbench-v2-leaderboard | 33.70 |
| 25 | idefics3-8b | | Aug 2024 | ocrbench-v2-leaderboard | 26.00 |
| 26 | mistral-ocr-2512 | | Dec 2024 | codesota-verified | 25.20 |
| 27 | docowl2 | | May 2024 | ocrbench-v2-leaderboard | 23.40 |

overall-zh-private · 5 rows

| # | Model | Org | Submitted | Paper / code | overall-zh-private |
|---|---|---|---|---|---|
| 01 | Gemini 2.5 Pro (API) | Google | Mar 2025 | alphaxiv-leaderboard | 62.20 |
| 02 | minicpm-v-4.5-8b | | May 2025 | ocrbench-v2-leaderboard | 58.80 |
| 03 | sail-vl2-8b | | Mar 2025 | ocrbench-v2-leaderboard | 57.60 |
| 04 | claude-3.5-sonnet | | Jun 2024 | ocrbench-v2-leaderboard | 48.40 |
| 05 | gpt-4o-2024 | | May 2024 | ocrbench-v2-leaderboard | 45.70 |

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
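
The ordering rule (score first, submission date as the tie-breaker) can be sketched in a few lines of Python. This is an illustration, not Codesota's actual code: the direction of the tie-break (earlier submission ranks higher) is an assumption, the `rank` helper is hypothetical, and because only month and year are published, the 1st of each month stands in as a placeholder day.

```python
from datetime import date

# (model, score, submitted) for the overall-zh-private metric.
# Day-of-month is not published, so the 1st is used as a placeholder.
rows = [
    ("claude-3.5-sonnet", 48.40, date(2024, 6, 1)),
    ("Gemini 2.5 Pro", 62.20, date(2025, 3, 1)),
    ("gpt-4o-2024", 45.70, date(2024, 5, 1)),
    ("minicpm-v-4.5-8b", 58.80, date(2025, 5, 1)),
    ("sail-vl2-8b", 57.60, date(2025, 3, 1)),
]


def rank(rows):
    """Sort by score (descending), then by submission date (ascending).

    The ascending date order for ties is an assumption; the page only
    says that ties are broken by submission date.
    """
    return sorted(rows, key=lambda r: (-r[1], r[2]))


ranked = rank(rows)
for position, (model, score, submitted) in enumerate(ranked, start=1):
    print(f"{position:02d}  {model:<20} {submitted:%b %Y}  {score:.2f}")
```

The first element of `ranked` is the row that would be shaded as SOTA for that metric.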
§ 03 · Progress

4 steps
of state of the art.

Each row below marks a model that broke the previous record on overall-en-private. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores are better, so each entry strictly exceeds the one before it.

SOTA line · overall-en-private
  1. May 13, 2024 · GPT-4o · OpenAI · 55.50
  2. Mar 18, 2025 · Nemotron Nano V2 VL · NVIDIA · 61.20
  3. Apr 29, 2025 · Qwen3-Omni-30B · Alibaba · 61.30
  4. Jun 15, 2025 · Seed1.6-vision · ByteDance · 62.20
Fig 3 · SOTA-setting models only. 4 entries span May 2024 to Jun 2025.
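
A SOTA line of this kind is a running maximum over results ordered by date. The sketch below shows the idea on a subset of the overall-en-private rows; the `sota_steps` helper is hypothetical and not Codesota's implementation. Exact days are taken from the list above for the four record-setting models, while the other two rows use the 1st of the month as a placeholder, so the outcome depends on that assumption.

```python
from datetime import date

# (model, submitted, score) for a subset of overall-en-private results.
# Exact days are known only for the four SOTA entries; the remaining rows
# use the 1st of the month as a placeholder.
results = [
    ("GPT-4o", date(2024, 5, 13), 55.50),
    ("claude-3.5-sonnet", date(2024, 6, 1), 47.50),
    ("Nemotron Nano V2 VL", date(2025, 3, 18), 61.20),
    ("Qwen3-Omni-30B", date(2025, 4, 29), 61.30),
    ("minicpm-v-4.5-8b", date(2025, 5, 1), 48.40),
    ("Seed1.6-vision", date(2025, 6, 15), 62.20),
]


def sota_steps(results):
    """Return the entries that strictly beat every earlier score."""
    steps = []
    best = float("-inf")
    # Walk the results in chronological order, tracking the best score so far.
    for model, submitted, score in sorted(results, key=lambda r: r[1]):
        if score > best:
            steps.append((submitted, model, score))
            best = score
    return steps


for submitted, model, score in sota_steps(results):
    print(f"{submitted:%b %d, %Y}  {model:<22} {score:.2f}")
```

Rows that never exceed the running best, such as claude-3.5-sonnet here, stay in the leaderboard but do not appear as steps.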
§ 06 · Contribute

Have a score that beats
this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with frozen commit + seed
  • 03 · Declared evaluation environment (Python, deps)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
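
Before submitting, the five requirements above can be checked locally with a small pre-flight script. The sketch below is purely illustrative: the `Submission` class, its field names, and the `missing_items` helper are hypothetical and do not reflect an official Codesota submission format. It also assumes that "one row per metric" means a score for both overall-en-private and overall-zh-private.

```python
from dataclasses import dataclass, field
from typing import Optional

# Metrics declared by this dataset, as listed on the leaderboard.
METRICS = ("overall-en-private", "overall-zh-private")


@dataclass
class Submission:
    # All field names are hypothetical, chosen to mirror items 01 to 05.
    model_source: str = ""                 # 01: checkpoint URL or API endpoint
    repro_commit: str = ""                 # 02: frozen commit of the script
    seed: Optional[int] = None             # 02: random seed
    environment: dict = field(default_factory=dict)  # 03: Python version, deps
    scores: dict = field(default_factory=dict)       # 04: metric name -> score
    contact: str = ""                      # 05: follow-up contact


def missing_items(sub: Submission) -> list:
    """Return a human-readable list of unmet requirements."""
    problems = []
    if not sub.model_source:
        problems.append("01: no public checkpoint or API endpoint")
    if not sub.repro_commit or sub.seed is None:
        problems.append("02: reproduction script needs a frozen commit and seed")
    if not sub.environment:
        problems.append("03: evaluation environment not declared")
    for metric in METRICS:
        if metric not in sub.scores:
            problems.append(f"04: no row for metric {metric}")
    if not sub.contact:
        problems.append("05: no contact given")
    return problems


draft = Submission(
    model_source="https://example.com/my-checkpoint",  # placeholder URL
    repro_commit="abc1234",                            # placeholder commit
    seed=0,
    environment={"python": "3.11", "deps": ["torch"]},
    scores={"overall-en-private": 50.0},               # made-up score
    contact="me@example.com",
)
print(missing_items(draft))
```

An empty list means every item is covered; the draft above is flagged only for the missing overall-zh-private row.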