OCRBench v2

South China University of Technology

Tests 8 core OCR capabilities across 23 tasks. Evaluates large multimodal models (LMMs) on capabilities such as text recognition, referring, and extraction.

Benchmark Stats

Models: 27
Papers: 32
Metrics: 2


Overall (English)

Average score on English private test set

Higher is better

| Rank | Model | Code | Score | Paper / Source | Notes |
|---|---|---|---|---|---|
| 1 | seed-1.6-vision | - | 62.2 | AlphaXiv | English, Private split. #1 on OCRBench v2 |
| 2 | qwen3-omni-30b | - | 61.3 | AlphaXiv | |
| 3 | nemotron-nano-v2-vl | - | 61.2 | AlphaXiv | |
| 4 | gemini-25-pro | - | 59.3 | AlphaXiv | |
| 5 | llama-3.1-nemotron-nano-vl-8b | - | 56.4 | ocrbench-v2-leaderboard | |
| 6 | gpt-4o | - | 55.5 | AlphaXiv | Listed as GPT5-2025-08-07 on leaderboard |
| 7 | ovis2.5-8b | - | 54.1 | ocrbench-v2-leaderboard | |
| 8 | gemini-1.5-pro | - | 51.6 | ocrbench-v2-leaderboard | |
| 9 | sail-vl2-8b | - | 49.3 | ocrbench-v2-leaderboard | |
| 10 | minicpm-v-4.5-8b | - | 48.4 | ocrbench-v2-leaderboard | |
| 11 | gpt-4o-2024 | - | 47.6 | ocrbench-v2-leaderboard | GPT-4o baseline (not GPT5-2025-08-07) |
| 12 | claude-3.5-sonnet | - | 47.5 | ocrbench-v2-leaderboard | |
| 13 | internvl3.5-14b | - | 47.1 | ocrbench-v2-leaderboard | |
| 14 | step-1v | - | 46.8 | ocrbench-v2-leaderboard | |
| 15 | grok4 | - | 45 | ocrbench-v2-leaderboard | |
| 16 | gpt-4o-mini | - | 44.1 | ocrbench-v2-leaderboard | |
| 17 | claude-sonnet-4 | - | 42.4 | ocrbench-v2-leaderboard | Claude-sonnet-4-20250514 |
| 18 | qwen2.5-vl-7b | - | 41.8 | ocrbench-v2-leaderboard | |
| 19 | deepseek-vl2-small | - | 41 | ocrbench-v2-leaderboard | |
| 20 | pixtral-12b | - | 38.4 | ocrbench-v2-leaderboard | |
| 21 | phi-4-multimodal | - | 38.1 | ocrbench-v2-leaderboard | |
| 22 | glm-4v-9b | - | 37.1 | ocrbench-v2-leaderboard | |
| 23 | molmo-7b | - | 33.9 | ocrbench-v2-leaderboard | |
| 24 | llava-ov-7b | - | 33.7 | ocrbench-v2-leaderboard | |
| 25 | idefics3-8b | - | 26 | ocrbench-v2-leaderboard | |
| 26 | mistral-ocr-2512 | - | 25.2 | codesota-verified | Verified via CodeSOTA benchmark. 7,400 English samples. Mistral OCR is a pure OCR model (text extraction only), not designed for VQA, chart parsing, or structured extraction tasks. Strong on full-page OCR (79.1%) and document parsing (55.2%). |
| 27 | docowl2 | - | 23.4 | ocrbench-v2-leaderboard | |

Overall (Chinese)

Average score on Chinese private test set

Higher is better

| Rank | Model | Code | Score | Paper / Source | Notes |
|---|---|---|---|---|---|
| 1 | gemini-25-pro | - | 62.2 | AlphaXiv | Chinese, Private split. #1 on Chinese |
| 2 | minicpm-v-4.5-8b | - | 58.8 | ocrbench-v2-leaderboard | Chinese, Private split. #4 overall |
| 3 | sail-vl2-8b | - | 57.6 | ocrbench-v2-leaderboard | |
| 4 | claude-3.5-sonnet | - | 48.4 | ocrbench-v2-leaderboard | |
| 5 | gpt-4o-2024 | - | 45.7 | ocrbench-v2-leaderboard | |
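
Both leaderboards order models by a single "higher is better" average score. As a minimal sketch of that ordering, the Python snippet below re-derives the Chinese ranking from the five scores listed above. The `Entry` type and the `rank_entries` helper are hypothetical names introduced only for illustration, and the tie-handling (ties keep their input order) is an assumption, since the page does not state a tie-breaking rule.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    score: float  # average score on the private test set; higher is better


# Scores copied from the Chinese leaderboard above, deliberately shuffled.
chinese_entries = [
    Entry("claude-3.5-sonnet", 48.4),
    Entry("gemini-25-pro", 62.2),
    Entry("gpt-4o-2024", 45.7),
    Entry("sail-vl2-8b", 57.6),
    Entry("minicpm-v-4.5-8b", 58.8),
]


def rank_entries(entries):
    """Return (rank, model, score) tuples, best score first.

    Assumption: ties keep their input order (Python's sort is stable);
    the leaderboard's own tie rule is not documented on this page.
    """
    ordered = sorted(entries, key=lambda e: e.score, reverse=True)
    return [(rank, e.model, e.score) for rank, e in enumerate(ordered, start=1)]


ranked = rank_entries(chinese_entries)
for rank, model, score in ranked:
    print(f"{rank:>2}  {model:<20} {score:.1f}")
```

The same helper applies unchanged to the English table, because both use one scalar score per model.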