OCRBench v2

South China University of Technology

Tests 8 core OCR capabilities across 23 tasks, evaluating large multimodal models (LMMs) on text recognition, referring, and extraction.

Benchmark Stats

Models: 28
Papers: 34
Metrics: 2

SOTA History

Overall (Chinese)

Average score on Chinese private test set

Higher is better

| Rank | Model | Source | Score | Year | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | gemini-25-pro | Editorial | 62.2 | 2025 | Chinese, Private split. #1 on Chinese |
| 2 | Qianfan-OCR | Editorial | 60.77 | 2026 | Baidu Qianfan-OCR 4B (Qwen3-4B + Qianfan-ViT), Apache 2.0, 192 langs. Layout-as-Thought. #1 on zh |
| 3 | minicpm-v-4.5-8b | Editorial | 58.8 | 2025 | Chinese, Private split. #4 overall |
| 4 | sail-vl2-8b | Editorial | 57.6 | 2025 | |
| 5 | claude-3.5-sonnet | Editorial | 48.4 | 2025 | |
| 6 | gpt-4o-2024 | Editorial | 45.7 | 2025 | |

Overall (English)

Average score on English private test set

Higher is better

| Rank | Model | Source | Score | Year | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | seed-1.6-vision | Editorial | 62.2 | 2025 | English, Private split. #1 on OCRBench v2 |
| 2 | qwen3-omni-30b | Editorial | 61.3 | 2025 | |
| 3 | nemotron-nano-v2-vl | Editorial | 61.2 | 2025 | |
| 4 | gemini-25-pro | Editorial | 59.3 | 2025 | |
| 5 | llama-3.1-nemotron-nano-vl-8b | Editorial | 56.4 | 2025 | |
| 6 | Qianfan-OCR | Editorial | 56 | 2026 | Baidu Qianfan-OCR 4B (Qwen3-4B + Qianfan-ViT), Apache 2.0, 192 langs. Layout-as-Thought. |
| 7 | gpt-4o | Editorial | 55.5 | 2025 | Listed as GPT5-2025-08-07 on leaderboard |
| 8 | ovis2.5-8b | Editorial | 54.1 | 2025 | |
| 9 | gemini-1.5-pro | Editorial | 51.6 | 2025 | |
| 10 | sail-vl2-8b | Editorial | 49.3 | 2025 | |
| 11 | minicpm-v-4.5-8b | Editorial | 48.4 | 2025 | |
| 12 | gpt-4o-2024 | Editorial | 47.6 | 2025 | GPT-4o baseline (not GPT5-2025-08-07) |
| 13 | claude-3.5-sonnet | Editorial | 47.5 | 2025 | |
| 14 | internvl3.5-14b | Editorial | 47.1 | 2025 | |
| 15 | step-1v | Editorial | 46.8 | 2025 | |
| 16 | grok4 | Editorial | 45 | 2025 | |
| 17 | gpt-4o-mini | Editorial | 44.1 | 2025 | |
| 18 | claude-sonnet-4 | Editorial | 42.4 | 2025 | Claude-sonnet-4-20250514 |
| 19 | qwen2.5-vl-7b | Editorial | 41.8 | 2025 | |
| 20 | deepseek-vl2-small | Editorial | 41 | 2025 | |
| 21 | pixtral-12b | Editorial | 38.4 | 2025 | |
| 22 | phi-4-multimodal | Editorial | 38.1 | 2025 | |
| 23 | glm-4v-9b | Editorial | 37.1 | 2025 | |
| 24 | molmo-7b | Editorial | 33.9 | 2025 | |
| 25 | llava-ov-7b | Editorial | 33.7 | 2025 | |
| 26 | idefics3-8b | Editorial | 26 | 2025 | |
| 27 | mistral-ocr-2512 | Editorial | 25.2 | 2025 | Verified via CodeSOTA benchmark. 7,400 English samples. Mistral OCR is a pure OCR model (text extraction only), not designed for VQA, chart parsing, or structured extraction tasks. Strong on full-page OCR (79.1%) and document parsing (55.2%). |
| 28 | docowl2 | Editorial | 23.4 | 2025 | |
