Who leads the OCRBench benchmark?

qwen3-5-397b-a17b currently leads OCRBench with a score of 931.

What is the state-of-the-art score on OCRBench?

The state-of-the-art result on OCRBench is 931 (score), achieved by qwen3-5-397b-a17b as of 2026.

How many models are tracked on OCRBench?

Codesota tracks 30 models on OCRBench.

When was the OCRBench leaderboard last updated?

The OCRBench leaderboard on Codesota includes results through 2026.



OCRBench.

Name: OCRBench Benchmark Results
Creator: Unknown
Published: 2026-01-01
License: https://creativecommons.org/licenses/by/4.0/

Composite OCR capability benchmark for multimodal models. CodeSOTA stores Score on the 0-1000 scale; when a source reports the score as a percentage (0-100), the value is multiplied by 10 before it is stored (for example, 86.0 is stored as 860).
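The rescaling described here can be sketched in a few lines. This is a minimal illustration, not CodeSOTA's actual import code: the function name `to_ocrbench_scale` and the `source_max` parameter are hypothetical, and the sample values are taken from the row notes in the leaderboard (86.20 stored as 862, 92.3 as 923, 79.8 as 798).

```python
def to_ocrbench_scale(value: float, source_max: float = 100.0) -> float:
    """Rescale a reported score onto the 0-1000 OCRBench convention.

    `source_max` is the top of the scale the source used (hypothetical
    parameter): 100 for percentage-style reports, 1000 for native scores.
    """
    if source_max <= 0:
        raise ValueError("source_max must be positive")
    if not 0 <= value <= source_max:
        raise ValueError(f"score {value} is outside 0-{source_max}")
    # Round to one decimal to hide floating-point noise from the multiply.
    return round(value * 1000.0 / source_max, 1)


# Percentage-style reports are multiplied by 10.
print(to_ocrbench_scale(86.20))        # Infinity-Parser2-Pro row
print(to_ocrbench_scale(92.3))         # Kimi K2.5 row
# Native 0-1000 reports pass through unchanged.
print(to_ocrbench_scale(931, 1000.0))  # qwen3-5-397b-a17b row
```

Passing the source scale explicitly, rather than guessing it from the magnitude of the value, avoids misreading a low native score (say, 95 out of 1000) as a percentage.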


§ 01 · Leaderboard

Results by metric.


score

Score is the reported evaluation metric for OCRBench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for score: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year	Notes
01	qwen3-5-397b-a17b	verified	931	2026	Official Qwen3.5 blog (https://qwen.ai/blog?id=qwen3.5). Vision table row OCRBench; linked to OCR task using existing Score metric; PWC evaluation id 1060; paper: Qwen3.5: Towards Native Multimodal Agents
02	Kimi K2.5	verified	923	2026	OCRBench overall score on the native 0-1000 scale (card reports 92.3 normalized); max 64k tokens; avg@3; Thinking mode; PWC evaluation id 1252; paper: Kimi K2.5: Visual Agentic Intelligence
03	qwen3-vl-235b-a22b-instruct	verified	920	2026	Table 2 of Qwen3-VL technical report (arXiv:2511.21631), OCRBench (rescaled from 87.5/92.0 to the 0-1000 scale used by the existing rows); PWC evaluation id 4749; paper: Qwen3-VL Technical Report
04	sensenova-u1-a3b-mot	verified	919	2026	PWC OCRBench Score normalized to the 0-1000 convention; Paper Table 3; SenseNova-U1-A3B-MoT Think mode on OCRBench; PWC evaluation id 5612; paper: SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
05	qwen3-5-omni-plus	verified	913	2026	PWC OCRBench Score normalized to the 0-1000 convention; Paper Table 6, Vision->Text, OCRBench document understanding benchmark; PWC evaluation id 5329; paper: Qwen3.5-Omni Technical Report
06	internvl3-78b	verified	906	2026	PWC evaluation id 768; paper: InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
07	qwen3-6-35b-a3b	verified	900	2026	PWC OCRBench Score normalized to the 0-1000 convention; OCRBench row from Qwen3.6 model-card Vision benchmark table; imported as configured OCRBench Score. Source: Qwen3.6-27B Hugging Face model card benchmark table (https://huggingface.co/Qwen/Qwen3.6-27B); PWC evaluation id 5562; paper: Qwen3.6
08	qwen3-vl-8b-instruct	verified	896	2026	Reported on OCRBench (raw 0-1000 score) by Qwen3-VL-8B-Instruct model card; PWC evaluation id 1279; paper: Qwen3-VL Technical Report
09	qwen3-6-27b	verified	894	2026	PWC OCRBench Score normalized to the 0-1000 convention; OCRBench row from Qwen3.6 model-card Vision benchmark table; imported as configured OCRBench Score. Source: Qwen3.6-27B Hugging Face model card benchmark table (https://huggingface.co/Qwen/Qwen3.6-27B); PWC evaluation id 5563; paper: Qwen3.6
10	Qwen2.5-VL 72B	verified	885	2026	Table 5, OCRBench. Source: Qwen2.5-VL Technical Report (arXiv:2502.13923). Model: Qwen2.5-VL-72B; PWC evaluation id 5023; paper: Qwen2.5-VL Technical Report
11	Qianfan-OCR	verified	880	2026	OCRBench standard Score (0-1000); 880; PWC evaluation id 1197; paper: Qianfan-OCR: A Unified End-to-End Model for Document Intelligence
12	ovis2-5-9b	verified	879	2026	PWC OCRBench Score normalized to the 0-1000 convention; Table 3, OpenCompass suite; OCRBench (OCR). Source/provenance: Ovis2.5 Technical Report; source arXiv paper https://arxiv.org/abs/2508.11737; official HF model URL https://huggingface.co/AIDC-AI/Ovis2.5-9B; PWC evaluation id 5583; paper: Ovis2.5 Technical Report
13	Qwen2-VL 72B	verified	877	2026	PWC evaluation id 143; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
14	minicpm-o-4-5-instruct	verified	876	2026	Instruct mode from the openbmb/MiniCPM-o-4_5 Hugging Face model card (https://huggingface.co/openbmb/MiniCPM-o-4_5); 9B params; results reported in instruct mode/variant; from the 'Image Understanding (Instruct)' table; metric label in card: OCRBench; 0-1000 scale used by other rows on this leaderboard; PWC evaluation id 1171; paper: MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction
15	qwen3-vl-235b-a22b-thinking	verified	875	2026	Table 2 of Qwen3-VL technical report (arXiv:2511.21631), OCRBench (rescaled from 87.5/92.0 to the 0-1000 scale used by the existing rows); PWC evaluation id 4748; paper: Qwen3-VL Technical Report
16	kimi-vl-a3b-thinking-2506	verified	869	2026	Kimi-VL-A3B-Thinking-2506 on OCRBench overall score (raw 0-1000 scale) from the moonshotai/Kimi-VL-A3B-Thinking-2506 HF model card; PWC evaluation id 3371; paper: Kimi-VL Technical Report
17	kimi-vl-a3b-instruct	verified	867	2026	Kimi-VL-A3B-Instruct on OCRBench overall score (raw 0-1000 scale) from Kimi-VL Technical Report Table 3 and the moonshotai/Kimi-VL-A3B-Instruct HF model card; PWC evaluation id 3351; paper: Kimi-VL Technical Report
18	qwen2-vl-7b	verified	866	2026	PWC evaluation id 144; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
19	minimax-vl-01	verified	865	2026	PWC evaluation id 886; paper: MiniMax-01: Scaling Foundation Models with Lightning Attention
20	Qwen2.5-VL-7B	verified	864	2026	Paper table; source label OCRBench; metric reported as Score. Imported while expanding from ScreenSpot-Pro source papers; PWC evaluation id 5377; paper: Qwen2.5-VL Technical Report
21	infinity-parser2-pro	verified	862	2026	OCRBench full benchmark score from the Infinity-Parser2-Pro Hugging Face card / GitHub performance table. Source reports 86.20 on a 0-100 scale; stored as 862.0 on the 0-1000 OCRBench Score convention used by existing rows; PWC evaluation id 4966; paper: Infinity-Parser2-Pro
22	dots.mocr	verified	860	2026	OCRBench overall score from the dots.mocr Hugging Face model card (section 3, General Vision Tasks). The card reports 86.0 on a 0-100 scale; converted to 860 on the standard 0-1000 OCRBench scale used by other rows on this leaderboard (consistent with Qwen3-VL-2B = 85.8 -> 858 in the same table); PWC evaluation id 1153; paper: Multimodal OCR: Parse Anything from Documents
23	hunyuanocr-1b	verified	860	2026	PWC evaluation id 957; paper: HunyuanOCR Technical Report
24	minicpm-v-4-6-thinking-16x	verified	831	2026	Thinking mode from the MiniCPM-V 4.6 Hugging Face model card; official checkpoint; visual token compression ratio 16x; metric label in card: OCRBench; PWC evaluation id 1115; paper: A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
25	videollama3-7b	verified	828	2026	DAMO-NLP-SG/VideoLLaMA3-7B-Image checkpoint; numbers from the 7B-Image model card main-results table; PWC evaluation id 1214; paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
26	qwen2-vl-2b	verified	809	2026	PWC evaluation id 145; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
27	zaya1-vl-8b	verified	798	2026	OCRBench overall score (0-1000 scale; card reports 79.8 normalised). Reported in the ZAYA1-VL-8B technical report (Zyphra). Evaluated on the Zyphra eval harness based on VLMEvalKit; PWC evaluation id 1230; paper: ZAYA1-VL-8B Technical Report
28	qwen2-5-vl-3b	verified	797	2026	Paper table; source label OCRBench; metric reported as Score. Imported while expanding from ScreenSpot-Pro source papers; PWC evaluation id 5378; paper: Qwen2.5-VL Technical Report
29	videollama3-2b	verified	779	2026	PWC evaluation id 115; paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
30	minicpm-llama3-v-2-5	verified	725	2026	Paper Table 5 OCR benchmark result for MiniCPM-Llama3-V 2.5; source reports OCRBench score; PWC evaluation id 5183; paper: MiniCPM-V: A GPT-4V Level MLLM on Your Phone
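
Because higher is better, the ordering in the table amounts to a descending sort on Score. The sketch below reproduces it for a handful of rows copied from the table. The `rank_rows` helper is hypothetical, and the tie handling (tied scores keep their input order and get consecutive ranks, as rows 22 and 23 do) is an assumption about how the page orders ties, not documented behaviour.

```python
# A few (model, score) pairs copied from the leaderboard above.
rows = [
    ("dots.mocr", 860),
    ("Kimi K2.5", 923),
    ("hunyuanocr-1b", 860),
    ("qwen3-5-397b-a17b", 931),
    ("qwen3-vl-235b-a22b-instruct", 920),
]


def rank_rows(rows):
    """Return (rank, model, score) tuples, highest score first.

    Python's sort is stable, also with reverse=True, so tied scores
    keep the order in which they appear in the input.
    """
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(i, model, score) for i, (model, score) in enumerate(ordered, start=1)]


ranked = rank_rows(rows)
for rank, model, score in ranked:
    print(f"{rank:02d}\t{model}\t{score}")

# The leader is simply the first ranked row.
leader = ranked[0]
```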

