ocrbench.

OCR benchmark

§ 01 · score

score.

Higher is better. Scores are on the 0-1000 OCRBench scale.

#	Model	Notes	Score	Source
1	qwen3-5-397b-a17b	Official Qwen3.5 blog (https://qwen.ai/blog?id=qwen3.5). Vision table row OCRBench; linked to OCR task using existing Score metric; PWC evaluation id 1060; paper: Qwen3.5: Towards Native Multimodal Agents	931	paperswithcode-public-api
2	Kimi K2.5	OCRBench overall score on the native 0-1000 scale (card reports 92.3 normalized); max 64k tokens; avg@3; Thinking mode; PWC evaluation id 1252; paper: Kimi K2.5: Visual Agentic Intelligence	923	paperswithcode-public-api
3	qwen3-vl-235b-a22b-instruct	Table 2 of Qwen3-VL technical report (arXiv:2511.21631), OCRBench (rescaled from 87.5/92.0 to the 0-1000 scale used by the existing rows); PWC evaluation id 4749; paper: Qwen3-VL Technical Report	920	paperswithcode-public-api
4	sensenova-u1-a3b-mot	PWC OCRBench Score normalized to the 0-1000 convention; Paper Table 3; SenseNova-U1-A3B-MoT Think mode on OCRBench; PWC evaluation id 5612; paper: SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture	919	paperswithcode-public-api
5	qwen3-5-omni-plus	PWC OCRBench Score normalized to the 0-1000 convention; Paper Table 6, Vision->Text, OCRBench document understanding benchmark; PWC evaluation id 5329; paper: Qwen3.5-Omni Technical Report	913	paperswithcode-public-api
6	internvl3-78b	PWC evaluation id 768; paper: InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models	906	paperswithcode-public-api
7	qwen3-6-35b-a3b	PWC OCRBench Score normalized to the 0-1000 convention; OCRBench row from Qwen3.6 model-card Vision benchmark table; imported as configured OCRBench Score. Source: Qwen3.6-27B Hugging Face model card benchmark table (https://huggingface.co/Qwen/Qwen3.6-27B); PWC evaluation id 5562; paper: Qwen3.6	900	paperswithcode-public-api
8	qwen3-vl-8b-instruct	Reported on OCRBench (raw 0-1000 score) by Qwen3-VL-8B-Instruct model card; PWC evaluation id 1279; paper: Qwen3-VL Technical Report	896	paperswithcode-public-api
9	qwen3-6-27b	PWC OCRBench Score normalized to the 0-1000 convention; OCRBench row from Qwen3.6 model-card Vision benchmark table; imported as configured OCRBench Score. Source: Qwen3.6-27B Hugging Face model card benchmark table (https://huggingface.co/Qwen/Qwen3.6-27B); PWC evaluation id 5563; paper: Qwen3.6	894	paperswithcode-public-api
10	Qwen2.5-VL 72B	Table 5, OCRBench. Source: Qwen2.5-VL Technical Report (arXiv:2502.13923). Model: Qwen2.5-VL-72B; PWC evaluation id 5023; paper: Qwen2.5-VL Technical Report	885	paperswithcode-public-api
11	Qianfan-OCR	OCRBench standard Score (0-1000); 880; PWC evaluation id 1197; paper: Qianfan-OCR: A Unified End-to-End Model for Document Intelligence	880	paperswithcode-public-api
12	ovis2-5-9b	PWC OCRBench Score normalized to the 0-1000 convention; Table 3, OpenCompass suite; OCRBench (OCR). Source/provenance: Ovis2.5 Technical Report; source arXiv paper https://arxiv.org/abs/2508.11737; official HF model URL https://huggingface.co/AIDC-AI/Ovis2.5-9B; PWC evaluation id 5583; paper: Ovis2.5 Technical Report	879	paperswithcode-public-api
13	Qwen2-VL 72B	PWC evaluation id 143; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution	877	paperswithcode-public-api
14	minicpm-o-4-5-instruct	Instruct mode from the openbmb/MiniCPM-o-4_5 Hugging Face model card (https://huggingface.co/openbmb/MiniCPM-o-4_5); 9B params; results reported in instruct mode/variant; from the 'Image Understanding (Instruct)' table; metric label in card: OCRBench; 0-1000 scale used by other rows on this leaderboard; PWC evaluation id 1171; paper: MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction	876	paperswithcode-public-api
15	qwen3-vl-235b-a22b-thinking	Table 2 of Qwen3-VL technical report (arXiv:2511.21631), OCRBench (rescaled from 87.5/92.0 to the 0-1000 scale used by the existing rows); PWC evaluation id 4748; paper: Qwen3-VL Technical Report	875	paperswithcode-public-api
16	kimi-vl-a3b-thinking-2506	Kimi-VL-A3B-Thinking-2506 on OCRBench overall score (raw 0-1000 scale) from the moonshotai/Kimi-VL-A3B-Thinking-2506 HF model card; PWC evaluation id 3371; paper: Kimi-VL Technical Report	869	paperswithcode-public-api
17	kimi-vl-a3b-instruct	Kimi-VL-A3B-Instruct on OCRBench overall score (raw 0-1000 scale) from Kimi-VL Technical Report Table 3 and the moonshotai/Kimi-VL-A3B-Instruct HF model card; PWC evaluation id 3351; paper: Kimi-VL Technical Report	867	paperswithcode-public-api
18	qwen2-vl-7b	PWC evaluation id 144; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution	866	paperswithcode-public-api
19	minimax-vl-01	PWC evaluation id 886; paper: MiniMax-01: Scaling Foundation Models with Lightning Attention	865	paperswithcode-public-api
20	Qwen2.5-VL-7B	paper table; source label OCRBench; metric reported as Score. Imported while expanding from ScreenSpot-Pro source papers; PWC evaluation id 5377; paper: Qwen2.5-VL Technical Report	864	paperswithcode-public-api
21	infinity-parser2-pro	OCRBench full benchmark score from the Infinity-Parser2-Pro Hugging Face card / GitHub performance table. Source reports 86.20 on a 0-100 scale; stored as 862.0 on the 0-1000 OCRBench Score convention used by existing rows; PWC evaluation id 4966; paper: Infinity-Parser2-Pro	862	paperswithcode-public-api
22	dots.mocr	OCRBench overall score from the dots.mocr Hugging Face model card (section 3, General Vision Tasks). The card reports 86.0 on a 0-100 scale; converted to 860 on the standard 0-1000 OCRBench scale used by other rows on this leaderboard (consistent with Qwen3-VL-2B = 85.8 -> 858 in the same table); PWC evaluation id 1153; paper: Multimodal OCR: Parse Anything from Documents	860	paperswithcode-public-api
23	hunyuanocr-1b	PWC evaluation id 957; paper: HunyuanOCR Technical Report	860	paperswithcode-public-api
24	minicpm-v-4-6-thinking-16x	Thinking mode from the MiniCPM-V 4.6 Hugging Face model card; official checkpoint; visual token compression ratio 16x; metric label in card: OCRBench; PWC evaluation id 1115; paper: A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone	831	paperswithcode-public-api
25	videollama3-7b	DAMO-NLP-SG/VideoLLaMA3-7B-Image checkpoint; numbers from the 7B-Image model card main-results table; PWC evaluation id 1214; paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding	828	paperswithcode-public-api
26	qwen2-vl-2b	PWC evaluation id 145; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution	809	paperswithcode-public-api
27	zaya1-vl-8b	OCRBench overall score (0-1000 scale; card reports 79.8 normalised). Reported in the ZAYA1-VL-8B technical report (Zyphra). Evaluated on the Zyphra eval harness based on VLMEvalKit; PWC evaluation id 1230; paper: ZAYA1-VL-8B Technical Report	798	paperswithcode-public-api
28	qwen2-5-vl-3b	paper table; source label OCRBench; metric reported as Score. Imported while expanding from ScreenSpot-Pro source papers; PWC evaluation id 5378; paper: Qwen2.5-VL Technical Report	797	paperswithcode-public-api
29	videollama3-2b	PWC evaluation id 115; paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding	779	paperswithcode-public-api
30	minicpm-llama3-v-2-5	Paper Table 5 OCR benchmark result for MiniCPM-Llama3-V 2.5; source reports OCRBench score; PWC evaluation id 5183; paper: MiniCPM-V: A GPT-4V Level MLLM on Your Phone	725	paperswithcode-public-api
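
Several rows note that a source reported OCRBench on a 0-100 scale and that the value was converted to the 0-1000 convention (for example 86.20 -> 862.0 and 85.8 -> 858). The arithmetic can be sketched as below. This is a minimal illustration, not the leaderboard's actual import code; the function name `to_ocrbench_scale` and its `source_max` parameter are hypothetical.

```python
def to_ocrbench_scale(value: float, source_max: float = 100.0) -> float:
    """Rescale a score reported on 0..source_max to the 0-1000 convention.

    Hypothetical helper for illustration only. Pass source_max=1000.0
    for rows whose source already reports the raw 0-1000 score.
    """
    if source_max <= 0:
        raise ValueError("source_max must be positive")
    if not 0 <= value <= source_max:
        raise ValueError(f"value {value} is outside 0..{source_max}")
    # Round to one decimal place to absorb floating-point noise.
    return round(value * 1000.0 / source_max, 1)


if __name__ == "__main__":
    # Conversions quoted in the table notes above.
    for reported in (86.20, 86.0, 85.8, 92.3, 79.8):
        print(reported, "->", to_ocrbench_scale(reported))
```

Rows marked "raw 0-1000" need no conversion; calling the helper with `source_max=1000.0` leaves such a score unchanged.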
