OCRBench.

Composite OCR capability benchmark for multimodal models. CodeSOTA stores Score on the 0-1000 scale; source rows reported as percentages (0-100) are multiplied by 10 to match.
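The scale convention can be illustrated with a short sketch. The function name `to_ocrbench_score` and its `source_scale` parameter are hypothetical and not part of any CodeSOTA tooling; the only rule assumed is the one visible in the row notes, where a value reported on a 0-100 scale is multiplied by 10 (for example, 86.20 is stored as 862.0) and a value already on the 0-1000 scale is stored unchanged.

```python
def to_ocrbench_score(value: float, source_scale: int = 1000) -> float:
    """Map a reported OCRBench value onto the 0-1000 convention.

    source_scale is the maximum of the scale the source used:
    100 for percentage-style rows, 1000 for native rows.
    Hypothetical helper, for illustration only.
    """
    if source_scale not in (100, 1000):
        raise ValueError("source_scale must be 100 or 1000")
    if not 0 <= value <= source_scale:
        raise ValueError(f"value {value} is outside 0-{source_scale}")
    # Percentage rows are multiplied by 10; native rows pass through.
    factor = 1000 // source_scale
    # Round to one decimal place to suppress floating-point noise.
    return round(value * factor, 1)


print(to_ocrbench_score(86.20, source_scale=100))  # 862.0
print(to_ocrbench_score(880.0, source_scale=1000))  # 880.0
```

Rejecting out-of-range input is a design choice in this sketch: a percentage accidentally passed with `source_scale=1000` would otherwise be stored ten times too small without any warning.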

§ 01 · Leaderboard

Results by metric.

score

Score is the reported evaluation metric for OCRBench. CodeSOTA tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
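Because higher is better, the leaderboard order follows from sorting rows by Score in descending order. A minimal sketch, assuming rows are plain (model, score) pairs; `rank_rows` is a hypothetical helper, and giving tied scores consecutive ranks (as ranks 22 and 23 do in the table) is inferred from the listing rather than documented.

```python
def rank_rows(rows: list[tuple[str, float]]) -> list[tuple[int, str, float]]:
    """Return (rank, model, score) tuples ordered from highest to lowest score."""
    # sorted() is stable, so rows with equal scores keep their input order.
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(rank, model, score)
            for rank, (model, score) in enumerate(ordered, start=1)]


# Three scores taken from the table, deliberately given out of order.
sample = [("hunyuanocr-1b", 860), ("Kimi K2.5", 923), ("qwen3-5-397b-a17b", 931)]
for rank, model, score in rank_rows(sample):
    print(f"{rank:02d} {model} {score}")
```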

Trust tiers for score: verified, paper, vendor, community, unverified
Rank · Model · Trust · Score · Year
01 · qwen3-5-397b-a17b · verified · 931 · 2026
Official Qwen3.5 blog (https://qwen.ai/blog?id=qwen3.5), Vision table, OCRBench row; linked to the OCR task using the existing Score metric. PWC evaluation id 1060; paper: Qwen3.5: Towards Native Multimodal Agents
02 · Kimi K2.5 · verified · 923 · 2026
OCRBench overall score on the native 0-1000 scale (card reports 92.3 normalized); max 64k tokens; avg@3; Thinking mode. PWC evaluation id 1252; paper: Kimi K2.5: Visual Agentic Intelligence
03 · qwen3-vl-235b-a22b-instruct · verified · 920 · 2026
Table 2 of the Qwen3-VL technical report (arXiv:2511.21631), OCRBench (source values 87.5/92.0; this row is 92.0 rescaled to 920 on the 0-1000 scale used by the existing rows). PWC evaluation id 4749; paper: Qwen3-VL Technical Report
04 · sensenova-u1-a3b-mot · verified · 919 · 2026
PWC OCRBench Score normalized to the 0-1000 convention. Paper Table 3; SenseNova-U1-A3B-MoT Think mode on OCRBench. PWC evaluation id 5612; paper: SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
05 · qwen3-5-omni-plus · verified · 913 · 2026
PWC OCRBench Score normalized to the 0-1000 convention. Paper Table 6, Vision->Text, OCRBench document understanding benchmark. PWC evaluation id 5329; paper: Qwen3.5-Omni Technical Report
06 · internvl3-78b · verified · 906 · 2026
PWC evaluation id 768; paper: InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
07 · qwen3-6-35b-a3b · verified · 900 · 2026
PWC OCRBench Score normalized to the 0-1000 convention. OCRBench row from the Qwen3.6 model-card Vision benchmark table, imported as the configured OCRBench Score. Source: Qwen3.6-27B Hugging Face model card benchmark table (https://huggingface.co/Qwen/Qwen3.6-27B). PWC evaluation id 5562; paper: Qwen3.6
08 · qwen3-vl-8b-instruct · verified · 896 · 2026
Reported on OCRBench (raw 0-1000 score) by the Qwen3-VL-8B-Instruct model card. PWC evaluation id 1279; paper: Qwen3-VL Technical Report
09 · qwen3-6-27b · verified · 894 · 2026
PWC OCRBench Score normalized to the 0-1000 convention. OCRBench row from the Qwen3.6 model-card Vision benchmark table, imported as the configured OCRBench Score. Source: Qwen3.6-27B Hugging Face model card benchmark table (https://huggingface.co/Qwen/Qwen3.6-27B). PWC evaluation id 5563; paper: Qwen3.6
10 · Qwen2.5-VL 72B · verified · 885 · 2026
Table 5, OCRBench. Source: Qwen2.5-VL Technical Report (arXiv:2502.13923). Model: Qwen2.5-VL-72B. PWC evaluation id 5023; paper: Qwen2.5-VL Technical Report
11 · Qianfan-OCR · verified · 880 · 2026
OCRBench standard Score (0-1000): 880. PWC evaluation id 1197; paper: Qianfan-OCR: A Unified End-to-End Model for Document Intelligence
12 · ovis2-5-9b · verified · 879 · 2026
PWC OCRBench Score normalized to the 0-1000 convention. Table 3, OpenCompass suite, OCRBench (OCR). Source: Ovis2.5 Technical Report (https://arxiv.org/abs/2508.11737); official HF model URL https://huggingface.co/AIDC-AI/Ovis2.5-9B. PWC evaluation id 5583; paper: Ovis2.5 Technical Report
13 · Qwen2-VL 72B · verified · 877 · 2026
PWC evaluation id 143; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
14 · minicpm-o-4-5-instruct · verified · 876 · 2026
Instruct mode, from the 'Image Understanding (Instruct)' table of the openbmb/MiniCPM-o-4_5 Hugging Face model card (https://huggingface.co/openbmb/MiniCPM-o-4_5); 9B params; metric label in card: OCRBench; 0-1000 scale used by other rows on this leaderboard. PWC evaluation id 1171; paper: MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction
15 · qwen3-vl-235b-a22b-thinking · verified · 875 · 2026
Table 2 of the Qwen3-VL technical report (arXiv:2511.21631), OCRBench (source values 87.5/92.0; this row is 87.5 rescaled to 875 on the 0-1000 scale used by the existing rows). PWC evaluation id 4748; paper: Qwen3-VL Technical Report
16 · kimi-vl-a3b-thinking-2506 · verified · 869 · 2026
Kimi-VL-A3B-Thinking-2506 on OCRBench overall score (raw 0-1000 scale) from the moonshotai/Kimi-VL-A3B-Thinking-2506 HF model card. PWC evaluation id 3371; paper: Kimi-VL Technical Report
17 · kimi-vl-a3b-instruct · verified · 867 · 2026
Kimi-VL-A3B-Instruct on OCRBench overall score (raw 0-1000 scale) from Kimi-VL Technical Report Table 3 and the moonshotai/Kimi-VL-A3B-Instruct HF model card. PWC evaluation id 3351; paper: Kimi-VL Technical Report
18 · qwen2-vl-7b · verified · 866 · 2026
PWC evaluation id 144; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
19 · minimax-vl-01 · verified · 865 · 2026
PWC evaluation id 886; paper: MiniMax-01: Scaling Foundation Models with Lightning Attention
20 · Qwen2.5-VL-7B · verified · 864 · 2026
Paper table; source label OCRBench; metric reported as Score. Imported while expanding from ScreenSpot-Pro source papers. PWC evaluation id 5377; paper: Qwen2.5-VL Technical Report
21 · infinity-parser2-pro · verified · 862 · 2026
OCRBench full benchmark score from the Infinity-Parser2-Pro Hugging Face card / GitHub performance table. Source reports 86.20 on a 0-100 scale; stored as 862.0 on the 0-1000 OCRBench Score convention used by existing rows. PWC evaluation id 4966; paper: Infinity-Parser2-Pro
22 · dots.mocr · verified · 860 · 2026
OCRBench overall score from the dots.mocr Hugging Face model card (section 3, General Vision Tasks). The card reports 86.0 on a 0-100 scale; converted to 860 on the standard 0-1000 OCRBench scale used by other rows on this leaderboard (consistent with Qwen3-VL-2B = 85.8 -> 858 in the same table). PWC evaluation id 1153; paper: Multimodal OCR: Parse Anything from Documents
23 · hunyuanocr-1b · verified · 860 · 2026
PWC evaluation id 957; paper: HunyuanOCR Technical Report
24 · minicpm-v-4-6-thinking-16x · verified · 831 · 2026
Thinking mode from the MiniCPM-V 4.6 Hugging Face model card; official checkpoint; visual token compression ratio 16x; metric label in card: OCRBench. PWC evaluation id 1115; paper: A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
25 · videollama3-7b · verified · 828 · 2026
DAMO-NLP-SG/VideoLLaMA3-7B-Image checkpoint; numbers from the 7B-Image model card main-results table. PWC evaluation id 1214; paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
26 · qwen2-vl-2b · verified · 809 · 2026
PWC evaluation id 145; paper: Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
27 · zaya1-vl-8b · verified · 798 · 2026
OCRBench overall score (0-1000 scale; card reports 79.8 normalized). Reported in the ZAYA1-VL-8B technical report (Zyphra). Evaluated on the Zyphra eval harness based on VLMEvalKit. PWC evaluation id 1230; paper: ZAYA1-VL-8B Technical Report
28 · qwen2-5-vl-3b · verified · 797 · 2026
Paper table; source label OCRBench; metric reported as Score. Imported while expanding from ScreenSpot-Pro source papers. PWC evaluation id 5378; paper: Qwen2.5-VL Technical Report
29 · videollama3-2b · verified · 779 · 2026
PWC evaluation id 115; paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
30 · minicpm-llama3-v-2-5 · verified · 725 · 2026
Paper Table 5 OCR benchmark result for MiniCPM-Llama3-V 2.5; source reports OCRBench score. PWC evaluation id 5183; paper: MiniCPM-V: A GPT-4V Level MLLM on Your Phone