Comprehensive benchmark evaluating 8 OCR capabilities across 23 tasks in 31 scenarios.
Average score on Chinese private test set
Higher is better
| # | Model | Score | Source |
|---|---|---|---|
| ★ | Qwen2.5-VL-72B | 63.7 | codesota-api |
| 2 | gemini-25-pro | 62.2 | codesota-api |
| 3 | Qianfan-OCR | 60.77 | codesota-api |
| 4 | minicpm-v-4.5-8b | 58.8 | codesota-api |
| 5 | sail-vl2-8b | 57.6 | codesota-api |
| 6 | claude-3.5-sonnet | 48.4 | codesota-api |
| 7 | InternVL2.5-78B | 46.2 | codesota-api |
| 8 | Qwen2-VL-72B | 46.1 | codesota-api |
| 9 | gpt-4o-2024 | 45.7 | codesota-api |
Average score on English private test set
Higher is better
| # | Model | Score | Source |
|---|---|---|---|
| ★ | seed-1.6-vision | 62.2 | codesota-api |
| 2 | Qwen2.5-VL-72B | 61.5 | codesota-api |
| 3 | qwen3-omni-30b | 61.3 | codesota-api |
| 4 | nemotron-nano-v2-vl | 61.2 | codesota-api |
| 5 | gemini-25-pro | 59.3 | codesota-api |
| 6 | llama-3.1-nemotron-nano-vl-8b | 56.4 | codesota-api |
| 7 | Qianfan-OCR | 56 | codesota-api |
| 8 | gpt-4o | 55.5 | codesota-api |
| 9 | ovis2.5-8b | 54.1 | codesota-api |
| 10 | gemini-1.5-pro | 51.6 | codesota-api |
| 11 | sail-vl2-8b | 49.3 | codesota-api |
| 12 | minicpm-v-4.5-8b | 48.4 | codesota-api |
| 13 | Qwen2-VL-72B | 47.8 | codesota-api |
| 14 | gpt-4o-2024 | 47.6 | codesota-api |
| 15 | claude-3.5-sonnet | 47.5 | codesota-api |
| 16 | internvl3.5-14b | 47.1 | codesota-api |
| 17 | step-1v | 46.8 | codesota-api |
| 18 | InternVL2.5-78B | 45 | codesota-api |
| 19 | grok4 | 45 | codesota-api |
| 20 | gpt-4o-mini | 44.1 | codesota-api |
| 21 | claude-sonnet-4 | 42.4 | codesota-api |
| 22 | qwen2.5-vl-7b | 41.8 | codesota-api |
| 23 | deepseek-vl2-small | 41 | codesota-api |
| 24 | pixtral-12b | 38.4 | codesota-api |
| 25 | phi-4-multimodal | 38.1 | codesota-api |
| 26 | glm-4v-9b | 37.1 | codesota-api |
| 27 | molmo-7b | 33.9 | codesota-api |
| 28 | llava-ov-7b | 33.7 | codesota-api |
| 29 | idefics3-8b | 26 | codesota-api |
| 30 | mistral-ocr-2512 | 25.2 | codesota-api |
| 31 | docowl2 | 23.4 | codesota-api |
Higher is better
| # | Model | Score | Source |
|---|---|---|---|
| ★ | InternVL3-14B | 55.7 | codesota-api |
| 2 | Qwen2.5-VL-7B | 55.6 | codesota-api |
| 3 | Ovis2-8B | 49.2 | codesota-api |
| 4 | Gemini 1.5 Pro | 43.1 | codesota-api |
| 5 | DeepSeek-VL2-Small | 42.7 | codesota-api |
| 6 | Step-1V | 42.6 | codesota-api |
| 7 | MiniCPM-o-2.6 | 41.1 | codesota-api |
| 8 | Claude 3.5 Sonnet | 39.6 | codesota-api |
| 9 | GLM-4V-9B | 36.6 | codesota-api |
| 10 | GPT-4o | 32.2 | codesota-api |
| 11 | LLaVA-OneVision-7B | 17.8 | codesota-api |
| 12 | TextMonkey | 15.8 | codesota-api |
| 13 | Pixtral-12B | 14.6 | codesota-api |
| 14 | Monkey | 13.1 | codesota-api |
| 15 | Molmo-7B | 12.8 | codesota-api |
| 16 | Cambrian-1-8B | 9.9 | codesota-api |
| 17 | LLaVA-NeXT-8B | 9.1 | codesota-api |
Higher is better
| # | Model | Score | Source |
|---|---|---|---|
| ★ | InternVL3-14B | 52.6 | codesota-api |
| 2 | Gemini 1.5 Pro | 51.9 | codesota-api |
| 3 | Ovis2-8B | 47.7 | codesota-api |
| 4 | Qwen2.5-VL-7B | 46.7 | codesota-api |
| 5 | Step-1V | 46.7 | codesota-api |
| 6 | GPT-4o | 46.5 | codesota-api |
| 7 | Claude 3.5 Sonnet | 45.2 | codesota-api |
| 8 | MiniCPM-o-2.6 | 45.1 | codesota-api |
| 9 | DeepSeek-VL2-Small | 43.3 | codesota-api |
| 10 | GLM-4V-9B | 42.6 | codesota-api |
| 11 | Pixtral-12B | 40.3 | codesota-api |
| 12 | LLaVA-OneVision-7B | 36.4 | codesota-api |
| 13 | Cambrian-1-8B | 34.7 | codesota-api |
| 14 | Molmo-7B | 34.5 | codesota-api |
| 15 | LLaVA-NeXT-8B | 31.5 | codesota-api |
| 16 | TextMonkey | 23.9 | codesota-api |
| 17 | Monkey | 23.1 | codesota-api |