OCR benchmark
Scores: higher is better.
| Rank | Model | Score | Source |
|---|---|---|---|
| 1 | DeepSeek-R1-0528 | 73.3 | codesota-api |
| 2 | Qwen3-235B-A22B | 70.7 | codesota-api |
| 3 | DeepSeek-R1 | 65.9 | codesota-api |
| 4 | DeepSeek-R1-Distill-Llama-70B | 65.2 | codesota-api |
| 5 | OpenAI o1 (Dec 2024) | 63.4 | codesota-api |
| 6 | Kimi k1.5 (long-CoT) | 62.5 | codesota-api |
| 7 | DeepSeek-R1-Distill-Qwen-32B | 62.1 | codesota-api |
| 8 | DeepSeek-R1-Distill-Qwen-14B | 59.1 | codesota-api |
| 9 | o1-mini | 53.8 | codesota-api |
| 10 | DeepSeek-V3-0324 | 49.2 | codesota-api |
| 11 | DeepSeek-R1-Distill-Qwen-7B | 49.1 | codesota-api |
| 12 | DeepSeek-R1-Distill-Llama-8B | 49.0 | codesota-api |
| 13 | Kimi k1.5 (short-CoT) | 47.3 | codesota-api |
| 14 | Llama 4 Maverick (17B-128E) | 43.4 | codesota-api |
| 15 | DeepSeek-V3 | 40.5 | codesota-api |
| 16 | Gemma 3 27B IT | 39.0 | codesota-api |
| 17 | Claude 3.5 Sonnet | 38.9 | codesota-api |
| 18 | GPT-4o | 32.9 | codesota-api |
| 19 | Llama 4 Scout (17B-16E) | 32.8 | codesota-api |
| 20 | Gemma 3 12B IT | 32.0 | codesota-api |
| 21 | Qwen2.5-Coder-32B-Instruct | 31.4 | codesota-api |
| 22 | Gemma 3 4B IT | 23.0 | codesota-api |