olmOCR-Bench
Allen Institute for AI
7,010 unit tests across 1,402 PDF documents. Tests parsing of tables, math, multi-column layouts, old scans, and more.
Benchmark Stats
Models: 17
Papers: 28
Metrics: 9
base
Higher is better
| Rank | Model | Source | Score | Year | Paper |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 (base clean document parsing; near-perfect) | Editorial | 99.9 | 2025 | Source |
headers-footers
Higher is better
long-tiny-text
Higher is better
| Rank | Model | Source | Score | Year | Paper |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 (long documents with tiny text; #1 in category) | Editorial | 92.3 | 2025 | Source |
tables
Higher is better
arxiv
Higher is better
Pass Rate
Percentage of unit tests passed
Higher is better
| Rank | Model | Source | Score | Year | Paper |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 (#1 overall on olmOCR-Bench) | Editorial | 83.1 | 2025 | Source |
| 2 | infinity-parser-7b | Editorial | 82.5 | 2025 | Source |
| 3 | olmocr-v0.4.0 | Editorial | 82.4 | 2025 | Source |
| 4 | paddleocr-vl | Editorial | 80 | 2025 | Source |
| 5 | Qianfan-OCR (Baidu Qianfan-OCR 4B: Qwen3-4B + Qianfan-ViT; Apache 2.0; 192 languages; Layout-as-Thought) | Editorial | 79.8 | 2026 | Source |
| 6 | dots-ocr-3b | Editorial | 79.1 | 2025 | Source |
| 7 | mistral-ocr-3 (estimated, based on a 74% win rate vs. OCR 2) | Editorial | 78 | 2025 | Source |
| 8 | marker-1.10.0 | Editorial | 76.5 | 2025 | Source |
| 9 | marker-1.10.1 | Editorial | 76.1 | 2025 | Source |
| 10 | deepseek-ocr | Editorial | 75.7 | 2025 | Source |
| 11 | mineru-2.5 | Editorial | 75.2 | 2025 | Source |
| 12 | mistral-ocr-api | Editorial | 72 | 2025 | Source |
| 13 | gpt-4o-anchored (GPT-4o with anchored prompting) | Editorial | 69.9 | 2025 | Source |
| 14 | nanonets-ocr2-3b | Editorial | 69.5 | 2025 | Source |
| 15 | gemini-flash-2 | Editorial | 63.8 | 2025 | Source |
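
As a rough illustration of the Pass Rate metric ("percentage of unit tests passed"), the sketch below computes a plain pass percentage from a list of per-test outcomes. The function name `pass_rate` and the sample data are hypothetical and not part of the benchmark's tooling. The page does not say whether the leaderboard score is a plain fraction over all tests or an average of per-category rates, so this shows only the simplest reading.

```python
from typing import Iterable


def pass_rate(results: Iterable[bool]) -> float:
    """Return the percentage of passed tests (hypothetical helper).

    Assumes each entry is True for a passed unit test and False otherwise,
    and that every test carries equal weight.
    """
    outcomes = list(results)
    if not outcomes:
        raise ValueError("no test results supplied")
    # sum() counts True values because bool is a subclass of int.
    return 100.0 * sum(outcomes) / len(outcomes)


# Made-up outcomes for four unit tests: three pass, one fails.
example = [True, True, False, True]
print(pass_rate(example))  # 75.0
```

Under this reading, higher is better because each additional passing test raises the percentage.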
multi-column
Higher is better
| Rank | Model | Source | Score | Year | Paper |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 Multi-column document parsing | Editorial | 81.2 | 2025 | Source |
old-scans-math
Higher is better
old-scans
Higher is better